date:20170907

Re: [patch net-next] team: fall back to hash if table entry is empty

2017-09-07 Thread Jiri Pirko

Fri, Sep 08, 2017 at 04:25:12AM CEST, ha...@drivescale.com wrote:
>If the hash to port mapping table does not have a valid port (i.e. when
>a port goes down), fall back to the simple hashing mechanism to avoid
>dropping packets.
>
>Signed-off-by: Jim Hanko 

Acked-by: Jiri Pirko 

net-next is closed for now (will open after rc1 is out). Please re-send
the patch (including my ack) then.

Thanks.

[PATCH 0/3] net: TCP/IP: A few minor cleanups

2017-09-07 Thread Michael Witten

The following patch series is an ad hoc "cleanup" that I made
while perusing the code (I'm not well versed in this code, so I
would not be surprised if there were objections to the changes):

  [1] net: __sock_cmsg_send(): Remove unused parameter `msg'
  [2] net: inet_recvmsg(): Remove unnecessary bitwise operation.
  [3] net: skb_queue_purge(): lock/unlock the list only once

Each patch will be sent as an individiual email; the total diff
is appended below for your convenience.

You may also fetch these patches from GitHub:

  git checkout -b test 5969d1bb3082b41eba8fd2c826559abe38ccb6df
  git pull https://github.com/mfwitten/linux.git net/tcp-ip/01-cleanup/00

Overall:

  include/net/sock.h | 2 +-
  net/core/skbuff.c  | 6 +-
  net/core/sock.c| 4 ++--
  net/ipv4/af_inet.c | 2 +-
  net/ipv4/ip_sockglue.c | 2 +-
  net/ipv6/datagram.c| 2 +-
  6 files changed, 11 insertions(+), 7 deletions(-)

Sincerly,
Michael Witten

diff --git a/include/net/sock.h b/include/net/sock.h
index 03a362568357..83373d7148a9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1562,7 +1562,7 @@ struct sockcm_cookie {
u16 tsflags;
 };
 
-int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 struct sockcm_cookie *sockc);
 int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
   struct sockcm_cookie *sockc);
diff --git a/net/core/sock.c b/net/core/sock.c
index 9b7b6bbb2a23..425e03fe1c56 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2091,7 +2091,7 @@ struct sk_buff *sock_alloc_send_skb(struct sock *sk, 
unsigned long size,
 }
 EXPORT_SYMBOL(sock_alloc_send_skb);
 
-int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 struct sockcm_cookie *sockc)
 {
u32 tsflags;
@@ -2137,7 +2137,7 @@ int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
return -EINVAL;
if (cmsg->cmsg_level != SOL_SOCKET)
continue;
-   ret = __sock_cmsg_send(sk, msg, cmsg, sockc);
+   ret = __sock_cmsg_send(sk, cmsg, sockc);
if (ret)
return ret;
}
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index e558e4f9597b..c79b7822b0b9 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -263,7 +263,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, 
struct ipcm_cookie *ipc,
}
 #endif
if (cmsg->cmsg_level == SOL_SOCKET) {
-   err = __sock_cmsg_send(sk, msg, cmsg, >sockc);
+   err = __sock_cmsg_send(sk, cmsg, >sockc);
if (err)
return err;
continue;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index a1f918713006..1d1926a4cbe2 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -756,7 +756,7 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
}
 
if (cmsg->cmsg_level == SOL_SOCKET) {
-   err = __sock_cmsg_send(sk, msg, cmsg, sockc);
+   err = __sock_cmsg_send(sk, cmsg, sockc);
if (err)
return err;
continue;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index e31108e5ef79..2dbed042a412 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -791,7 +791,7 @@ int inet_recvmsg(struct socket *sock, struct msghdr *msg, 
size_t size,
sock_rps_record_flow(sk);
 
err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
-  flags & ~MSG_DONTWAIT, _len);
+  flags, _len);
if (err >= 0)
msg->msg_namelen = addr_len;
return err;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 68065d7d383f..66c0731a2a5f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2834,9 +2834,13 @@ EXPORT_SYMBOL(skb_dequeue_tail);
  */
 void skb_queue_purge(struct sk_buff_head *list)
 {
+   unsigned long flags;
struct sk_buff *skb;
-   while ((skb = skb_dequeue(list)) != NULL)
+
+   spin_lock_irqsave(>lock, flags);
+   while ((skb = __skb_dequeue(list)) != NULL)
kfree_skb(skb);
+   spin_unlock_irqrestore(>lock, flags);
 }
 EXPORT_SYMBOL(skb_queue_purge);
 
-- 
2.14.1

[PATCH v2] netfilter: xt_hashlimit: fix build error caused by 64bit division

2017-09-07 Thread Vishwanath Pai

64bit division causes build/link errors on 32bit architectures. It
prints out error messages like:

ERROR: "__aeabi_uldivmod" [net/netfilter/xt_hashlimit.ko] undefined!

The value of avg passed through by userspace in BYTE mode cannot exceed
U32_MAX. Which means 64bit division in user2rate_bytes is unnecessary.
To fix this I have changed the type of param 'user' to u32.

Since anything greater than U32_MAX is an invalid input we error out in
hashlimit_mt_check_common() when this is the case.

Changes in v2:
Making return type as u32 would cause an overflow for small
values of 'user' (for example 2, 3 etc). To avoid this I bumped up
'r' to u64 again as well as the return type. This is OK since the
variable that stores the result is u64. We still avoid 64bit
division here since 'user' is u32.

Fixes: bea74641e378 ("netfilter: xt_hashlimit: add rate match mode")
Signed-off-by: Vishwanath Pai 
---
 net/netfilter/xt_hashlimit.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 10d4823..1c1941e 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Harald Welte ");
@@ -527,12 +528,12 @@ static u64 user2rate(u64 user)
}
 }
 
-static u64 user2rate_bytes(u64 user)
+static u64 user2rate_bytes(u32 user)
 {
u64 r;
 
-   r = user ? 0xULL / user : 0xULL;
-   r = (r - 1) << 4;
+   r = user ? U32_MAX / user : U32_MAX;
+   r = (r - 1) << XT_HASHLIMIT_BYTE_SHIFT;
return r;
 }
 
@@ -588,7 +589,8 @@ static void rateinfo_init(struct dsthash_ent *dh,
dh->rateinfo.prev_window = 0;
dh->rateinfo.current_rate = 0;
if (hinfo->cfg.mode & XT_HASHLIMIT_BYTES) {
-   dh->rateinfo.rate = user2rate_bytes(hinfo->cfg.avg);
+   dh->rateinfo.rate =
+   user2rate_bytes((u32)hinfo->cfg.avg);
if (hinfo->cfg.burst)
dh->rateinfo.burst =
hinfo->cfg.burst * dh->rateinfo.rate;
@@ -870,7 +872,7 @@ static int hashlimit_mt_check_common(const struct 
xt_mtchk_param *par,
 
/* Check for overflow. */
if (revision >= 3 && cfg->mode & XT_HASHLIMIT_RATE_MATCH) {
-   if (cfg->avg == 0) {
+   if (cfg->avg == 0 || cfg->avg > U32_MAX) {
pr_info("hashlimit invalid rate\n");
return -ERANGE;
}
-- 
1.9.1

Re: [PATCH net-next v2 01/10] net: dsa: add debugfs interface

2017-09-07 Thread Greg KH

I agree you shouldn't be using debugfs for this, but in the future, if
you do write debugfs code, please take the following review into
account:

On Mon, Aug 28, 2017 at 03:17:39PM -0400, Vivien Didelot wrote:
> +static int dsa_debugfs_create_port(struct dsa_switch *ds, int port)
> +{
> + struct dentry *dir;
> + char name[32];
> +
> + snprintf(name, sizeof(name), DSA_PORT_FMT, port);
> +
> + dir = debugfs_create_dir(name, ds->debugfs_dir);
> + if (IS_ERR_OR_NULL(dir))
> + return -EFAULT;

You should _never_ care about the return value of a debugfs call, and
you should not need to ever propagate the error upward.  The api was
written to not need this.

Just call the function, and return, that's it.  If you need to save the
return value (i.e. it's a dentry), you also don't care, just save it and
pass it to some other debugfs call, and all will still be fine.  Your
code should never do anything different if a debugfs call succeeds or
fails.

> +static int dsa_debugfs_create_switch(struct dsa_switch *ds)
> +{
> + char name[32];
> + int i, err;
> +
> + /* skip if there is no debugfs support */
> + if (!dsa_debugfs_dir)
> + return 0;

Again, you don't care, all of these functions should return void.

> + snprintf(name, sizeof(name), DSA_SWITCH_FMT, ds->index);
> +
> + ds->debugfs_dir = debugfs_create_dir(name, dsa_debugfs_dir);
> + if (IS_ERR_OR_NULL(ds->debugfs_dir))
> + return -EFAULT;

See, that's horrid, you should never need to make such a bad check.

Also, even if it were the correct way to do this you never return EFAULT
unless there is a memory copy error to/from userspace.  That is not the
case here, or in any of this code, right?

> +static void dsa_debugfs_destroy_switch(struct dsa_switch *ds)
> +{
> + /* handles NULL */
> + debugfs_remove_recursive(ds->debugfs_dir);

Of course it handles NULL, why comment that?  That's the whole goal of
debugfs, to be dirt simple, allow you to do anything you want, in almost
no lines of code.

Also, it will never be mounted on a "real" system, so you better not
rely on it for anything "real".

> + err = dsa_debugfs_create_switch(ds);
> + if (err) {
> + pr_warn("DSA: failed to create debugfs interface for 
> switch %d (%d)\n",
> + ds->index, err);

Never complain to the syslog about a debugfs issue.

> +void dsa_debugfs_destroy_module(void)
> +{
> + /* handles NULL */
> + debugfs_remove_recursive(dsa_debugfs_dir);

again, of course it does, do you think we don't know how to write an
api?  :)

thanks,

greg k-h

Re: [PATCH net] bpf: don't select potentially stale ri->map from buggy xdp progs

2017-09-07 Thread Jesper Dangaard Brouer

On Fri,  8 Sep 2017 00:14:51 +0200
Daniel Borkmann  wrote:

> + /* This is really only caused by a deliberately crappy
> +  * BPF program, normally we would never hit that case,
> +  * so no need to inform someone via tracepoints either,
> +  * just bail out.
> +  */
> + if (unlikely(map_owner != xdp_prog))
> + return -EINVAL;

IMHO we do need to call the tracepoint here.  It is not just crappy
BPF-progs that cause this situation, it is also drivers not implementing
XDP_REDIRECT yet (which is all but ixgbe).  Due to the level XDP
operates at, tracepoints are the only way users can runtime troubleshoot
their XDP programs.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Re: Use after free in __dst_destroy_metrics_generic

2017-09-07 Thread Subash Abhinov Kasiviswanathan


[ 3489.194392]  __ion_alloc+0x180/0x988


I do not see the __ion_alloc function in my tree.


Hi David

This function seems to be defined in an Android specific change.

https://android.googlesource.com/kernel/msm/+/20a5411d0115b16826f3d327b6abb0192c8a2001

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

Re: [PATCH net v2] rds: Fix incorrect statistics counting

2017-09-07 Thread David Miller

From: Håkon Bugge 
Date: Wed,  6 Sep 2017 18:35:51 +0200

> In rds_send_xmit() there is logic to batch the sends. However, if
> another thread has acquired the lock and has incremented the send_gen,
> it is considered a race and we yield. The code incrementing the
> s_send_lock_queue_raced statistics counter did not count this event
> correctly.
> 
> This commit counts the race condition correctly.
> 
> Changes from v1:
> - Removed check for *someone_on_xmit()*
> - Fixed incorrect indentation
> 
> Signed-off-by: Håkon Bugge 
> Reviewed-by: Knut Omang 

Applied.

Re: [PATCH] isdn: isdnloop: fix logic error in isdnloop_sendbuf

2017-09-07 Thread David Miller

From: Arnd Bergmann 
Date: Wed,  6 Sep 2017 15:38:58 +0200

> gcc-7 found an ancient bug in the loop driver, leading to a condition that
> is always false, meaning we ignore the contents of 'card->flags' here:
> 
> drivers/isdn/isdnloop/isdnloop.c:412:37: error: ?: using integer constants in 
> boolean context, the expression will always evaluate to 'true' 
> [-Werror=int-in-bool-context]
> 
> This changes the braces in the expression to ensure we actually
> compare the flag bits, rather than comparing a constant. As Joe Perches
> pointed out, an earlier patch of mine incorrectly assumed this was a
> false-positive warning.
> 
> Cc: Joe Perches 
> Link: https://patchwork.kernel.org/patch/9840289/
> Signed-off-by: Arnd Bergmann 

Applied, thank you.

Re: [PATCH net] udp: drop head states only when all skb references are gone

2017-09-07 Thread David Miller

From: Paolo Abeni 
Date: Wed,  6 Sep 2017 14:44:36 +0200

> After commit 0ddf3fb2c43d ("udp: preserve skb->dst if required
> for IP options processing") we clear the skb head state as soon
> as the skb carrying them is first processed.
> 
> Since the same skb can be processed several times when MSG_PEEK
> is used, we can end up lacking the required head states, and
> eventually oopsing.
> 
> Fix this clearing the skb head state only when processing the
> last skb reference.
> 
> Reported-by: Eric Dumazet 
> Fixes: 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options 
> processing")
> Signed-off-by: Paolo Abeni 

Applied and queued up for -stable, thanks.

[PATCH] net:netfilter alloc xt_byteslimit_htable with wrong size

2017-09-07 Thread zhizhou . tian

From: Zhizhou Tian 

struct xt_byteslimit_htable used hlist_head,
but alloc memory with sizeof(struct list_head)

Change-Id: I75bc60e47e0823700d4303c9d763b7995e3b7bb3
Signed-off-by: Zhizhou Tian 
---
 net/netfilter/xt_hashlimit.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 10d4823..962ea4a 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -279,7 +279,7 @@ static int htable_create(struct net *net, struct 
hashlimit_cfg3 *cfg,
size = cfg->size;
} else {
size = (totalram_pages << PAGE_SHIFT) / 16384 /
-  sizeof(struct list_head);
+  sizeof(struct hlist_head);
if (totalram_pages > 1024 * 1024 * 1024 / PAGE_SIZE)
size = 8192;
if (size < 16)
@@ -287,7 +287,7 @@ static int htable_create(struct net *net, struct 
hashlimit_cfg3 *cfg,
}
/* FIXME: don't use vmalloc() here or anywhere else -HW */
hinfo = vmalloc(sizeof(struct xt_hashlimit_htable) +
-   sizeof(struct list_head) * size);
+   sizeof(struct hlist_head) * size);
if (hinfo == NULL)
return -ENOMEM;
*out_hinfo = hinfo;
-- 
2.7.4

Re: [PATCH net] ip6_gre: update mtu properly in ip6gre_err

2017-09-07 Thread David Miller

From: Xin Long 
Date: Tue,  5 Sep 2017 17:26:33 +0800

> Now when probessing ICMPV6_PKT_TOOBIG, ip6gre_err only subtracts the
> offset of gre header from mtu info. The expected mtu of gre device
> should also subtract gre header. Otherwise, the next packets still
> can't be sent out.
> 
> Jianlin found this issue when using the topo:
>   client(ip6gre)<>(nic1)route(nic2)<->(ip6gre)server
> 
> and reducing nic2's mtu, then both tcp and sctp's performance with
> big size data became 0.
> 
> This patch is to fix it by also subtracting grehdr (tun->tun_hlen)
> from mtu info when updating gre device's mtu in ip6gre_err(). It
> also needs to subtract ETH_HLEN if gre dev'type is ARPHRD_ETHER.
> 
> Reported-by: Jianlin Shi 
> Signed-off-by: Xin Long 

Applied and queued up for -stable, thanks.

[patch net-next] team: fall back to hash if table entry is empty

2017-09-07 Thread Jim Hanko

If the hash to port mapping table does not have a valid port (i.e. when
a port goes down), fall back to the simple hashing mechanism to avoid
dropping packets.

Signed-off-by: Jim Hanko 
---
 drivers/net/team/team_mode_loadbalance.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/team/team_mode_loadbalance.c 
b/drivers/net/team/team_mode_loadbalance.c
index 1468ddf..a5ef970 100644
--- a/drivers/net/team/team_mode_loadbalance.c
+++ b/drivers/net/team/team_mode_loadbalance.c
@@ -137,7 +137,13 @@ static struct team_port *lb_htpm_select_tx_port(struct 
team *team,
struct sk_buff *skb,
unsigned char hash)
 {
-   return rcu_dereference_bh(LB_HTPM_PORT_BY_HASH(lb_priv, hash));
+   struct team_port *port;
+
+   port = rcu_dereference_bh(LB_HTPM_PORT_BY_HASH(lb_priv, hash));
+   if (likely(port))
+   return port;
+   /* If no valid port in the table, fall back to simple hash */
+   return lb_hash_select_tx_port(team, lb_priv, skb, hash);
 }
 
 struct lb_select_tx_port {
-- 
2.7.4

Re: [PATCH] net: ipv6: fix regression of no RTM_DELADDR sent after DAD failure

2017-09-07 Thread David Miller

From: Mike Manning 
Date: Mon,  4 Sep 2017 15:52:55 +0100

> Commit f784ad3d79e5 ("ipv6: do not send RTM_DELADDR for tentative
> addresses") incorrectly assumes that no RTM_NEWADDR are sent for
> addresses in tentative state, as this does happen for the standard
> IPv6 use-case of DAD failure, see the call to ipv6_ifa_notify() in
> addconf_dad_stop(). So as a result of this change, no RTM_DELADDR is
> sent after DAD failure for a link-local when strict DAD (accept_dad=2)
> is configured, or on the next admin down in other cases. The absence
> of this notification breaks backwards compatibility and causes problems
> after DAD failure if this notification was being relied on. The
> solution is to allow RTM_DELADDR to still be sent after DAD failure.
> 
> Fixes: f784ad3d79e5("ipv6: do not send RTM_DELADDR for tentative addresses")
> Signed-off-by: Mike Manning 
> Cc: Mahesh Bandewar 

Mahesh, please review this patch.

Re: [patch net] net: sched: fix memleak for chain zero

2017-09-07 Thread David Miller

From: Jiri Pirko 
Date: Wed,  6 Sep 2017 13:14:19 +0200

> From: Jiri Pirko 
> 
> There's a memleak happening for chain 0. The thing is, chain 0 needs to
> be always present, not created on demand. Therefore tcf_block_get upon
> creation of block calls the tcf_chain_create function directly. The
> chain is created with refcnt == 1, which is not correct in this case and
> causes the memleak. So move the refcnt increment into tcf_chain_get
> function even for the case when chain needs to be created.
> 
> Reported-by: Jakub Kicinski 
> Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters")
> Signed-off-by: Jiri Pirko 

This doesn't apply cleanly any more, please respin.

Thanks Jiri.

[PATCH] netfilter: xt_hashlimit: fix build error caused by 64bit division

2017-09-07 Thread Vishwanath Pai

64bit division causes build/link errors on 32bit architectures. It
prints out error messages like:

ERROR: "__aeabi_uldivmod" [net/netfilter/xt_hashlimit.ko] undefined!

The value of avg passed through by userspace in BYTE mode cannot exceed
U32_MAX. Which means 64bit division in user2rate_bytes is unnecessary.

This fix changes the size of both the param as well as return type on
user2rate_bytes to u32.

Since anything greater than U32_MAX is an invalid input we error out in
hashlimit_mt_check_common() when this is the case.

Also fixed warning about const pointer conversion in cfg_copy().

Fixes: bea74641e378 ("netfilter: xt_hashlimit: add rate match mode")
Signed-off-by: Vishwanath Pai 
---
 net/netfilter/xt_hashlimit.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index 10d4823..1d818f1 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Harald Welte ");
@@ -527,12 +528,12 @@ static u64 user2rate(u64 user)
}
 }
 
-static u64 user2rate_bytes(u64 user)
+static u32 user2rate_bytes(u32 user)
 {
-   u64 r;
+   u32 r;
 
-   r = user ? 0xULL / user : 0xULL;
-   r = (r - 1) << 4;
+   r = user ? U32_MAX / user : U32_MAX;
+   r = (r - 1) << XT_HASHLIMIT_BYTE_SHIFT;
return r;
 }
 
@@ -588,7 +589,8 @@ static void rateinfo_init(struct dsthash_ent *dh,
dh->rateinfo.prev_window = 0;
dh->rateinfo.current_rate = 0;
if (hinfo->cfg.mode & XT_HASHLIMIT_BYTES) {
-   dh->rateinfo.rate = user2rate_bytes(hinfo->cfg.avg);
+   dh->rateinfo.rate =
+   user2rate_bytes((u32)hinfo->cfg.avg);
if (hinfo->cfg.burst)
dh->rateinfo.burst =
hinfo->cfg.burst * dh->rateinfo.rate;
@@ -870,7 +872,7 @@ static int hashlimit_mt_check_common(const struct 
xt_mtchk_param *par,
 
/* Check for overflow. */
if (revision >= 3 && cfg->mode & XT_HASHLIMIT_RATE_MATCH) {
-   if (cfg->avg == 0) {
+   if (cfg->avg == 0 || cfg->avg > U32_MAX) {
pr_info("hashlimit invalid rate\n");
return -ERANGE;
}
-- 
1.9.1

Re: Use after free in __dst_destroy_metrics_generic

2017-09-07 Thread David Miller

From: Subash Abhinov Kasiviswanathan 
Date: Thu, 07 Sep 2017 18:52:02 -0600

> [ 3489.194392]  __ion_alloc+0x180/0x988

I do not see the __ion_alloc function in my tree.

Re: [PATCH] net: bonding: Fix transmit load balancing in balance-alb mode if specified by sysfs

2017-09-07 Thread Kosuke Tatsukawa

Hi,

> On  7.09.2017 01:47, Kosuke Tatsukawa wrote:
>> Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in
>> balance-alb mode") tried to fix transmit dynamic load balancing in
>> balance-alb mode, which wasn't working after commit 8b426dc54cf4
>> ("bonding: remove hardcoded value").
>> 
>> It turned out that my previous patch only fixed the case when
>> balance-alb was specified as bonding module parameter, and not when
>> balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
>> common usage).  In the latter case, tlb_dynamic_lb was set up according
>> to the default mode of the bonding interface, which happens to be
>> balance-rr.
>> 
>> This additional patch addresses this issue by setting up tlb_dynamic_lb
>> to 1 if "mode" is set to balance-alb through the sysfs interface.
>> 
>> I didn't add code to change tlb_balance_lb back to the default value for
>> other modes, because "mode" is usually set up only once during
>> initialization, and it's not worthwhile to change the static variable
>> bonding_defaults in bond_main.c to a global variable just for this
>> purpose.
>> 
>> Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for
>> balance-tlb mode if it is set up using the sysfs interface.  I didn't
>> change that behavior, because the value of tlb_balance_lb can be changed
>> using the sysfs interface for balance-tlb, and I didn't like changing
>> the default value back and forth for balance-tlb.
>> 
>> As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
>> written to.  However, I think balance-alb with tlb_dynamic_lb set to 0
>> is not an intended usage, so there is little use making it writable at
>> this moment.
>> 
>> Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value")
>> Reported-by: Reinis Rozitis 
>> Signed-off-by: Kosuke Tatsukawa 
>> Cc: sta...@vger.kernel.org  # v4.12+
>> ---
>>  drivers/net/bonding/bond_options.c |3 +++
>>  1 files changed, 3 insertions(+), 0 deletions(-)
>> 
> 
> I don't believe this to be the right solution, hardcoding it like this
> changes user-visible behaviour. The issue is that if someone configured
> it to be 0 in tlb mode, suddenly it will become 1 and will silently
> override their config if they switch the mode to alb. Also it robs users
> from their choice.
> 
> If you think this should be settable in ALB mode, then IMO you should
> edit tlb_dynamic_lb option's unsuppmodes and allow it to be set in ALB.
> That would also be consistent with how it's handled in TLB mode.

No, I don't think tlb_dynamic_lb should be settable in balance-alb at
this point.  All the current commits regarding tlb_dynamic_lb are for
balance-tlb mode, so I don't think balance-alb with tlb_dynamic_lb set
to 0 is an intended usage.


> Going back and looking at your previous fix I'd argue that it is also
> wrong, you should've removed the mode check altogether to return the
> original behaviour where the dynamic_lb is set to 1 (enabled) by
> default and then ALB mode would've had it, of course that would've left
> the case of setting it to 0 in TLB mode and switching to ALB, but that
> is a different issue.

Maybe balance-alb shouldn't be dependent on tlb_dynamic_lb.
tlb_dynamic_lb is referenced in the following functions.

 + bond_do_alb_xmit()  -- Used by both balance-tlb and balance-alb
 + bond_tlb_xmit()  -- Only used by balance-tlb
 + bond_open()  -- Used by both balance-tlb and balance-alb
 + bond_check_params()  -- Used during module initialization
 + bond_fill_info()  -- Used to get/set value
 + bond_option_tlb_dynamic_lb_set()  -- Used to get/set value
 + bonding_show_tlb_dynamic_lb()  -- Used to get/set value
 + bond_is_nondyn_tlb()  -- Only referenced if balance-tlb mode

The following untested patch adds code to make balance-alb work as if
tlb_dynamic_lb==1 for the functions which affect balance-alb mode.  It
also reverts my previous patch.

What do you think about this approach?
---
Kosuke TATSUKAWA  | 1st Platform Software Division
  | NEC Solution Innovators
  | ta...@ab.jp.nec.com


 drivers/net/bonding/bond_alb.c  |6 --
 drivers/net/bonding/bond_main.c |5 +++--
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index c02cc81..a9229b5 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1325,7 +1325,8 @@ static int bond_do_alb_xmit(struct sk_buff *skb, struct 
bonding *bond,
if (!tx_slave) {
/* unbalanced or unassigned, send through primary */
tx_slave = rcu_dereference(bond->curr_active_slave);
-   if (bond->params.tlb_dynamic_lb)
+   if (bond->params.tlb_dynamic_lb ||
+   (BOND_MODE(bond) == BOND_MODE_ALB))
bond_info->unbalanced_load += skb->len;

Re: [PATCH] net: bonding: Fix transmit load balancing in balance-alb mode if specified by sysfs

2017-09-07 Thread महेश बंडेवार

On Thu, Sep 7, 2017 at 5:47 PM, Mahesh Bandewar (महेश बंडेवार)
 wrote:
> On Thu, Sep 7, 2017 at 5:39 PM, Mahesh Bandewar (महेश बंडेवार)
>  wrote:
>> On Thu, Sep 7, 2017 at 4:09 PM, Nikolay Aleksandrov
>>  wrote:
>>> On  7.09.2017 01:47, Kosuke Tatsukawa wrote:
 Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in
 balance-alb mode") tried to fix transmit dynamic load balancing in
 balance-alb mode, which wasn't working after commit 8b426dc54cf4
 ("bonding: remove hardcoded value").

 It turned out that my previous patch only fixed the case when
 balance-alb was specified as bonding module parameter, and not when
 balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
 common usage).  In the latter case, tlb_dynamic_lb was set up according
 to the default mode of the bonding interface, which happens to be
 balance-rr.

 This additional patch addresses this issue by setting up tlb_dynamic_lb
 to 1 if "mode" is set to balance-alb through the sysfs interface.

 I didn't add code to change tlb_balance_lb back to the default value for
 other modes, because "mode" is usually set up only once during
 initialization, and it's not worthwhile to change the static variable
 bonding_defaults in bond_main.c to a global variable just for this
 purpose.

 Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for
 balance-tlb mode if it is set up using the sysfs interface.  I didn't
 change that behavior, because the value of tlb_balance_lb can be changed
 using the sysfs interface for balance-tlb, and I didn't like changing
 the default value back and forth for balance-tlb.

 As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
 written to.  However, I think balance-alb with tlb_dynamic_lb set to 0
 is not an intended usage, so there is little use making it writable at
 this moment.

 Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value")
>>
>> This is wrong! I think you are confusing various things here. ALB is
>> different mode from TLB and TLB-dynamic-lb is *only* a special case of
>> TLB. Your earlier patch is also wrong for the same reasons. However,
>> since the default value of tlb_dynamic_lb is set to 0  it's not
>> causing issues for ALB mode otherwise it would break ALB mode.
> I take this back. The default value is 1 so ALB is broken because of
> the referenced change.
>
>> tlb_dynamic_lb has absolutely nothing to do with ALB mode. Please
>> revert the earlier change (cbf5ecb30560).
>>
>> It's not clear to me what you saw as broken, so can't really suggest
>> what really need to be done.
Please scratch all I said.

[PATCH net] perf/bpf: fix a clang compilation issue

2017-09-07 Thread Yonghong Song

clang does not support variable length array for structure member.
It has the following error during compilation:

kernel/trace/trace_syscalls.c:568:17: error: fields must have a constant size:
'variable length array in structure' extension will never be supported
unsigned long args[sys_data->nb_args];
  ^

The fix is to use a fixed array length instead.

Reported-by: Nick Desaulniers 
Signed-off-by: Yonghong Song 
---
 include/linux/syscalls.h  | 2 ++
 kernel/trace/trace_syscalls.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 88951b7..95606a2 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -200,6 +200,8 @@ static inline int is_syscall_trace_event(struct 
trace_event_call *tp_event)
 #define SYSCALL_DEFINE5(name, ...) SYSCALL_DEFINEx(5, _##name, __VA_ARGS__)
 #define SYSCALL_DEFINE6(name, ...) SYSCALL_DEFINEx(6, _##name, __VA_ARGS__)
 
+#define SYSCALL_DEFINE_MAXARGS 6
+
 #define SYSCALL_DEFINEx(x, sname, ...) \
SYSCALL_METADATA(sname, x, __VA_ARGS__) \
__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 9c4eef2..696afe7 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -565,7 +565,7 @@ static int perf_call_bpf_enter(struct bpf_prog *prog, 
struct pt_regs *regs,
struct syscall_tp_t {
unsigned long long regs;
unsigned long syscall_nr;
-   unsigned long args[sys_data->nb_args];
+   unsigned long args[SYSCALL_DEFINE_MAXARGS];
} param;
int i;
 
-- 
2.9.5

Re: Use after free in __dst_destroy_metrics_generic

2017-09-07 Thread Subash Abhinov Kasiviswanathan


Should be fixed by:

commit ad65a2f05695aced349e308193c6e2a6b1d87112
Author: Wei Wang 
Date:   Sat Jun 17 10:42:35 2017 -0700

ipv6: call dst_hold_safe() properly



Thanks for the info Stefano.

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

Re: [RFC net-next 0/5] TSN: Add qdisc-based config interfaces for traffic shapers

2017-09-07 Thread Vinicius Costa Gomes

Henrik Austad  writes:

> On Thu, Aug 31, 2017 at 06:26:20PM -0700, Vinicius Costa Gomes wrote:
>> Hi,
>>
>> This patchset is an RFC on a proposal of how the Traffic Control subsystem 
>> can
>> be used to offload the configuration of traffic shapers into network devices
>> that provide support for them in HW. Our goal here is to start upstreaming
>> support for features related to the Time-Sensitive Networking (TSN) set of
>> standards into the kernel.
>
> Nice to see that others are working on this as well! :)
>
> A short disclaimer; I'm pretty much anchored in the view "linux is the
> end-station in a TSN domain", is this your approach as well, or are you
> looking at this driver to be used in bridges as well? (because that will
> affect the comments on time-aware shaper and frame preemption)
>
> Yet another disclaimer; I am not a linux networking subsystem expert. Not
> by a long shot! There are black magic happening in the internals of the
> networking subsystem that I am not even aware of. So if something I say or
> ask does not make sense _at_all_, that's probably why..
>
> I do know a tiny bit about TSN though, and I have been messing around
> with it for a little while, hence my comments below
>
>> As part of this work, we've assessed previous public discussions related to 
>> TSN
>> enabling: patches from Henrik Austad (Cisco), the presentation from Eric Mann
>> at Linux Plumbers 2012, patches from Gangfeng Huang (National Instruments) 
>> and
>> the current state of the OpenAVNU project 
>> (https://github.com/AVnu/OpenAvnu/).
>
> /me eyes Cc ;p
>
>> Overview
>> 
>>
>> Time-sensitive Networking (TSN) is a set of standards that aim to address
>> resources availability for providing bandwidth reservation and bounded 
>> latency
>> on Ethernet based LANs. The proposal described here aims to cover mainly 
>> what is
>> needed to enable the following standards: 802.1Qat, 802.1Qav, 802.1Qbv and
>> 802.1Qbu.
>>
>> The initial target of this work is the Intel i210 NIC, but other controllers'
>> datasheet were also taken into account, like the Renesas RZ/A1H RZ/A1M group 
>> and
>> the Synopsis DesignWare Ethernet QoS controller.
>
> NXP has a TSN aware chip on the i.MX7 sabre board as well 

Cool. Will take a look.

>
>> Proposal
>> 
>>
>> Feature-wise, what is covered here are configuration interfaces for HW
>> implementations of the Credit-Based shaper (CBS, 802.1Qav), Time-Aware shaper
>> (802.1Qbv) and Frame Preemption (802.1Qbu). CBS is a per-queue shaper, while
>> Qbv and Qbu must be configured per port, with the configuration covering all
>> queues. Given that these features are related to traffic shaping, and that 
>> the
>> traffic control subsystem already provides a queueing discipline that 
>> offloads
>> config into the device driver (i.e. mqprio), designing new qdiscs for the
>> specific purpose of offloading the config for each shaper seemed like a good
>> fit.
>
> just to be clear, you register sch_cbs as a subclass to mqprio, not as a
> root class?

That's right.

>
>> For steering traffic into the correct queues, we use the socket option
>> SO_PRIORITY and then a mechanism to map priority to traffic classes / Tx 
>> queues.
>> The qdisc mqprio is currently used in our tests.
>
> Right, fair enough, I'd prefer the TSN qdisc to be the root-device and
> rather have mqprio for high priority traffic and another for 'everything
> else'', but this would work too. This is not that relevant at this stage I
> guess :)

That's a scenario I haven't considered, will give it some thought.

>
>> As for the shapers config interface:
>>
>>  * CBS (802.1Qav)
>>
>>This patchset is proposing a new qdisc called 'cbs'. Its 'tc' cmd line is:
>>$ tc qdisc add dev IFACE parent ID cbs locredit N hicredit M sendslope S \
>>  idleslope I
>
> So this confuses me a bit, why specify sendSlope?
>
> sendSlope = portTransmitRate - idleSlope
>
> and portTransmitRate is the speed of the MAC (which you get from the
> driver). Adding sendSlope here is just redundant I think.
>
> Also, does this mean that when you create the qdisc, you have locked the
> bandwidth for the scheduler? Meaning, if I later want to add another
> stream that requires more bandwidth, I have to close all active streams,
> reconfigure the qdisc and then restart?
>
>>Note that the parameters for this qdisc are the ones defined by the
>>802.1Q-2014 spec, so no hardware specific functionality is exposed here.
>
> You do need to know if the link is brought up as 100 or 1000 though - which
> the driver already knows.
>
>>  * Time-aware shaper (802.1Qbv):
>>
>>The idea we are currently exploring is to add a "time-aware", priority 
>> based
>>qdisc, that also exposes the Tx queues available and provides a mechanism 
>> for
>>mapping priority <-> traffic class <-> Tx queues in a similar fashion as
>>mqprio. We are calling this qdisc 'taprio', and its 'tc' cmd line would 
>> be:
>
> As

Re: [PATCH] rtl8xxxu: Don't printk raw binary if serial number is not burned in.

2017-09-07 Thread Stefano Brivio

On Fri,  8 Sep 2017 01:51:03 +0200
Adam Borowski  wrote:

> I assume that a blank efuse comes with all ones, thus I did not bother
> recognizing other possible junk values.  This matches 100% of dongles
> I've seen (a single Gembird 8192eu).
> 
> Signed-off-by: Adam Borowski 
> ---
>  drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c 
> b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> index 80fee699f58a..bdc37e7272ca 100644
> --- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> +++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
> @@ -614,7 +614,11 @@ static int rtl8192eu_parse_efuse(struct rtl8xxxu_priv 
> *priv)
>  
>   dev_info(>udev->dev, "Vendor: %.7s\n", efuse->vendor_name);
>   dev_info(>udev->dev, "Product: %.11s\n", efuse->device_name);
> - dev_info(>udev->dev, "Serial: %.11s\n", efuse->serial);
> + if (strncmp(efuse->serial,
> + "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff", 11))

You might want to use memchr_inv():

if (memchr_inv(efuse->serial, 0xff, 11))
dev_info(>udev->dev, "Serial: %.11s\n", efuse->serial);
...

Mostly cosmetic though.

--
Stefano

Re: [PATCH RFC 3/6] The file ksz_common.c contains common code shared by all KSZ switch drivers

2017-09-07 Thread Andrew Lunn

Hi Tristram

You patches are so badly mangled that they won't apply. So i'm
guessing a bit here...

What i think is you can do this in three patches:

1) Straight rename of ksz_common.c to ksz9477.c. Makefile changes to
   fit, no code changes.
2) Move common code back into ksz_common.c. Makefile changes to fit.
3) ksz9477.c prefix name changes

It looks like there is not too much common code in ksz_common.c. So
doing it this way makes the changes minimal, and things compile at
each stage.

Remember to use -M with git format-patch.

 Andrew

Re: Use after free in __dst_destroy_metrics_generic

2017-09-07 Thread Stefano Brivio

On Thu, 07 Sep 2017 18:52:02 -0600
Subash Abhinov Kasiviswanathan  wrote:

> We are seeing a possible use after free in ip6_dst_destroy.
> 
> It appears as if memory of the __DST_METRICS_PTR(old) was freed in some 
> path and allocated
> to ion driver. ion driver has also freed it. Finally the memory is freed 
> by the
> fib gc and crashes since it is already deallocated.
> 
> Target is running an ARM64 Android based 4.9 kernel.
> Issue was seen once on a regression rack (sorry, no reproducer).
> Any pointers to debug this is highly appreciated.
> 
> [ 3489.470581] [] object_err+0x4c/0x5c
> [ 3489.470586] [] free_debug_processing+0x2e0/0x398
> [ 3489.470589] [] __slab_free+0x300/0x3e0
> [ 3489.470593] [] kfree+0x28c/0x290
> [ 3489.470601] [] 
> __dst_destroy_metrics_generic+0x6c/0x78
> [ 3489.470609] [] ip6_dst_destroy+0xb0/0xb4

Should be fixed by:

commit ad65a2f05695aced349e308193c6e2a6b1d87112
Author: Wei Wang 
Date:   Sat Jun 17 10:42:35 2017 -0700

ipv6: call dst_hold_safe() properly


--
Stefano

Use after free in __dst_destroy_metrics_generic

2017-09-07 Thread Subash Abhinov Kasiviswanathan


We are seeing a possible use after free in ip6_dst_destroy.

It appears as if memory of the __DST_METRICS_PTR(old) was freed in some 
path and allocated
to ion driver. ion driver has also freed it. Finally the memory is freed 
by the

fib gc and crashes since it is already deallocated.

Target is running an ARM64 Android based 4.9 kernel.
Issue was seen once on a regression rack (sorry, no reproducer).
Any pointers to debug this is highly appreciated.

[ 3489.470581] [] object_err+0x4c/0x5c
[ 3489.470586] [] free_debug_processing+0x2e0/0x398
[ 3489.470589] [] __slab_free+0x300/0x3e0
[ 3489.470593] [] kfree+0x28c/0x290
[ 3489.470601] [] 
__dst_destroy_metrics_generic+0x6c/0x78

[ 3489.470609] [] ip6_dst_destroy+0xb0/0xb4
[ 3489.470612] [] dst_destroy+0x88/0x174
[ 3489.470616] [] icmp6_dst_gc+0x90/0xc0
[ 3489.470621] [] fib6_gc_timer_cb+0x40/0xc0
[ 3489.470630] [] call_timer_fn+0x58/0x1d0
[ 3489.470635] [] expire_timers+0x100/0x18c
[ 3489.470638] [] run_timer_softirq+0x98/0x270
[ 3489.470642] [] __do_softirq+0x150/0x438
[ 3489.470649] [] irq_exit+0xe0/0x138

[ 3489.127227] 
=
[ 3489.135489] BUG kmalloc-128 (Tainted: GW  O   ): Object 
already free
[ 3489.142591] 
-

[ 3489.142591]
[ 3489.152313] Disabling lock debugging due to kernel taint
[ 3489.157688] INFO: Allocated in alloc_largest_available+0x58/0x1f0 
age=17 cpu=4 pid=649

[ 3489.165667]  alloc_debug_processing+0x114/0x1a0
[ 3489.170233]  ___slab_alloc.constprop.72+0x654/0x690
[ 3489.175150]  __slab_alloc.isra.68.constprop.71+0x48/0x80
[ 3489.180505]  kmem_cache_alloc_trace+0x198/0x2c4
[ 3489.185073]  alloc_largest_available+0x58/0x1f0
[ 3489.189643]  ion_system_heap_allocate+0x1b0/0x6e8
[ 3489.194392]  __ion_alloc+0x180/0x988
[ 3489.198004]  ion_ioctl+0x26c/0x590
[ 3489.201437]  do_vfs_ioctl+0xcc/0x888
[ 3489.205037]  SyS_ioctl+0x90/0xa4
[ 3489.208298]  el0_svc_naked+0x24/0x28
[ 3489.211909] INFO: Freed in process_info+0x188/0x1bc age=21 cpu=4 
pid=649

[ 3489.218661]  free_debug_processing+0x180/0x398
[ 3489.223137]  __slab_free+0x300/0x3e0
[ 3489.226745]  kfree+0x28c/0x290
[ 3489.229827]  process_info+0x188/0x1bc
[ 3489.233526]  ion_system_heap_allocate+0x4b4/0x6e8
[ 3489.238266]  __ion_alloc+0x180/0x988
[ 3489.241875]  ion_ioctl+0x26c/0x590
[ 3489.245308]  do_vfs_ioctl+0xcc/0x888
[ 3489.248917]  SyS_ioctl+0x90/0xa4
[ 3489.252171]  el0_svc_naked+0x24/0x28

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project

Re: [PATCH] net: bonding: Fix transmit load balancing in balance-alb mode if specified by sysfs

2017-09-07 Thread महेश बंडेवार

On Thu, Sep 7, 2017 at 5:39 PM, Mahesh Bandewar (महेश बंडेवार)
 wrote:
> On Thu, Sep 7, 2017 at 4:09 PM, Nikolay Aleksandrov
>  wrote:
>> On  7.09.2017 01:47, Kosuke Tatsukawa wrote:
>>> Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in
>>> balance-alb mode") tried to fix transmit dynamic load balancing in
>>> balance-alb mode, which wasn't working after commit 8b426dc54cf4
>>> ("bonding: remove hardcoded value").
>>>
>>> It turned out that my previous patch only fixed the case when
>>> balance-alb was specified as bonding module parameter, and not when
>>> balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
>>> common usage).  In the latter case, tlb_dynamic_lb was set up according
>>> to the default mode of the bonding interface, which happens to be
>>> balance-rr.
>>>
>>> This additional patch addresses this issue by setting up tlb_dynamic_lb
>>> to 1 if "mode" is set to balance-alb through the sysfs interface.
>>>
>>> I didn't add code to change tlb_balance_lb back to the default value for
>>> other modes, because "mode" is usually set up only once during
>>> initialization, and it's not worthwhile to change the static variable
>>> bonding_defaults in bond_main.c to a global variable just for this
>>> purpose.
>>>
>>> Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for
>>> balance-tlb mode if it is set up using the sysfs interface.  I didn't
>>> change that behavior, because the value of tlb_balance_lb can be changed
>>> using the sysfs interface for balance-tlb, and I didn't like changing
>>> the default value back and forth for balance-tlb.
>>>
>>> As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
>>> written to.  However, I think balance-alb with tlb_dynamic_lb set to 0
>>> is not an intended usage, so there is little use making it writable at
>>> this moment.
>>>
>>> Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value")
>
> This is wrong! I think you are confusing various things here. ALB is
> different mode from TLB and TLB-dynamic-lb is *only* a special case of
> TLB. Your earlier patch is also wrong for the same reasons. However,
> since the default value of tlb_dynamic_lb is set to 0  it's not
> causing issues for ALB mode otherwise it would break ALB mode.
I take this back. The default value is 1 so ALB is broken because of
the referenced change.

> tlb_dynamic_lb has absolutely nothing to do with ALB mode. Please
> revert the earlier change (cbf5ecb30560).
>
> It's not clear to me what you saw as broken, so can't really suggest
> what really need to be done.

Re: [PATCH RFC 2/6] Create new file ksz9477.c from KSZ9477 code in ksz_common.c

2017-09-07 Thread Andrew Lunn

> +static void ksz_port_cfg(struct ksz_device *dev, int port, int offset, u8 
> bits,
> +  bool set)

Hi Tristram

As you move these functions around, it would be good to change the
prefix to ksz9477_

You have similar functions in the KSZ8795 driver. It is good to keep
the names different. This is useful when looking at stack dumps, etc.

Andrew

Re: [PATCH RFC 1/6] The file ksz_common.c will be used by other KSZ switch drivers.

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:08:58PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 

Hi Tristram

Another process thing you are missing. Patch subject should follow a
pattern:

git log --oneline drivers/net/dsa/mv88e6xxx
bb0a2675f72b net: dsa: mv88e6xxx: Enable CMODE config support for 6390X
b3e05aa12319 net: dsa: mv88e6xxx: add a multi_chip info flag
68b8f60cf70d net: dsa: mv88e6xxx: add Energy Detect ops
9069c13a4867 net: dsa: mv88e6xxx: add a global2_addr info flag
9e907d739cc3 net: dsa: mv88e6xxx: add POT operation
a2a05db8a5ed net: dsa: mv88e6xxx: add POT flag to 88E6390
51c901a77562 net: dsa: mv88e6xxx: distinguish Global 2 Rsvd2CPU
d6c5e6aff50c net: dsa: mv88e6xxx: add number of Global 2 IRQs
74e60241ce14 net: dsa: mv88e6xxx: remove 88E6185 G2 interrupt
2466f64ae4e9 net: dsa: mv88e6xxx: remove unused capabilities

Please set the subject line in your patches similarly.

   Andrew

Re: [PATCH] net: bonding: Fix transmit load balancing in balance-alb mode if specified by sysfs

2017-09-07 Thread महेश बंडेवार

On Thu, Sep 7, 2017 at 4:09 PM, Nikolay Aleksandrov
 wrote:
> On  7.09.2017 01:47, Kosuke Tatsukawa wrote:
>> Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in
>> balance-alb mode") tried to fix transmit dynamic load balancing in
>> balance-alb mode, which wasn't working after commit 8b426dc54cf4
>> ("bonding: remove hardcoded value").
>>
>> It turned out that my previous patch only fixed the case when
>> balance-alb was specified as bonding module parameter, and not when
>> balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
>> common usage).  In the latter case, tlb_dynamic_lb was set up according
>> to the default mode of the bonding interface, which happens to be
>> balance-rr.
>>
>> This additional patch addresses this issue by setting up tlb_dynamic_lb
>> to 1 if "mode" is set to balance-alb through the sysfs interface.
>>
>> I didn't add code to change tlb_balance_lb back to the default value for
>> other modes, because "mode" is usually set up only once during
>> initialization, and it's not worthwhile to change the static variable
>> bonding_defaults in bond_main.c to a global variable just for this
>> purpose.
>>
>> Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for
>> balance-tlb mode if it is set up using the sysfs interface.  I didn't
>> change that behavior, because the value of tlb_balance_lb can be changed
>> using the sysfs interface for balance-tlb, and I didn't like changing
>> the default value back and forth for balance-tlb.
>>
>> As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
>> written to.  However, I think balance-alb with tlb_dynamic_lb set to 0
>> is not an intended usage, so there is little use making it writable at
>> this moment.
>>
>> Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value")

This is wrong! I think you are confusing various things here. ALB is
different mode from TLB and TLB-dynamic-lb is *only* a special case of
TLB. Your earlier patch is also wrong for the same reasons. However,
since the default value of tlb_dynamic_lb is set to 0  it's not
causing issues for ALB mode otherwise it would break ALB mode.
tlb_dynamic_lb has absolutely nothing to do with ALB mode. Please
revert the earlier change (cbf5ecb30560).

It's not clear to me what you saw as broken, so can't really suggest
what really need to be done.

ATENCIÓN;

2017-09-07 Thread administrador

ATENCIÓN;

Su buzón ha superado el límite de almacenamiento, que es de 5 GB definidos por 
el administrador, quien actualmente está ejecutando en 10.9GB, no puede ser 
capaz de enviar o recibir correo nuevo hasta que vuelva a validar su buzón de 
correo electrónico. Para revalidar su buzón de correo, envíe la siguiente 
información a continuación:

nombre: 
Nombre de usuario: 
contraseña:
Confirmar contraseña:
E-mail: 
teléfono:
Si usted no puede revalidar su buzón, el buzón se deshabilitará!

Disculpa las molestias.
Código de verificación: es: 006524
Correo Soporte Técnico © 2016

¡gracias
Sistemas administrador

Re: WARN_ON() in __nf_hook_entries_try_shrink()

2017-09-07 Thread Stefano Brivio

On Thu, 7 Sep 2017 17:01:12 -0700
Linus Torvalds  wrote:

> On shutdown I get this (edited down a bit to be more legible):
> 
>   Stopping firewalld - dynamic firewall daemon...
>   NETFILTER_CFG table=nat family=2 entries=55
>   NETFILTER_CFG table=mangle family=2 entries=40
>   NETFILTER_CFG table=raw family=2 entries=28
>   NETFILTER_CFG table=security family=2 entries=13
>   NETFILTER_CFG table=filter family=2 entries=93
>   [ cut here ]
>   WARNING: CPU: 1 PID: 21512 at net/netfilter/core.c:218
> __nf_hook_entries_try_shrink+0x106/0x130

Should be fixed by:

http://patchwork.ozlabs.org/patch/810570/

--
Stefano

WARN_ON() in __nf_hook_entries_try_shrink()

2017-09-07 Thread Linus Torvalds

On shutdown I get this (edited down a bit to be more legible):

  Stopping firewalld - dynamic firewall daemon...
  NETFILTER_CFG table=nat family=2 entries=55
  NETFILTER_CFG table=mangle family=2 entries=40
  NETFILTER_CFG table=raw family=2 entries=28
  NETFILTER_CFG table=security family=2 entries=13
  NETFILTER_CFG table=filter family=2 entries=93
  [ cut here ]
  WARNING: CPU: 1 PID: 21512 at net/netfilter/core.c:218
__nf_hook_entries_try_shrink+0x106/0x130
  CPU: 1 PID: 21512 Comm: iptables-restor Not tainted
4.13.0-08555-gc0da4fa0d1a5 #7
  Hardware name: Dell Inc. XPS 13 9350/09JHRY, BIOS 1.4.17 05/10/2017
  RIP: 0010:__nf_hook_entries_try_shrink+0x106/0x130
  Call Trace:
 nf_unregister_net_hooks+0x117/0x240
 ipv4_hooks_unregister+0x60/0x70 [nf_conntrack_ipv4]
 nf_ct_netns_put+0x48/0x80 [nf_conntrack]
 conntrack_mt_destroy+0x15/0x20 [xt_conntrack]
 cleanup_match+0x43/0x70
 cleanup_entry+0x42/0xc0
 __do_replace+0x17a/0x1f0
 do_ipt_set_ctl+0x146/0x1b0
 nf_setsockopt+0x46/0x80
 ip_setsockopt+0x82/0xb0
 raw_setsockopt+0x34/0x40
 sock_common_setsockopt+0x14/0x20
 SyS_setsockopt+0x80/0xe0
 entry_SYSCALL_64_fastpath+0x13/0x94
[ .. warning repeats a few times .. ]
  ---[ end trace 56a6f5b20d97161d ]---
  NETFILTER_CFG table=broute family=7 entries=0
  NETFILTER_CFG table=nat family=7 entries=0
  NETFILTER_CFG table=filter family=7 entries=0
  NETFILTER_CFG table=mangle family=2 entries=6

and some searching notes that the kernel test robot already reported
this a few days ago but nobody reacted.

The kernel test robot seems to blame commit d3ad2c17b404 ("netfilter:
core: batch nf_unregister_net_hooks synchronize_net calls").

Hmm?

   Linus

[PATCH] rtl8xxxu: Don't printk raw binary if serial number is not burned in.

2017-09-07 Thread Adam Borowski

I assume that a blank efuse comes with all ones, thus I did not bother
recognizing other possible junk values.  This matches 100% of dongles
I've seen (a single Gembird 8192eu).

Signed-off-by: Adam Borowski 
---
 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c 
b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
index 80fee699f58a..bdc37e7272ca 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_8192e.c
@@ -614,7 +614,11 @@ static int rtl8192eu_parse_efuse(struct rtl8xxxu_priv 
*priv)
 
dev_info(>udev->dev, "Vendor: %.7s\n", efuse->vendor_name);
dev_info(>udev->dev, "Product: %.11s\n", efuse->device_name);
-   dev_info(>udev->dev, "Serial: %.11s\n", efuse->serial);
+   if (strncmp(efuse->serial,
+   "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff", 11))
+   dev_info(>udev->dev, "Serial: %.11s\n", efuse->serial);
+   else
+   dev_info(>udev->dev, "Serial not available.\n");
 
if (rtl8xxxu_debug & RTL8XXXU_DEBUG_EFUSE) {
unsigned char *raw = priv->efuse_wifi.raw;
-- 
2.14.1

Re: [PATCH] net: bonding: Fix transmit load balancing in balance-alb mode if specified by sysfs

2017-09-07 Thread Nikolay Aleksandrov

On  7.09.2017 01:47, Kosuke Tatsukawa wrote:
> Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in
> balance-alb mode") tried to fix transmit dynamic load balancing in
> balance-alb mode, which wasn't working after commit 8b426dc54cf4
> ("bonding: remove hardcoded value").
> 
> It turned out that my previous patch only fixed the case when
> balance-alb was specified as bonding module parameter, and not when
> balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
> common usage).  In the latter case, tlb_dynamic_lb was set up according
> to the default mode of the bonding interface, which happens to be
> balance-rr.
> 
> This additional patch addresses this issue by setting up tlb_dynamic_lb
> to 1 if "mode" is set to balance-alb through the sysfs interface.
> 
> I didn't add code to change tlb_balance_lb back to the default value for
> other modes, because "mode" is usually set up only once during
> initialization, and it's not worthwhile to change the static variable
> bonding_defaults in bond_main.c to a global variable just for this
> purpose.
> 
> Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for
> balance-tlb mode if it is set up using the sysfs interface.  I didn't
> change that behavior, because the value of tlb_balance_lb can be changed
> using the sysfs interface for balance-tlb, and I didn't like changing
> the default value back and forth for balance-tlb.
> 
> As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
> written to.  However, I think balance-alb with tlb_dynamic_lb set to 0
> is not an intended usage, so there is little use making it writable at
> this moment.
> 
> Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value")
> Reported-by: Reinis Rozitis 
> Signed-off-by: Kosuke Tatsukawa 
> Cc: sta...@vger.kernel.org  # v4.12+
> ---
>  drivers/net/bonding/bond_options.c |3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 

I don't believe this to be the right solution, hardcoding it like this
changes user-visible behaviour. The issue is that if someone configured
it to be 0 in tlb mode, suddenly it will become 1 and will silently
override their config if they switch the mode to alb. Also it robs users
from their choice.

If you think this should be settable in ALB mode, then IMO you should
edit tlb_dynamic_lb option's unsuppmodes and allow it to be set in ALB.
That would also be consistent with how it's handled in TLB mode.

Going back and looking at your previous fix I'd argue that it is also
wrong, you should've removed the mode check altogether to return the
original behaviour where the dynamic_lb is set to 1 (enabled) by
default and then ALB mode would've had it, of course that would've left
the case of setting it to 0 in TLB mode and switching to ALB, but that
is a different issue.

Cheers,
 Nik

Re: [PATCH RFC 2/5] Add KSZ8795 switch driver support in Makefile

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 10:29:34PM +, tristram...@microchip.com wrote:
> > -Original Message-
> > From: Andrew Lunn [mailto:and...@lunn.ch]
> > Sent: Thursday, September 07, 2017 2:56 PM
> > To: Tristram Ha - C24268
> > Cc: muva...@gmail.com; pa...@ucw.cz; nathan.leigh.con...@gmail.com;
> > vivien.dide...@savoirfairelinux.com; f.faine...@gmail.com;
> > netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Woojung Huh - C21699
> > Subject: Re: [PATCH RFC 2/5] Add KSZ8795 switch driver support in Makefile
> > 
> > On Thu, Sep 07, 2017 at 09:17:10PM +, tristram...@microchip.com wrote:
> > > From: Tristram Ha 
> > >
> > > Add KSZ8795 switch support with SPI access.
> > >
> > > Signed-off-by: Tristram Ha 
> > > ---
> > > diff --git a/drivers/net/dsa/microchip/Makefile
> > b/drivers/net/dsa/microchip/Makefile
> > > index 0961c30..0d8ba48 100644
> > > --- a/drivers/net/dsa/microchip/Makefile
> > > +++ b/drivers/net/dsa/microchip/Makefile
> > > @@ -1,2 +1,4 @@
> > >  obj-$(CONFIG_MICROCHIP_KSZ)  += ksz9477.o ksz_common.o
> > >  obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER)   += ksz_spi.o
> > > +obj-$(CONFIG_MICROCHIP_KSZ8795)  += ksz8795.o ksz_common.o
> > > +obj-$(CONFIG_MICROCHIP_KSZ8795_SPI_DRIVER)   += ksz8795_spi.o
> > 
> > I've not tried it, but i think this breaks the build
> > 
> >  Andrew
> 
> So you would like to have all 5 patches in 1 patch file?

Or maybe this one last? Would that stop the build from breaking?

   Andrew

Re: [PATCH RFC 2/6] Create new file ksz9477.c from KSZ9477 code in ksz_common.c

2017-09-07 Thread Andrew Lunn

> Sorry about that.  It seems my e-mail system wraps the line too soon.

git send-email should solve your problems.

Andrew

Re: [PATCH RFC 1/6] The file ksz_common.c will be used by other KSZ switch drivers.

2017-09-07 Thread Andrew Lunn

> > > Signed-off-by: Tristram Ha 
> > > ---
> > > diff --git a/drivers/net/dsa/microchip/Makefile
> > > b/drivers/net/dsa/microchip/Makefile
> > > index ed335e2..0961c30 100644
> > > --- a/drivers/net/dsa/microchip/Makefile
> > > +++ b/drivers/net/dsa/microchip/Makefile
> > > @@ -1,2 +1,2 @@
> > > -obj-$(CONFIG_MICROCHIP_KSZ)  += ksz_common.o
> > > +obj-$(CONFIG_MICROCHIP_KSZ)  += ksz9477.o ksz_common.o
> > >  obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER)   += ksz_spi.o
> > 
> > Hi Tristram
> > 
> > I would of thought this would break the build. You don't add ksz9477.c 
> > until the
> > next patch.
> > 
> > Each patch needs to compile, otherwise you break git bisect.
> > 
> >  Andrew
> 

> Eventually the file will need to be broken in two, so you would like
> to see all 3 changes (Makefile, ksz_common.c, and ksz9477.c) in 1
> patch file?

You cannot break the build. Each patch must compile on its own.

Breaking changes up into smaller chunks is good. Makes it easier to
review.  So think about how you can do it without breaking the build,
but have smaller changes. For example, move a group of functions at a
time?

Andrew

Re: latest netdev net-next kernel panic

2017-09-07 Thread Paweł Staszewski

What is weird - changed in bios and enabled p-states and no panic from 
30 minutes


w
 00:38:07 up 33 min,  2 users,  load average: 0.10, 0.13, 0.17
USER TTY    LOGIN@   IDLE   JCPU   PCPU WHAT

So the problem can be related to some cpu weird stuff with c/p states or 
it is random.




W dniu 2017-09-08 o 00:08, Paweł Staszewski pisze:

Also this panic occured about 3 minutes after traffic rise to 40G

before host was working without trafic for about 30minutes - after 
traffic start to flow this bug start.


Now i have this host without traffic and it is working for almost 1 
hour :)




W dniu 2017-09-07 o 23:47, Paweł Staszewski pisze:

Hi Eric


today upgraded some host from 4.13.0-rc7+ to latest net-next from git

and run some tests simulate clients traffic 40Gbit / 5Mpps mixed tcp 
+ udp


There is also bgpd running that feeds FIB with full bgp feed - about 
700k prefixes (so routing tables is about 700k routes with about 300 
nexthops)



There is also simulated about 1000 ip neigh hosts (arp) with about 
300 ip interfaces assigned on vlans




With kernel 4.13.0-rc7+ (git from 05.09.2017) all tests are good 
(besides that there is panic after about 10 hours - every time same 
like in this link:


https://bugzilla.kernel.org/attachment.cgi?id=258257

But looks like acpi related or c/p states with new intel cpus... dono 
now.





W dniu 2017-09-07 o 23:39, Eric Dumazet pisze:

On Thu, 2017-09-07 at 23:30 +0200, Paweł Staszewski wrote:

I want to mention also here cause last trace for RIP was dst_dev_put

https://bugzilla.kernel.org/attachment.cgi?id=258259

Can you provide a bit more context ?

Thanks !

Re: [PATCH net] bpf: don't select potentially stale ri->map from buggy xdp progs

2017-09-07 Thread Alexei Starovoitov


On 9/7/17 3:14 PM, Daniel Borkmann wrote:

Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer 
Signed-off-by: Daniel Borkmann 
Signed-off-by: John Fastabend 
---
 kernel/bpf/verifier.c | 16 
 net/core/filter.c | 21 +++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d690c7d..477b693 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4203,6 +4203,22 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
continue;
}

+   if (insn->imm == BPF_FUNC_redirect_map) {
+   u64 addr = (unsigned long)prog;
+   struct bpf_insn r4_ld[] = {
+   BPF_LD_IMM64(BPF_REG_4, addr),
+   *insn,
+   };
+   cnt = ARRAY_SIZE(r4_ld);
+
+   new_prog = bpf_patch_insn_data(env, i + delta, r4_ld, 
cnt);


that's a neat trick.
I think we'll be seeing more of such pattern in the future.
Definitely less intrusive fix than asking drivers or net/core
to clear it.
Acked-by: Alexei Starovoitov

Re: [PATCH RFC 3/5] Add KSZ8795 switch driver

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:17:16PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 
> 
> Add KSZ8795 switch support with function code.
> 
> Signed-off-by: Tristram Ha 
> ---
> diff --git a/drivers/net/dsa/microchip/ksz8795.c 
> b/drivers/net/dsa/microchip/ksz8795.c
> new file mode 100644
> index 000..e4d4e6a
> --- /dev/null
> +++ b/drivers/net/dsa/microchip/ksz8795.c
> @@ -0,0 +1,2066 @@
> +/*
> + * Microchip KSZ8795 switch driver
> + *
> + * Copyright (C) 2017 Microchip Technology Inc.
> + *   Tristram Ha 
> + *
> + * Permission to use, copy, modify, and/or distribute this software for any
> + * purpose with or without fee is hereby granted, provided that the above
> + * copyright notice and this permission notice appear in all copies.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "ksz_priv.h"
> +#include "ksz8795.h"
> +
> +/**
> + * Some counters do not need to be read too often because they are less 
> likely
> + * to increase much.
> + */

What does comment mean? Are you caching statistics, and updating
different values at different rates?

> +static const struct {
> + int interval;
> + char string[ETH_GSTRING_LEN];
> +} mib_names[TOTAL_SWITCH_COUNTER_NUM] = {
> + { 1, "rx_hi" },
> + { 4, "rx_undersize" },
> + { 4, "rx_fragments" },
> + { 4, "rx_oversize" },
> + { 4, "rx_jabbers" },
> + { 4, "rx_symbol_err" },
> + { 4, "rx_crc_err" },
> + { 4, "rx_align_err" },
> + { 4, "rx_mac_ctrl" },
> + { 1, "rx_pause" },
> + { 1, "rx_bcast" },
> + { 1, "rx_mcast" },
> + { 1, "rx_ucast" },
> + { 2, "rx_64_or_less" },
> + { 2, "rx_65_127" },
> + { 2, "rx_128_255" },
> + { 2, "rx_256_511" },
> + { 2, "rx_512_1023" },
> + { 2, "rx_1024_1522" },
> + { 2, "rx_1523_2000" },
> + { 2, "rx_2001" },
> + { 1, "tx_hi" },
> + { 4, "tx_late_col" },
> + { 1, "tx_pause" },
> + { 1, "tx_bcast" },
> + { 1, "tx_mcast" },
> + { 1, "tx_ucast" },
> + { 4, "tx_deferred" },
> + { 4, "tx_total_col" },
> + { 4, "tx_exc_col" },
> + { 4, "tx_single_col" },
> + { 4, "tx_mult_col" },
> + { 1, "rx_total" },
> + { 1, "tx_total" },
> + { 2, "rx_discards" },
> + { 2, "tx_discards" },
> +};
> +
> +static void ksz_cfg(struct ksz_device *dev, u32 addr, u8 bits, bool set)
> +{
> + u8 data;
> +
> + ksz_read8(dev, addr, );
> + if (set)
> + data |= bits;
> + else
> + data &= ~bits;
> + ksz_write8(dev, addr, data);
> +}

Shouldn't this function be in the shared code? There is an exact copy
currently in ksz_common.c.

> +static void ksz_port_cfg(struct ksz_device *dev, int port, int offset, u8 
> bits,
> +  bool set)
> +{
> + u32 addr;
> + u8 data;
> +
> + addr = PORT_CTRL_ADDR(port, offset);
> + ksz_read8(dev, addr, );
> +
> + if (set)
> + data |= bits;
> + else
> + data &= ~bits;
> +
> + ksz_write8(dev, addr, data);
> +}

And this also appears to be shared.

> +static void port_r_mib_cnt(struct ksz_device *dev, int port, u16 addr, u64 
> *cnt)

It would be good to use the ksz8795_ prefix for functions specific to
the KSZ8795.

> +{
> + u32 data;
> + u16 ctrl_addr;
> + u8 check;
> + int timeout;
> +
> + ctrl_addr = addr + SWITCH_COUNTER_NUM * port;
> +
> + ctrl_addr |= IND_ACC_TABLE(TABLE_MIB | TABLE_READ);
> + mutex_lock(>stats_mutex);
> + ksz_write16(dev, REG_IND_CTRL_0, ctrl_addr);
> +
> + for (timeout = 1; timeout > 0; timeout--) {

A rather short timeount!

> + ksz_read8(dev, REG_IND_MIB_CHECK, );
> +
> + if (check & MIB_COUNTER_VALID) {
> + ksz_read32(dev, REG_IND_DATA_LO, );
> + if (check & MIB_COUNTER_OVERFLOW)
> + *cnt += MIB_COUNTER_VALUE + 1;
> + *cnt += data & MIB_COUNTER_VALUE;
> + break;
> + }

Should there be a dev_err() here about timing out?


> + }
> + mutex_unlock(>stats_mutex);
> +}
> +
> +static int valid_dyn_entry(struct ksz_device *dev, u8 *data)
> +{
> + int timeout = 100;
> +
> + do {
> + ksz_read8(dev, REG_IND_DATA_CHECK,

RE: [PATCH RFC 2/5] Add KSZ8795 switch driver support in Makefile

2017-09-07 Thread Tristram.Ha

> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Thursday, September 07, 2017 2:56 PM
> To: Tristram Ha - C24268
> Cc: muva...@gmail.com; pa...@ucw.cz; nathan.leigh.con...@gmail.com;
> vivien.dide...@savoirfairelinux.com; f.faine...@gmail.com;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Woojung Huh - C21699
> Subject: Re: [PATCH RFC 2/5] Add KSZ8795 switch driver support in Makefile
> 
> On Thu, Sep 07, 2017 at 09:17:10PM +, tristram...@microchip.com wrote:
> > From: Tristram Ha 
> >
> > Add KSZ8795 switch support with SPI access.
> >
> > Signed-off-by: Tristram Ha 
> > ---
> > diff --git a/drivers/net/dsa/microchip/Makefile
> b/drivers/net/dsa/microchip/Makefile
> > index 0961c30..0d8ba48 100644
> > --- a/drivers/net/dsa/microchip/Makefile
> > +++ b/drivers/net/dsa/microchip/Makefile
> > @@ -1,2 +1,4 @@
> >  obj-$(CONFIG_MICROCHIP_KSZ)+= ksz9477.o ksz_common.o
> >  obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER) += ksz_spi.o
> > +obj-$(CONFIG_MICROCHIP_KSZ8795)+= ksz8795.o ksz_common.o
> > +obj-$(CONFIG_MICROCHIP_KSZ8795_SPI_DRIVER) += ksz8795_spi.o
> 
> I've not tried it, but i think this breaks the build
> 
>  Andrew

So you would like to have all 5 patches in 1 patch file?

RE: [PATCH RFC 1/6] The file ksz_common.c will be used by other KSZ switch drivers.

2017-09-07 Thread Tristram.Ha

> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Thursday, September 07, 2017 2:25 PM
> To: Tristram Ha - C24268
> Cc: muva...@gmail.com; pa...@ucw.cz; nathan.leigh.con...@gmail.com;
> vivien.dide...@savoirfairelinux.com; f.faine...@gmail.com;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Woojung Huh - C21699
> Subject: Re: [PATCH RFC 1/6] The file ksz_common.c will be used by other KSZ
> switch drivers.
> 
> On Thu, Sep 07, 2017 at 09:08:58PM +, tristram...@microchip.com wrote:
> > From: Tristram Ha 
> >
> > Break ksz_common.c into 2 files so that the common code can be used by other
> KSZ switch drivers.
> >
> > Signed-off-by: Tristram Ha 
> > ---
> > diff --git a/drivers/net/dsa/microchip/Makefile
> > b/drivers/net/dsa/microchip/Makefile
> > index ed335e2..0961c30 100644
> > --- a/drivers/net/dsa/microchip/Makefile
> > +++ b/drivers/net/dsa/microchip/Makefile
> > @@ -1,2 +1,2 @@
> > -obj-$(CONFIG_MICROCHIP_KSZ)+= ksz_common.o
> > +obj-$(CONFIG_MICROCHIP_KSZ)+= ksz9477.o ksz_common.o
> >  obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER) += ksz_spi.o
> 
> Hi Tristram
> 
> I would of thought this would break the build. You don't add ksz9477.c until 
> the
> next patch.
> 
> Each patch needs to compile, otherwise you break git bisect.
> 
>  Andrew

Eventually the file will need to be broken in two, so you would like to see all 
3 changes (Makefile, ksz_common.c, and ksz9477.c) in 1 patch file?

RE: [PATCH RFC 2/6] Create new file ksz9477.c from KSZ9477 code in ksz_common.c

2017-09-07 Thread Tristram.Ha

> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Thursday, September 07, 2017 2:33 PM
> To: Tristram Ha - C24268
> Cc: muva...@gmail.com; pa...@ucw.cz; nathan.leigh.con...@gmail.com;
> vivien.dide...@savoirfairelinux.com; f.faine...@gmail.com;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Woojung Huh - C21699
> Subject: Re: [PATCH RFC 2/6] Create new file ksz9477.c from KSZ9477 code in
> ksz_common.c
> 
> On Thu, Sep 07, 2017 at 09:09:04PM +, tristram...@microchip.com wrote:
> > From: Tristram Ha 
> >
> > Create new ksz9477.c file from original ksz_common.c.
> >
> > Signed-off-by: Tristram Ha 
> > ---
> > diff --git a/drivers/net/dsa/microchip/ksz9477.c
> > b/drivers/net/dsa/microchip/ksz9477.c
> > new file mode 100644
> > index 000..bc722b4
> > --- /dev/null
> > +++ b/drivers/net/dsa/microchip/ksz9477.c
> > @@ -0,0 +1,1317 @@
> > +/*
> > + * Microchip switch driver main logic
> > + *
> > + * Copyright (C) 2017
> > + *
> > + * Permission to use, copy, modify, and/or distribute this software
> > +for any
> > + * purpose with or without fee is hereby granted, provided that the
> > +above
> 
> Tristram
> 
> It looks like something hand mangled this comment. "any" and "above"
> appear to be on a line on there own.
> 
> > + * copyright notice and this permission notice appear in all copies.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
> > +WARRANTIES
> > + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
> OF
> > + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE
> > +LIABLE FOR
> > + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY
> > +DAMAGES
> > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
> IN
> > +AN
> > + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
> ARISING
> > +OUT OF
> > + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "ksz_priv.h"
> > +#include "ksz_9477_reg.h"
> > +
> > +static const struct {
> > +   int index;
> > +   char string[ETH_GSTRING_LEN];
> > +} mib_names[TOTAL_SWITCH_COUNTER_NUM] = {
> > +   { 0x00, "rx_hi" },
> > +   { 0x01, "rx_undersize" },
> > +   { 0x02, "rx_fragments" },
> > +   { 0x03, "rx_oversize" },
> > +   { 0x04, "rx_jabbers" },
> > +   { 0x05, "rx_symbol_err" },
> > +   { 0x06, "rx_crc_err" },
> > +   { 0x07, "rx_align_err" },
> > +   { 0x08, "rx_mac_ctrl" },
> > +   { 0x09, "rx_pause" },
> > +   { 0x0A, "rx_bcast" },
> > +   { 0x0B, "rx_mcast" },
> > +   { 0x0C, "rx_ucast" },
> > +   { 0x0D, "rx_64_or_less" },
> > +   { 0x0E, "rx_65_127" },
> > +   { 0x0F, "rx_128_255" },
> > +   { 0x10, "rx_256_511" },
> > +   { 0x11, "rx_512_1023" },
> > +   { 0x12, "rx_1024_1522" },
> > +   { 0x13, "rx_1523_2000" },
> > +   { 0x14, "rx_2001" },
> > +   { 0x15, "tx_hi" },
> > +   { 0x16, "tx_late_col" },
> > +   { 0x17, "tx_pause" },
> > +   { 0x18, "tx_bcast" },
> > +   { 0x19, "tx_mcast" },
> > +   { 0x1A, "tx_ucast" },
> > +   { 0x1B, "tx_deferred" },
> > +   { 0x1C, "tx_total_col" },
> > +   { 0x1D, "tx_exc_col" },
> > +   { 0x1E, "tx_single_col" },
> > +   { 0x1F, "tx_mult_col" },
> > +   { 0x80, "rx_total" },
> > +   { 0x81, "tx_total" },
> > +   { 0x82, "rx_discards" },
> > +   { 0x83, "tx_discards" },
> > +};
> > +
> > +static void ksz_cfg(struct ksz_device *dev, u32 addr, u8 bits, bool
> > +set) {
> 
> More mangling? Should set be on the end of the previous line?
> 
> > +static void read_table(struct dsa_switch *ds, u32 *table) {
> > +   struct ksz_device *dev = ds->priv;
> > +
> > +   ksz_read32(dev, REG_SW_ALU_VAL_A, [0]);
> > +   ksz_read32(dev, REG_SW_ALU_VAL_B, [1]);
> > +   ksz_read32(dev, REG_SW_ALU_VAL_C, [2]);
> > +   ksz_read32(dev, REG_SW_ALU_VAL_D, [3]); }
> > +
> > +static void write_table(struct dsa_switch *ds, u32 *table) {
> > +   struct ksz_device *dev = ds->priv;
> > +
> > +   ksz_write32(dev, REG_SW_ALU_VAL_A, table[0]);
> > +   ksz_write32(dev, REG_SW_ALU_VAL_B, table[1]);
> > +   ksz_write32(dev, REG_SW_ALU_VAL_C, table[2]);
> > +   ksz_write32(dev, REG_SW_ALU_VAL_D, table[3]); }
> 
> More mangling? } at the end of a line?
> 
> I will stop reading now and wait for a v2 this is not corrupt.
> 
>   Andrew

Sorry about that.  It seems my e-mail system wraps the line too soon.

RE: [Intel-wired-lan] [PATCH] e1000e: changed some expensive calls of udelay to usleep_range

2017-09-07 Thread Brown, Aaron F

> From: Intel-wired-lan [mailto:intel-wired-lan-boun...@osuosl.org] On
> Behalf Of Pavel Machek
> Sent: Monday, September 4, 2017 9:26 AM
> To: Matthew Tan 
> Cc: michael.kardo...@nxp.com; Williams, Mitch A
> ; linux-ker...@vger.kernel.org;
> john.ronc...@intel.com; intel-wired-...@lists.osuosl.org;
> netdev@vger.kernel.org
> Subject: Re: [Intel-wired-lan] [PATCH] e1000e: changed some expensive calls
> of udelay to usleep_range
> 
> Hi!
> 
> > @@ -183,7 +183,7 @@ s32 e1000e_read_phy_reg_mdic(struct e1000_hw
> *hw, u32 offset, u16 *data)
> >  * reading duplicate data in the next MDIC transaction.
> >  */
> > if (hw->mac.type == e1000_pch2lan)
> > -   udelay(100);
> > +   usleep_range(90, 100);
> >
> > return 0;
> >  }
> 
> Can you explain why shortening the delay is acceptable here?

Maybe it's not.

This patch is causing speed / duplex tests to fail on several of my test 
systems.  Specifically a Lenova laptop with an 82577 and a NUC with an i218 
(though that does not mean it is limited to those or that it's not related to 
the individual link partner.)

In both cases they "blow up" when attempting to link at 10 Mb with a Call Trace 
on the console / log and a watchdog: BUG: soft lockup - CPU#X stuck appearing 
in the current login session.  The simplest way to produce the crash is simply 
load the interface and attempt to bring it up with the link partner configured 
to 10 Mb half (force or autoneg) or forced to 10 Mb full (autoneg to 10 Mb full 
did not blow up.)  The problem is also appearing at some, but not all 100 Mb 
speeds and duplexes.  100 Mb also crashes, apparently with the same pattern.  
With the link partner set to 100 Mb half Autoneg on, 100 Mb half forced and 100 
Mb full forced the system crashes, 100 Full autoneg seems to work fine.

The call trace repeats every 30 seconds, is captured in the log and looks a lot 
like this:

Sep  7 14:17:10 u1459 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! 
[kworker/2:0:23]
Sep  7 14:17:10 u1459 kernel: Modules linked in: e1000e ptp pps_core 
ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 
nf_defrag_ipv4 xt_state nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 
xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc nfsd lockd 
grace nfs_acl auth_rpcgss sunrpc autofs4 ipv6 crc_ccitt dm_mirror 
dm_region_hash dm_log vhost_net vhost tap tun kvm_intel kvm irqbypass uinput 
joydev ax88179_178a usbnet mii iTCO_wdt iTCO_vendor_support sg i2c_i801 lpc_ich 
mfd_core xhci_pci snd_hda_codec_realtek xhci_hcd snd_hda_codec_hdmi 
snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep 
snd_seq snd_seq_device snd_pcm snd_timer snd soundcore dm_mod(E) dax(E) ext4(E) 
jbd2(E) mbcache(E) sd_mod(E) ahci(E) libahci(E) i915(E) drm_kms_helper(E) 
drm(E) fb_sys_fops(E)
Sep  7 14:17:10 u1459 kernel: sysimgblt(E) sysfillrect(E) syscopyarea(E) 
i2c_algo_bit(E) i2c_core(E) iosf_mbi(E) video(E)
Sep  7 14:17:10 u1459 kernel: CPU: 2 PID: 23 Comm: kworker/2:0 Tainted: G   
 EL  4.13.0-rc6_next-queue_dev-queue_regress #38
Sep  7 14:17:10 u1459 kernel: Hardware name:  /NUC5i5MYBE, BIOS 
MYBDWi5v.86A.0030.2016.0527.1657 05/27/2016
Sep  7 14:17:10 u1459 kernel: Workqueue: events linkwatch_event
Sep  7 14:17:10 u1459 kernel: task: 88024cf84680 task.stack: 
c9d44000
Sep  7 14:17:10 u1459 kernel: RIP: 0010:queued_spin_lock_slowpath+0x56/0x1d0
Sep  7 14:17:10 u1459 kernel: RSP: 0018:c9d478c8 EFLAGS: 0202 
ORIG_RAX: ff10
Sep  7 14:17:10 u1459 kernel: RAX: 0101 RBX: 880239fe8000 RCX: 
0101
Sep  7 14:17:10 u1459 kernel: RDX: c9d47948 RSI: 0001 RDI: 
880239feb1a0
Sep  7 14:17:10 u1459 kernel: RBP: c9d47968 R08: 0001 R09: 
88024ce181a4
Sep  7 14:17:10 u1459 kernel: R10:  R11:  R12: 
880239fe8840
Sep  7 14:17:10 u1459 kernel: R13: 88024ce180e4 R14: 88024ce180e4 R15: 
c9d47a48
Sep  7 14:17:10 u1459 kernel: FS:  () 
GS:880255d0() knlGS:
Sep  7 14:17:10 u1459 kernel: CS:  0010 DS:  ES:  CR0: 80050033
Sep  7 14:17:10 u1459 kernel: CR2: 7ffd4c3579b8 CR3: 01c09000 CR4: 
003406e0
Sep  7 14:17:10 u1459 kernel: DR0:  DR1:  DR2: 

Sep  7 14:17:10 u1459 kernel: DR3:  DR6: fffe0ff0 DR7: 
0400
Sep  7 14:17:10 u1459 kernel: Call Trace:
Sep  7 14:17:10 u1459 kernel: ? netlink_broadcast_filtered+0x15f/0x1a0
Sep  7 14:17:10 u1459 kernel: _raw_spin_lock+0x21/0x30
Sep  7 14:17:10 u1459 kernel: e1000e_get_stats64+0x2b/0x140 [e1000e]
Sep  7 14:17:10 u1459 kernel: dev_get_stats+0x3d/0xc0
Sep  7 14:17:10 u1459 kernel: ? __nla_reserve+0x53/0x70
Sep  7 14:17:10 u1459 kernel:

[PATCH net] bpf: don't select potentially stale ri->map from buggy xdp progs

2017-09-07 Thread Daniel Borkmann

We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri->map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:

i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri->map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.

ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.

iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri->map (will be fixed by Jesper), so it's
not being reset. Later loading a e.g. native prog B which would,
say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and uses that causing
use after free on map access.

Fix thus needs to avoid accessing stale ri->map pointers, naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri->map
being handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.

The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to the own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used as per calling convention we scratch
R0 to R5 anyway, so they become inaccessible and program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.

Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.

Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer 
Signed-off-by: Daniel Borkmann 
Signed-off-by: John Fastabend 
---
 kernel/bpf/verifier.c | 16 
 net/core/filter.c | 21 +++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d690c7d..477b693 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4203,6 +4203,22 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
continue;
}
 
+   if (insn->imm == BPF_FUNC_redirect_map) {
+   u64 addr = (unsigned long)prog;
+   struct bpf_insn r4_ld[] = {
+   BPF_LD_IMM64(BPF_REG_4, addr),
+   *insn,
+   };
+   cnt = ARRAY_SIZE(r4_ld);
+
+   new_prog = bpf_patch_insn_data(env, i + delta, r4_ld, 
cnt);
+   if (!new_prog)
+   return -ENOMEM;
+
+   delta+= cnt - 1;
+   env->prog = prog = new_prog;
+   insn  = new_prog->insnsi + i + delta;
+   }
 patch_call_imm:
fn = prog->aux->ops->get_func_proto(insn->imm);
/* all functions that have prototype and verifier allowed
diff --git

Re: [PATCH iproute2 0/4] Add support for dpipe's global header formatting

2017-09-07 Thread Stephen Hemminger

On Thu,  7 Sep 2017 17:26:39 +0300
Arkadi Sharshevsky  wrote:

> Some dpipe's global header values need special formatting, for example
> Ethernet and IP addresses. This patchset adds support for IPv4/6 and
> Ethernet's special format.
> 
> Arkadi Sharshevsky (4):
>   devlink: Make match/action parsing more flexible
>   devlink: Add support for special format protocol headers
>   devlink: Update devlink UAPI file
>   devlink: Add support for protocol IPv4/IPv6/Ethernet special formats
> 
>  devlink/devlink.c   | 319 
> +---
>  include/linux/devlink.h |  18 +++
>  2 files changed, 265 insertions(+), 72 deletions(-)
> 

Applied 1,2 and 4.  #3 was not needed.

Re: [PATCH iproute2 3/4] devlink: Update devlink UAPI file

2017-09-07 Thread Stephen Hemminger

On Thu,  7 Sep 2017 17:26:42 +0300
Arkadi Sharshevsky  wrote:

> Update devlink UAPI file.
> 
> Signed-off-by: Arkadi Sharshevsky 
> Signed-off-by: Jiri Pirko 
> ---
>  include/linux/devlink.h | 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/include/linux/devlink.h b/include/linux/devlink.h
> index 7644005..a62695e 100644
> --- a/include/linux/devlink.h
> +++ b/include/linux/devlink.h
> @@ -226,4 +226,22 @@ enum devlink_dpipe_action_type {
>   DEVLINK_DPIPE_ACTION_TYPE_FIELD_MODIFY,
>  };
>  
> +enum devlink_dpipe_field_ethernet_id {
> + DEVLINK_DPIPE_FIELD_ETHERNET_DST_MAC,
> +};
> +
> +enum devlink_dpipe_field_ipv4_id {
> + DEVLINK_DPIPE_FIELD_IPV4_DST_IP,
> +};
> +
> +enum devlink_dpipe_field_ipv6_id {
> + DEVLINK_DPIPE_FIELD_IPV6_DST_IP,
> +};
> +
> +enum devlink_dpipe_header_id {
> + DEVLINK_DPIPE_HEADER_ETHERNET,
> + DEVLINK_DPIPE_HEADER_IPV4,
> + DEVLINK_DPIPE_HEADER_IPV6,
> +};
> +
>  #endif /* _LINUX_DEVLINK_H_ */

This patch was unnecessary. Already had updated file present.

Re: latest netdev net-next kernel panic

2017-09-07 Thread Paweł Staszewski


Also this panic occured about 3 minutes after traffic rise to 40G

before host was working without trafic for about 30minutes - after 
traffic start to flow this bug start.


Now i have this host without traffic and it is working for almost 1 hour :)



W dniu 2017-09-07 o 23:47, Paweł Staszewski pisze:

Hi Eric


today upgraded some host from 4.13.0-rc7+ to latest net-next from git

and run some tests simulate clients traffic 40Gbit / 5Mpps mixed tcp + 
udp


There is also bgpd running that feeds FIB with full bgp feed - about 
700k prefixes (so routing tables is about 700k routes with about 300 
nexthops)



There is also simulated about 1000 ip neigh hosts (arp) with about 300 
ip interfaces assigned on vlans




With kernel 4.13.0-rc7+ (git from 05.09.2017) all tests are good 
(besides that there is panic after about 10 hours - every time same 
like in this link:


https://bugzilla.kernel.org/attachment.cgi?id=258257

But looks like acpi related or c/p states with new intel cpus... dono 
now.





W dniu 2017-09-07 o 23:39, Eric Dumazet pisze:

On Thu, 2017-09-07 at 23:30 +0200, Paweł Staszewski wrote:

I want to mention also here cause last trace for RIP was dst_dev_put

https://bugzilla.kernel.org/attachment.cgi?id=258259

Can you provide a bit more context ?

Thanks !

Re: [PATCH RFC 2/5] Add KSZ8795 switch driver support in Makefile

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:17:10PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 
> 
> Add KSZ8795 switch support with SPI access.
> 
> Signed-off-by: Tristram Ha 
> ---
> diff --git a/drivers/net/dsa/microchip/Makefile 
> b/drivers/net/dsa/microchip/Makefile
> index 0961c30..0d8ba48 100644
> --- a/drivers/net/dsa/microchip/Makefile
> +++ b/drivers/net/dsa/microchip/Makefile
> @@ -1,2 +1,4 @@
>  obj-$(CONFIG_MICROCHIP_KSZ)  += ksz9477.o ksz_common.o
>  obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER)   += ksz_spi.o
> +obj-$(CONFIG_MICROCHIP_KSZ8795)  += ksz8795.o ksz_common.o
> +obj-$(CONFIG_MICROCHIP_KSZ8795_SPI_DRIVER)   += ksz8795_spi.o

I've not tried it, but i think this breaks the build

 Andrew

Re: [PATCH RFC] Update documentation for KSZ DSA drivers so that new drivers can be added

2017-09-07 Thread Andrew Lunn

> -- compatible: For external switch chips, compatible string must be exactly 
> one
> -  of: "microchip,ksz9477"
> +- compatible: Should be "microchip,ksz9477" for KSZ9477 chip,
> +   "microchip,ksz8795" for KSZ8795 chip,
> +   "microchip,ksz8794" for KSZ8794 chip,
> +   "microchip,ksz8765" for KSZ8765 chip,
> +   "microchip,ksz8895" for KSZ8895 chip,
> +   "microchip,ksz8864" for KSZ8864 chip,
> +   "microchip,ksz8873" for KSZ8873 chip,
> +   "microchip,ksz8863" for KSZ8863 chip,
> +   "microchip,ksz8463" for KSZ8463 chip

This part of this patch should be in a patch of the series that
actually adds support for these chips. Don't document chips until you
actually support them.

>  See Documentation/devicetree/bindings/dsa/dsa.txt for a list of additional  
> required and optional properties.
> @@ -13,60 +20,60 @@ Examples:
>  
>  Ethernet switch connected via SPI to the host, CPU port wired to eth0:
>  
> - eth0: ethernet@10001000 {
> - fixed-link {
> - speed = <1000>;
> - full-duplex;
> - };
> - };
> + eth0: ethernet@10001000 {
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + };
> + };
>  
> - spi1: spi@f8008000 {
> - pinctrl-0 = <_spi_ksz>;
> - cs-gpios = < 25 0>;
> - id = <1>;
> - status = "okay";
> + spi1: spi@f8008000 {
> + cs-gpios = < 25 0>;
> + id = <1>;
> + status = "okay";
>  
> - ksz9477: ksz9477@0 {
> - compatible = 
> "microchip,ksz9477";
> - reg = <0>;
> + ksz9477: ksz9477@0 {
> + compatible = "microchip,ksz9477";
> + reg = <0>;
>  
> - 
> spi-max-frequency = <4400>;
> - spi-cpha;
> - spi-cpol;
> + spi-max-frequency = <4400>;
> + spi-cpha;
> + spi-cpol;
> +
> + status = "okay";
> + ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> + port@0 {
> + reg = <0>;
> + label = "lan1";
> + };
> + port@1 {
> + reg = <1>;
> + label = "lan2";
> + };
> + port@2 {
> + reg = <2>;
> + label = "lan3";
> + };
> + port@3 {
> + reg = <3>;
> + label = "lan4";
> + };
> + port@4 {
> + reg = <4>;
> + label = "lan5";
> + };
> + port@5 {
> + reg = <5>;
> + label = "cpu";
> + ethernet = <>;
> + fixed-link {
> + speed = <1000>;
> + full-duplex;
> + };
> + };
> + };
> + };
> + };
>  
> - status = "okay";
> - ports {
> - 
> #address-cells = <1>;
> - 
> #size-cells = <0>;
> - 
> port@0 {
> - 
> reg = <0>;
> - 
> label =

RE: [PATCH net-next] mdio_bus: Remove unneeded gpiod NULL check

2017-09-07 Thread Woojung.Huh

> >>> If someone is using GPIO descriptors with GPIO disabled, i.e. calling
> >>> gpiod_get() and friends, this is very likely to be a bug, and what
> >>> the driver wants to do is:
> >>>
> >>>  depends on GPIOLIB
> >>>
> >>> or
> >>>
> >>>  select GPIOLIB
> >>>
> >>> in Kconfig. The whole optional thing is mainly a leftover from when it
> >>> was possible to have a local implementation of the GPIOLIB API in
> >>> some custom header file, noone sane should be doing that anymore,
> >>> and if they do, they can very well face the warnings.
> >>>
> >>> If someone is facing a lot of WARN_ON() messages to this, it is a clear
> >>> indication that they need to fix their Kconfig and in that case it is 
> >>> proper.
> >> Linus & Andrew,
> >>
> >> I knew that it is already in David's pulling request.
> >> Configuring GPIOLIB is the right solution  even if platform doesn't use it?
> >
> > I guess?
> >
> > "Platform doesn't use it" what does that mean?
> >
> > Does it mean it does not call the
> > APIs of the GPIOLIB, does it mean it doesn't have a GPIO driver
> > at probe (but may have one by having it probed from a module)
> > or does it mean the platform can never have it?
> 
> I think it means CONFIG_GPIOLIB=n in the kernel because it's not needed,
> yet you run code (like drivers/net/phy/mdio_bus.c) that unconditionally
> calls into GPIOLIB and attempts to configure a given GPIO if available.
Yes. I'm facing issue on PC which won't need GPIOLIB as default.
Warning message goes away when GPIOLIB is enabled, and fortunately,
Ubuntu default config has it.
So, it may not be seen by many users when with full/default configuration.

> This thread is actually what prompted me to write this email:
>
> https://lkml.org/lkml/2017/9/2/3
Thanks for the link.

Re: [RFC net-next 0/5] TSN: Add qdisc-based config interfaces for traffic shapers

2017-09-07 Thread Guedes, Andre

On Thu, 2017-09-07 at 18:18 +0200, Henrik Austad wrote:
> On Thu, Sep 07, 2017 at 05:53:15PM +0200, Richard Cochran wrote:
> > On Thu, Sep 07, 2017 at 05:27:51PM +0200, Henrik Austad wrote:
> > > On Thu, Sep 07, 2017 at 02:40:18PM +0200, Richard Cochran wrote:
> > > And if you want to this driver to act as a bridge, how do you accomodate 
> > > change in network requirements? (i.e. how does this work with switchdev?)
> > 
> > To my understanding, this Qdisc idea provides QoS for the host's
> > transmitted traffic, and nothing more.
> 
> Ok, then we're on the same page.
> 
> > > - Or am I overthinking this?
> > 
> > Being able to configure the external ports of a switchdev is probably
> > a nice feature, but that is another story.  (But maybe I misunderstood
> > the authors' intent!)
> 
> ok, chalk that one up for later perhaps

Just to clarify, we've been most focused on end-station use-cases. We've
considered some bridge use-cases as well just to verify that the proposed
design won't be an issue if someone else goes for it.

- Andre

smime.p7s
Description: S/MIME cryptographic signature

Re: latest netdev net-next kernel panic

2017-09-07 Thread Paweł Staszewski


Hi Eric


today upgraded some host from 4.13.0-rc7+ to latest net-next from git

and run some tests simulate clients traffic 40Gbit / 5Mpps mixed tcp + udp

There is also bgpd running that feeds FIB with full bgp feed - about 
700k prefixes (so routing tables is about 700k routes with about 300 
nexthops)



There is also simulated about 1000 ip neigh hosts (arp) with about 300 
ip interfaces assigned on vlans




With kernel 4.13.0-rc7+ (git from 05.09.2017) all tests are good 
(besides that there is panic after about 10 hours - every time same like 
in this link:


https://bugzilla.kernel.org/attachment.cgi?id=258257

But looks like acpi related or c/p states with new intel cpus... dono now.




W dniu 2017-09-07 o 23:39, Eric Dumazet pisze:

On Thu, 2017-09-07 at 23:30 +0200, Paweł Staszewski wrote:

I want to mention also here cause last trace for RIP was dst_dev_put

https://bugzilla.kernel.org/attachment.cgi?id=258259

Can you provide a bit more context ?

Thanks !

Re: [PATCH RFC 6/6] Modify tag_ksz.c to support other KSZ switch drivers

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:09:30PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 
> 
> The KSZ tail tag code can be used by different KSZ switch drivers.
> 
> Signed-off-by: Tristram Ha 
> ---
> diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index 010ca0a..d5faf14 
> 100644
> --- a/net/dsa/tag_ksz.c
> +++ b/net/dsa/tag_ksz.c
> @@ -13,35 +13,21 @@
>  #include 
>  #include 
>  #include "dsa_priv.h"
> -
> -/* For Ingress (Host -> KSZ), 2 bytes are added before FCS.
> - * 
> ---
> - * 
> DA(6bytes)|SA(6bytes)||Data(nbytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
> - * 
> ---
> - * tag0 : Prioritization (not used now)
> - * tag1 : each bit represents port (eg, 0x01=port1, 0x02=port2, 0x10=port5)
> - *
> - * For Egress (KSZ -> Host), 1 byte is added before FCS.
> - * 
> ---
> - * DA(6bytes)|SA(6bytes)||Data(nbytes)|tag0(1byte)|FCS(4bytes)
> - * 
> ---
> - * tag0 : zero-based value represents port
> - * (eg, 0x00=port1, 0x02=port3, 0x06=port7)
> - */
> -
> -#define  KSZ_INGRESS_TAG_LEN 2
> -#define  KSZ_EGRESS_TAG_LEN  1
> +#include "../../drivers/net/dsa/microchip/ksz_priv.h"

No.

Move the header file to somewhere under include/linux. You can split
it into private and public parts if you want, and just put the public
bits in include/linux.

>  static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev) 
>  {
>   struct dsa_slave_priv *p = netdev_priv(dev);
> + struct dsa_switch *ds = p->dp->ds;
> + struct ksz_device *swdev = ds->priv;
>   struct sk_buff *nskb;
> + int len;
>   int padlen;
> - u8 *tag;
>  
>   padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
>  
> - if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
> + len = swdev->dev_ops->get_tx_len(swdev);

This is ugly. We have a clean separation between a switch driver and a
tag driver. Here you are mixing them together. Don't. Look at the
mailing lists for what Florian and I suggested to Pavel. It will also
solve your include file issue above.

 Andrew

Re: [PATCH RFC 5/6] Switch SPI driver calls its own driver switch register function

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:09:22PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 
> 
> SPI driver calls own specific switch register function.
> Shutdown callback function is added to reset switch to default state.
> 
> Signed-off-by: Tristram Ha 
> ---
> diff --git a/drivers/net/dsa/microchip/ksz_spi.c 
> b/drivers/net/dsa/microchip/ksz_spi.c
> index c519469..d03eb83 100644
> --- a/drivers/net/dsa/microchip/ksz_spi.c
> +++ b/drivers/net/dsa/microchip/ksz_spi.c
> @@ -25,6 +25,8 @@
>  
>  #include "ksz_priv.h"
>  
> +int ksz9477_switch_register(struct ksz_device *dev);

Hi Tristram

This should be placed in a header file somewhere, not here.

> +static void ksz_spi_shutdown(struct spi_device *spi) {
> + struct ksz_device *dev = spi_get_drvdata(spi);
> +
> + if (dev)
> + dev->dev_ops->reset(dev);

shutdown seems like a better name for this op, not reset.

> +}
> +
>  static const struct of_device_id ksz_dt_ids[] = {
>   { .compatible = "microchip,ksz9477" },
>   {},
> @@ -207,6 +217,7 @@ static int ksz_spi_remove(struct spi_device *spi)
>   },
>   .probe  = ksz_spi_probe,
>   .remove = ksz_spi_remove,
> + .shutdown = ksz_spi_shutdown,
>  };
>  
>  module_spi_driver(ksz_spi_driver);

Re: [PATCH net-next] mdio_bus: Remove unneeded gpiod NULL check

2017-09-07 Thread Florian Fainelli

On 09/07/2017 02:33 PM, Linus Walleij wrote:
> On Thu, Sep 7, 2017 at 11:30 PM,   wrote:
>>> If someone is using GPIO descriptors with GPIO disabled, i.e. calling
>>> gpiod_get() and friends, this is very likely to be a bug, and what
>>> the driver wants to do is:
>>>
>>>  depends on GPIOLIB
>>>
>>> or
>>>
>>>  select GPIOLIB
>>>
>>> in Kconfig. The whole optional thing is mainly a leftover from when it
>>> was possible to have a local implementation of the GPIOLIB API in
>>> some custom header file, noone sane should be doing that anymore,
>>> and if they do, they can very well face the warnings.
>>>
>>> If someone is facing a lot of WARN_ON() messages to this, it is a clear
>>> indication that they need to fix their Kconfig and in that case it is 
>>> proper.
>> Linus & Andrew,
>>
>> I knew that it is already in David's pulling request.
>> Configuring GPIOLIB is the right solution  even if platform doesn't use it?
> 
> I guess?
> 
> "Platform doesn't use it" what does that mean?
> 
> Does it mean it does not call the
> APIs of the GPIOLIB, does it mean it doesn't have a GPIO driver
> at probe (but may have one by having it probed from a module)
> or does it mean the platform can never have it?

I think it means CONFIG_GPIOLIB=n in the kernel because it's not needed,
yet you run code (like drivers/net/phy/mdio_bus.c) that unconditionally
calls into GPIOLIB and attempts to configure a given GPIO if available.
This thread is actually what prompted me to write this email:

https://lkml.org/lkml/2017/9/2/3

So that's the takeway should we just conditionalize all the GPIOLIB
calls from drivers/net/phy/mdio_bus.c under an #if
IS_ENABLED(CONFIG_GPIOLIB) of some sort?

> 
> If it calls the APIs, it is using it.
> 

It's more subtle than that, the GPIO resource may not be available just
like you have disabled support for GPIOLIB in the kernel, in which case,
the calls are still there, they default to their inline stubs, you get
warnings, everyone panics and screams.
-- 
Florian

Re: latest netdev net-next kernel panic

2017-09-07 Thread Eric Dumazet

On Thu, 2017-09-07 at 23:30 +0200, Paweł Staszewski wrote:
> I want to mention also here cause last trace for RIP was dst_dev_put
> 
> https://bugzilla.kernel.org/attachment.cgi?id=258259

Can you provide a bit more context ?

Thanks !

Re: [PATCH net-next] mdio_bus: Remove unneeded gpiod NULL check

2017-09-07 Thread Linus Walleij

On Thu, Sep 7, 2017 at 11:30 PM,   wrote:
>> If someone is using GPIO descriptors with GPIO disabled, i.e. calling
>> gpiod_get() and friends, this is very likely to be a bug, and what
>> the driver wants to do is:
>>
>>  depends on GPIOLIB
>>
>> or
>>
>>  select GPIOLIB
>>
>> in Kconfig. The whole optional thing is mainly a leftover from when it
>> was possible to have a local implementation of the GPIOLIB API in
>> some custom header file, noone sane should be doing that anymore,
>> and if they do, they can very well face the warnings.
>>
>> If someone is facing a lot of WARN_ON() messages to this, it is a clear
>> indication that they need to fix their Kconfig and in that case it is proper.
> Linus & Andrew,
>
> I knew that it is already in David's pulling request.
> Configuring GPIOLIB is the right solution  even if platform doesn't use it?

I guess?

"Platform doesn't use it" what does that mean?

Does it mean it does not call the
APIs of the GPIOLIB, does it mean it doesn't have a GPIO driver
at probe (but may have one by having it probed from a module)
or does it mean the platform can never have it?

If it calls the APIs, it is using it.

Yours,
Linus Walleij

Re: [PATCH RFC 2/6] Create new file ksz9477.c from KSZ9477 code in ksz_common.c

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:09:04PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 
> 
> Create new ksz9477.c file from original ksz_common.c.
> 
> Signed-off-by: Tristram Ha 
> ---
> diff --git a/drivers/net/dsa/microchip/ksz9477.c 
> b/drivers/net/dsa/microchip/ksz9477.c
> new file mode 100644
> index 000..bc722b4
> --- /dev/null
> +++ b/drivers/net/dsa/microchip/ksz9477.c
> @@ -0,0 +1,1317 @@
> +/*
> + * Microchip switch driver main logic
> + *
> + * Copyright (C) 2017
> + *
> + * Permission to use, copy, modify, and/or distribute this software for 
> +any
> + * purpose with or without fee is hereby granted, provided that the 
> +above

Tristram

It looks like something hand mangled this comment. "any" and "above"
appear to be on a line on there own.

> + * copyright notice and this permission notice appear in all copies.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 
> +WARRANTIES
> + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE 
> +FOR
> + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY 
> +DAMAGES
> + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN 
> +AN
> + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT 
> +OF
> + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "ksz_priv.h"
> +#include "ksz_9477_reg.h"
> +
> +static const struct {
> + int index;
> + char string[ETH_GSTRING_LEN];
> +} mib_names[TOTAL_SWITCH_COUNTER_NUM] = {
> + { 0x00, "rx_hi" },
> + { 0x01, "rx_undersize" },
> + { 0x02, "rx_fragments" },
> + { 0x03, "rx_oversize" },
> + { 0x04, "rx_jabbers" },
> + { 0x05, "rx_symbol_err" },
> + { 0x06, "rx_crc_err" },
> + { 0x07, "rx_align_err" },
> + { 0x08, "rx_mac_ctrl" },
> + { 0x09, "rx_pause" },
> + { 0x0A, "rx_bcast" },
> + { 0x0B, "rx_mcast" },
> + { 0x0C, "rx_ucast" },
> + { 0x0D, "rx_64_or_less" },
> + { 0x0E, "rx_65_127" },
> + { 0x0F, "rx_128_255" },
> + { 0x10, "rx_256_511" },
> + { 0x11, "rx_512_1023" },
> + { 0x12, "rx_1024_1522" },
> + { 0x13, "rx_1523_2000" },
> + { 0x14, "rx_2001" },
> + { 0x15, "tx_hi" },
> + { 0x16, "tx_late_col" },
> + { 0x17, "tx_pause" },
> + { 0x18, "tx_bcast" },
> + { 0x19, "tx_mcast" },
> + { 0x1A, "tx_ucast" },
> + { 0x1B, "tx_deferred" },
> + { 0x1C, "tx_total_col" },
> + { 0x1D, "tx_exc_col" },
> + { 0x1E, "tx_single_col" },
> + { 0x1F, "tx_mult_col" },
> + { 0x80, "rx_total" },
> + { 0x81, "tx_total" },
> + { 0x82, "rx_discards" },
> + { 0x83, "tx_discards" },
> +};
> +
> +static void ksz_cfg(struct ksz_device *dev, u32 addr, u8 bits, bool 
> +set) {

More mangling? Should set be on the end of the previous line?

> +static void read_table(struct dsa_switch *ds, u32 *table) {
> + struct ksz_device *dev = ds->priv;
> +
> + ksz_read32(dev, REG_SW_ALU_VAL_A, [0]);
> + ksz_read32(dev, REG_SW_ALU_VAL_B, [1]);
> + ksz_read32(dev, REG_SW_ALU_VAL_C, [2]);
> + ksz_read32(dev, REG_SW_ALU_VAL_D, [3]); }
> +
> +static void write_table(struct dsa_switch *ds, u32 *table) {
> + struct ksz_device *dev = ds->priv;
> +
> + ksz_write32(dev, REG_SW_ALU_VAL_A, table[0]);
> + ksz_write32(dev, REG_SW_ALU_VAL_B, table[1]);
> + ksz_write32(dev, REG_SW_ALU_VAL_C, table[2]);
> + ksz_write32(dev, REG_SW_ALU_VAL_D, table[3]); }

More mangling? } at the end of a line?

I will stop reading now and wait for a v2 this is not corrupt.

  Andrew

latest netdev net-next kernel panic

2017-09-07 Thread Paweł Staszewski


I want to mention also here cause last trace for RIP was dst_dev_put

https://bugzilla.kernel.org/attachment.cgi?id=258259

RE: [PATCH net-next] mdio_bus: Remove unneeded gpiod NULL check

2017-09-07 Thread Woojung.Huh

> If someone is using GPIO descriptors with GPIO disabled, i.e. calling
> gpiod_get() and friends, this is very likely to be a bug, and what
> the driver wants to do is:
> 
>  depends on GPIOLIB
> 
> or
> 
>  select GPIOLIB
> 
> in Kconfig. The whole optional thing is mainly a leftover from when it
> was possible to have a local implementation of the GPIOLIB API in
> some custom header file, noone sane should be doing that anymore,
> and if they do, they can very well face the warnings.
> 
> If someone is facing a lot of WARN_ON() messages to this, it is a clear
> indication that they need to fix their Kconfig and in that case it is proper.
Linus & Andrew,

I knew that it is already in David's pulling request.
Configuring GPIOLIB is the right solution  even if platform doesn't use it?

Did I miss any new patch?

Thanks.
Woojung

Re: [PATCH RFC 1/6] The file ksz_common.c will be used by other KSZ switch drivers.

2017-09-07 Thread Andrew Lunn

On Thu, Sep 07, 2017 at 09:08:58PM +, tristram...@microchip.com wrote:
> From: Tristram Ha 
> 
> Break ksz_common.c into 2 files so that the common code can be used by other 
> KSZ switch drivers.
> 
> Signed-off-by: Tristram Ha 
> ---
> diff --git a/drivers/net/dsa/microchip/Makefile 
> b/drivers/net/dsa/microchip/Makefile
> index ed335e2..0961c30 100644
> --- a/drivers/net/dsa/microchip/Makefile
> +++ b/drivers/net/dsa/microchip/Makefile
> @@ -1,2 +1,2 @@
> -obj-$(CONFIG_MICROCHIP_KSZ)  += ksz_common.o
> +obj-$(CONFIG_MICROCHIP_KSZ)  += ksz9477.o ksz_common.o
>  obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER)   += ksz_spi.o

Hi Tristram

I would of thought this would break the build. You don't add ksz9477.c
until the next patch.

Each patch needs to compile, otherwise you break git bisect.

 Andrew

Re: Can you help to analyze this panic issue when using VPN?

2017-09-07 Thread Cong Wang

On Thu, Sep 7, 2017 at 3:18 AM, Songchuan  wrote:
> <4>[ 1730.085113s][pid:1928,cpu3,HeapTaskDaemon]CPU: 3 PID: 1928 Comm: 
> HeapTaskDaemon Tainted: GB   W   4.1.18-kasan-g6e99722-dirty #1

Sorry, 4.1 is too old for upstream. As you know 4.13 is just out.
Please try to reproduce it on the latest or reasonably recent kernel.

[PATCH RFC 1/5] Add KSZ8795 switch driver support in Kconfig

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Add KSZ8795 switch support with SPI access.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/Kconfig 
b/drivers/net/dsa/microchip/Kconfig
index a8b8f59..0b6225e 100644
--- a/drivers/net/dsa/microchip/Kconfig
+++ b/drivers/net/dsa/microchip/Kconfig
@@ -10,3 +10,21 @@ config MICROCHIP_KSZ_SPI_DRIVER
depends on MICROCHIP_KSZ && SPI
help
  Select to enable support for registering switches configured through 
SPI.
+
+menuconfig MICROCHIP_KSZ8795
+   tristate "Microchip KSZ8795 series switch support"
+   depends on NET_DSA
+   select NET_DSA_TAG_KSZ
+   help
+ This driver adds support for Microchip KSZ8795 switch chips.
+
+config MICROCHIP_KSZ8795_SPI_DRIVER
+   tristate "KSZ8795 series SPI connected switch driver"
+   depends on MICROCHIP_KSZ8795 && SPI
+   default y
+   help
+ This driver accesses KSZ8795 chip through SPI.
+
+ It is required to use the KSZ8795 switch driver as the only access
+ is through SPI.
+

[PATCH RFC 0/5] Add DSA driver support for KSZ8795 switch

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

This series of patches is to add DSA driver support for KSZ8795 switch.

Previous patches for KSZ9477 have to be applied first before these.

This patch set is against net-next.

 drivers/net/dsa/microchip/Kconfig   |   18 +
 drivers/net/dsa/microchip/Makefile  |2 +
 drivers/net/dsa/microchip/ksz8795.c | 2066 +++
 drivers/net/dsa/microchip/ksz8795.h | 1015 +++
 drivers/net/dsa/microchip/ksz8795_spi.c |  229 
 5 files changed, 3330 insertions(+)
 create mode 100644 drivers/net/dsa/microchip/ksz8795.c
 create mode 100644 drivers/net/dsa/microchip/ksz8795.h
 create mode 100644 drivers/net/dsa/microchip/ksz8795_spi.c

[PATCH RFC 2/5] Add KSZ8795 switch driver support in Makefile

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Add KSZ8795 switch support with SPI access.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/Makefile 
b/drivers/net/dsa/microchip/Makefile
index 0961c30..0d8ba48 100644
--- a/drivers/net/dsa/microchip/Makefile
+++ b/drivers/net/dsa/microchip/Makefile
@@ -1,2 +1,4 @@
 obj-$(CONFIG_MICROCHIP_KSZ)+= ksz9477.o ksz_common.o
 obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER) += ksz_spi.o
+obj-$(CONFIG_MICROCHIP_KSZ8795)+= ksz8795.o ksz_common.o
+obj-$(CONFIG_MICROCHIP_KSZ8795_SPI_DRIVER) += ksz8795_spi.o

[PATCH RFC 4/5] Add KSZ8795 register definitions

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Add KSZ8795 switch support with register definitions.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz8795.h 
b/drivers/net/dsa/microchip/ksz8795.h
new file mode 100644
index 000..005eda5
--- /dev/null
+++ b/drivers/net/dsa/microchip/ksz8795.h
@@ -0,0 +1,1015 @@
+/**
+ * Microchip KSZ8795 definition file
+ *
+ * Copyright (c) 2017 Microchip Technology Inc.
+ * Tristram Ha 
+ *
+ * Permission to use, copy, modify, and/or distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#ifndef __KSZ8795_H
+#define __KSZ8795_H
+
+#define KS_PORT_M  0x1F
+
+#define KS_PRIO_M  0x3
+#define KS_PRIO_S  2
+
+#define REG_CHIP_ID0   0x00
+
+#define FAMILY_ID  0x87
+
+#define REG_CHIP_ID1   0x01
+
+#define SW_CHIP_ID_M   0xF0
+#define SW_CHIP_ID_S   4
+#define SW_REVISION_M  0x0E
+#define SW_REVISION_S  1
+#define SW_START   0x01
+
+#define CHIP_ID_94 0x60
+#define CHIP_ID_95 0x90
+
+#define REG_SW_CTRL_0  0x02
+
+#define SW_NEW_BACKOFF BIT(7)
+#define SW_GLOBAL_RESETBIT(6)
+#define SW_FLUSH_DYN_MAC_TABLE BIT(5)
+#define SW_FLUSH_STA_MAC_TABLE BIT(4)
+#define SW_LINK_AUTO_AGING BIT(0)
+
+#define REG_SW_CTRL_1  0x03
+
+#define SW_HUGE_PACKET BIT(6)
+#define SW_TX_FLOW_CTRL_DISABLEBIT(5)
+#define SW_RX_FLOW_CTRL_DISABLEBIT(4)
+#define SW_CHECK_LENGTHBIT(3)
+#define SW_AGING_ENABLEBIT(2)
+#define SW_FAST_AGING  BIT(1)
+#define SW_AGGR_BACKOFFBIT(0)
+
+#define REG_SW_CTRL_2  0x04
+
+#define UNICAST_VLAN_BOUNDARY  BIT(7)
+#define MULTICAST_STORM_DISABLEBIT(6)
+#define SW_BACK_PRESSURE   BIT(5)
+#define FAIR_FLOW_CTRL BIT(4)
+#define NO_EXC_COLLISION_DROP  BIT(3)
+#define SW_LEGAL_PACKET_DISABLEBIT(1)
+
+#define REG_SW_CTRL_3  0x05
+ #define WEIGHTED_FAIR_QUEUE_ENABLEBIT(3)
+
+#define SW_VLAN_ENABLE BIT(7)
+#define SW_IGMP_SNOOP  BIT(6)
+#define SW_MIRROR_RX_TXBIT(0)
+
+#define REG_SW_CTRL_4  0x06
+
+#define SW_HALF_DUPLEX_FLOW_CTRL   BIT(7)
+#define SW_HALF_DUPLEX BIT(6)
+#define SW_FLOW_CTRL   BIT(5)
+#define SW_10_MBIT BIT(4)
+#define SW_REPLACE_VID BIT(3)
+#define BROADCAST_STORM_RATE_HI0x07
+
+#define REG_SW_CTRL_5  0x07
+
+#define BROADCAST_STORM_RATE_LO0xFF
+#define BROADCAST_STORM_RATE   0x07FF
+
+#define REG_SW_CTRL_6  0x08
+
+#define SW_MIB_COUNTER_FLUSH   BIT(7)
+#define SW_MIB_COUNTER_FREEZE  BIT(6)
+#define SW_MIB_COUNTER_CTRL_ENABLE KS_PORT_M
+
+#define REG_SW_CTRL_9  0x0B
+
+#define SPI_CLK_125_MHZ0x80
+#define SPI_CLK_62_5_MHZ   0x40
+#define SPI_CLK_31_25_MHZ  0x00
+
+#define SW_LED_MODE_M  0x3
+#define SW_LED_MODE_S  4
+#define SW_LED_LINK_ACT_SPEED  0
+#define SW_LED_LINK_ACT1
+#define SW_LED_LINK_ACT_DUPLEX 2
+#define SW_LED_LINK_DUPLEX 3
+
+#define REG_SW_CTRL_10 0x0C
+
+#define SW_TAIL_TAG_ENABLE BIT(1)
+#define SW_PASS_PAUSE  BIT(0)
+
+#define REG_SW_CTRL_11 0x0D
+
+#define REG_POWER_MANAGEMENT_1 0x0E
+
+#define SW_PLL_POWER_DOWN  BIT(5)
+#define SW_POWER_MANAGEMENT_MODE_M 0x3
+#define SW_POWER_MANAGEMENT_MODE_S 3
+#define SW_POWER_NORMAL0
+#define SW_ENERGY_DETECTION1
+#define SW_SOFTWARE_POWER_DOWN 2
+
+#define REG_POWER_MANAGEMENT_2 0x0F
+
+#define REG_PORT_1_CTRL_0  0x10
+#define REG_PORT_2_CTRL_0  0x20
+#define

[PATCH RFC 5/5] Add KSZ8795 SPI driver

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Add KSZ8795 switch support with SPI access.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz8795_spi.c 
b/drivers/net/dsa/microchip/ksz8795_spi.c
new file mode 100644
index 000..0f9c731
--- /dev/null
+++ b/drivers/net/dsa/microchip/ksz8795_spi.c
@@ -0,0 +1,229 @@
+/*
+ * Microchip KSZ8795 series register access through SPI
+ *
+ * Copyright (C) 2017 Microchip Technology Inc.
+ * Tristram Ha 
+ *
+ * Permission to use, copy, modify, and/or distribute this software for 
+any
+ * purpose with or without fee is hereby granted, provided that the 
+above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 
+WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE 
+FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY 
+DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN 
+AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT 
+OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "ksz_priv.h"
+
+int ksz8795_switch_register(struct ksz_device *dev);
+
+/* SPI frame opcodes */
+#define KS_SPIOP_RD3
+#define KS_SPIOP_WR2
+
+#define SPI_ADDR_S 12
+#define SPI_ADDR_M (BIT(SPI_ADDR_S) - 1)
+#define SPI_TURNAROUND_S   1
+
+/* Enough to read all switch registers. */
+#define SPI_TX_BUF_LEN 0x100
+
+static int ksz_spi_read_reg(struct spi_device *spi, u32 reg, u8 *val,
+   unsigned int len)
+{
+   u16 txbuf;
+   int ret;
+
+   txbuf = reg & SPI_ADDR_M;
+   txbuf |= KS_SPIOP_RD << SPI_ADDR_S;
+   txbuf <<= SPI_TURNAROUND_S;
+   txbuf = cpu_to_be16(txbuf);
+
+   ret = spi_write_then_read(spi, , 2, val, len);
+   return ret;
+}
+
+static int ksz_spi_read(struct ksz_device *dev, u32 reg, u8 *data,
+   unsigned int len)
+{
+   struct spi_device *spi = dev->priv;
+
+   return ksz_spi_read_reg(spi, reg, data, len); }
+
+static int ksz_spi_read8(struct ksz_device *dev, u32 reg, u8 *val) {
+   return ksz_spi_read(dev, reg, val, 1); }
+
+static int ksz_spi_read16(struct ksz_device *dev, u32 reg, u16 *val) {
+   int ret = ksz_spi_read(dev, reg, (u8 *)val, 2);
+
+   if (!ret)
+   *val = be16_to_cpu(*val);
+
+   return ret;
+}
+
+static int ksz_spi_read32(struct ksz_device *dev, u32 reg, u32 *val) {
+   int ret = ksz_spi_read(dev, reg, (u8 *)val, 4);
+
+   if (!ret)
+   *val = be32_to_cpu(*val);
+
+   return ret;
+}
+
+static int ksz_spi_write_reg(struct spi_device *spi, u32 reg, u8 *val,
+unsigned int len)
+{
+   u16 *txbuf = (u16 *)val;
+
+   *txbuf = reg & SPI_ADDR_M;
+   *txbuf |= (KS_SPIOP_WR << SPI_ADDR_S);
+   *txbuf <<= SPI_TURNAROUND_S;
+   *txbuf = cpu_to_be16(*txbuf);
+
+   return spi_write(spi, txbuf, 2 + len); }
+
+static int ksz_spi_write(struct ksz_device *dev, u32 reg, void *data,
+unsigned int len)
+{
+   struct spi_device *spi = dev->priv;
+
+   if (len > SPI_TX_BUF_LEN)
+   len = SPI_TX_BUF_LEN;
+   memcpy(>txbuf[2], data, len);
+   return ksz_spi_write_reg(spi, reg, dev->txbuf, len); }
+
+static unsigned int ksz_spi_get(struct ksz_device *dev, u32 reg, void *data,
+   unsigned int len)
+{
+   int err;
+
+   err = ksz_spi_read(dev, reg, data, len);
+   if (err)
+   return 0;
+   return len;
+}
+
+static unsigned int ksz_spi_set(struct ksz_device *dev, u32 reg, void *data,
+   unsigned int len)
+{
+   int err;
+
+   err = ksz_spi_write(dev, reg, data, len);
+   if (err)
+   return 0;
+   return len;
+}
+
+static int ksz_spi_write8(struct ksz_device *dev, u32 reg, u8 value) {
+   return ksz_spi_write(dev, reg, , 1); }
+
+static int ksz_spi_write16(struct ksz_device *dev, u32 reg, u16 value) 
+{
+   value = cpu_to_be16(value);
+   return ksz_spi_write(dev, reg, , 2); }
+
+static int ksz_spi_write32(struct ksz_device *dev, u32 reg, u32 value) 
+{
+   value = cpu_to_be32(value);
+   return ksz_spi_write(dev, reg, , 4); }
+
+static const struct ksz_io_ops ksz_spi_ops = {
+   .read8 = ksz_spi_read8,
+   .read16 = ksz_spi_read16,
+   .read32 = ksz_spi_read32,
+   .write8 = ksz_spi_write8,
+   .write16 = ksz_spi_write16,
+   .write32 = ksz_spi_write32,
+   .get = ksz_spi_get,
+   .set = ksz_spi_set,
+};
+
+static int ksz_spi_probe(struct spi_device *spi)

[PATCH RFC 3/5] Add KSZ8795 switch driver

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Add KSZ8795 switch support with function code.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz8795.c 
b/drivers/net/dsa/microchip/ksz8795.c
new file mode 100644
index 000..e4d4e6a
--- /dev/null
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -0,0 +1,2066 @@
+/*
+ * Microchip KSZ8795 switch driver
+ *
+ * Copyright (C) 2017 Microchip Technology Inc.
+ * Tristram Ha 
+ *
+ * Permission to use, copy, modify, and/or distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ksz_priv.h"
+#include "ksz8795.h"
+
+/**
+ * Some counters do not need to be read too often because they are less likely
+ * to increase much.
+ */
+static const struct {
+   int interval;
+   char string[ETH_GSTRING_LEN];
+} mib_names[TOTAL_SWITCH_COUNTER_NUM] = {
+   { 1, "rx_hi" },
+   { 4, "rx_undersize" },
+   { 4, "rx_fragments" },
+   { 4, "rx_oversize" },
+   { 4, "rx_jabbers" },
+   { 4, "rx_symbol_err" },
+   { 4, "rx_crc_err" },
+   { 4, "rx_align_err" },
+   { 4, "rx_mac_ctrl" },
+   { 1, "rx_pause" },
+   { 1, "rx_bcast" },
+   { 1, "rx_mcast" },
+   { 1, "rx_ucast" },
+   { 2, "rx_64_or_less" },
+   { 2, "rx_65_127" },
+   { 2, "rx_128_255" },
+   { 2, "rx_256_511" },
+   { 2, "rx_512_1023" },
+   { 2, "rx_1024_1522" },
+   { 2, "rx_1523_2000" },
+   { 2, "rx_2001" },
+   { 1, "tx_hi" },
+   { 4, "tx_late_col" },
+   { 1, "tx_pause" },
+   { 1, "tx_bcast" },
+   { 1, "tx_mcast" },
+   { 1, "tx_ucast" },
+   { 4, "tx_deferred" },
+   { 4, "tx_total_col" },
+   { 4, "tx_exc_col" },
+   { 4, "tx_single_col" },
+   { 4, "tx_mult_col" },
+   { 1, "rx_total" },
+   { 1, "tx_total" },
+   { 2, "rx_discards" },
+   { 2, "tx_discards" },
+};
+
+static void ksz_cfg(struct ksz_device *dev, u32 addr, u8 bits, bool set)
+{
+   u8 data;
+
+   ksz_read8(dev, addr, );
+   if (set)
+   data |= bits;
+   else
+   data &= ~bits;
+   ksz_write8(dev, addr, data);
+}
+
+static void ksz_port_cfg(struct ksz_device *dev, int port, int offset, u8 bits,
+bool set)
+{
+   u32 addr;
+   u8 data;
+
+   addr = PORT_CTRL_ADDR(port, offset);
+   ksz_read8(dev, addr, );
+
+   if (set)
+   data |= bits;
+   else
+   data &= ~bits;
+
+   ksz_write8(dev, addr, data);
+}
+
+static int chk_last_port(struct ksz_device *dev, int p)
+{
+   if (dev->last_port && p == dev->last_port)
+   p = dev->cpu_port;
+   return p;
+}
+
+static int ksz_reset_switch(struct ksz_device *dev)
+{
+   /* reset switch */
+   ksz_write8(dev, REG_POWER_MANAGEMENT_1,
+  SW_SOFTWARE_POWER_DOWN << SW_POWER_MANAGEMENT_MODE_S);
+   ksz_write8(dev, REG_POWER_MANAGEMENT_1, 0);
+
+   return 0;
+}
+
+static void port_set_prio_queue(struct ksz_device *dev, int port, int queue)
+{
+   u8 hi;
+   u8 lo;
+
+   switch (queue) {
+   case 4:
+   case 3:
+   queue = PORT_QUEUE_SPLIT_4;
+   break;
+   case 2:
+   queue = PORT_QUEUE_SPLIT_2;
+   break;
+   default:
+   queue = PORT_QUEUE_SPLIT_1;
+   }
+   ksz_pread8(dev, port, REG_PORT_CTRL_0, );
+   ksz_pread8(dev, port, P_DROP_TAG_CTRL, );
+   lo &= ~PORT_QUEUE_SPLIT_L;
+   if (queue & 1)
+   lo |= PORT_QUEUE_SPLIT_L;
+   hi &= ~PORT_QUEUE_SPLIT_H;
+   if (queue & 2)
+   hi |= PORT_QUEUE_SPLIT_H;
+   ksz_pwrite8(dev, port, REG_PORT_CTRL_0, lo);
+   ksz_pwrite8(dev, port, P_DROP_TAG_CTRL, hi);
+
+   /* Default is port based for egress rate limit. */
+   if (queue)
+   ksz_cfg(dev, REG_SW_CTRL_19, SW_OUT_RATE_LIMIT_QUEUE_BASED,
+   true);
+}
+
+static void port_r_mib_cnt(struct ksz_device *dev, int port, u16 addr, u64 
*cnt)
+{
+   u32 data;
+   u16 ctrl_addr;
+   u8 check;
+   int timeout;
+
+   ctrl_addr = addr + SWITCH_COUNTER_NUM * port;

[PATCH RFC] Update documentation for KSZ DSA drivers so that new drivers can be added

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Add other KSZ switches support so that patch check does not complain.

Signed-off-by: Tristram Ha 
---
 Documentation/devicetree/bindings/net/dsa/ksz.txt | 117 --
 1 file changed, 62 insertions(+), 55 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/dsa/ksz.txt 
b/Documentation/devicetree/bindings/net/dsa/ksz.txt
index 0ab8b39..34af0e0 100644
--- a/Documentation/devicetree/bindings/net/dsa/ksz.txt
+++ b/Documentation/devicetree/bindings/net/dsa/ksz.txt
@@ -3,8 +3,15 @@ Microchip KSZ Series Ethernet switches
 
 Required properties:
 
-- compatible: For external switch chips, compatible string must be exactly one
-  of: "microchip,ksz9477"
+- compatible: Should be "microchip,ksz9477" for KSZ9477 chip,
+ "microchip,ksz8795" for KSZ8795 chip,
+ "microchip,ksz8794" for KSZ8794 chip,
+ "microchip,ksz8765" for KSZ8765 chip,
+ "microchip,ksz8895" for KSZ8895 chip,
+ "microchip,ksz8864" for KSZ8864 chip,
+ "microchip,ksz8873" for KSZ8873 chip,
+ "microchip,ksz8863" for KSZ8863 chip,
+ "microchip,ksz8463" for KSZ8463 chip
 
 See Documentation/devicetree/bindings/dsa/dsa.txt for a list of additional  
required and optional properties.
@@ -13,60 +20,60 @@ Examples:
 
 Ethernet switch connected via SPI to the host, CPU port wired to eth0:
 
- eth0: ethernet@10001000 {
- fixed-link {
- speed = <1000>;
- full-duplex;
- };
- };
+   eth0: ethernet@10001000 {
+   fixed-link {
+   speed = <1000>;
+   full-duplex;
+   };
+   };
 
- spi1: spi@f8008000 {
- pinctrl-0 = <_spi_ksz>;
- cs-gpios = < 25 0>;
- id = <1>;
- status = "okay";
+   spi1: spi@f8008000 {
+   cs-gpios = < 25 0>;
+   id = <1>;
+   status = "okay";
 
- ksz9477: ksz9477@0 {
- compatible = 
"microchip,ksz9477";
- reg = <0>;
+   ksz9477: ksz9477@0 {
+   compatible = "microchip,ksz9477";
+   reg = <0>;
 
- spi-max-frequency 
= <4400>;
- spi-cpha;
- spi-cpol;
+   spi-max-frequency = <4400>;
+   spi-cpha;
+   spi-cpol;
+
+   status = "okay";
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   port@0 {
+   reg = <0>;
+   label = "lan1";
+   };
+   port@1 {
+   reg = <1>;
+   label = "lan2";
+   };
+   port@2 {
+   reg = <2>;
+   label = "lan3";
+   };
+   port@3 {
+   reg = <3>;
+   label = "lan4";
+   };
+   port@4 {
+   reg = <4>;
+   label = "lan5";
+   };
+   port@5 {
+   reg = <5>;
+   label = "cpu";
+   ethernet = <>;
+   fixed-link {
+   speed = <1000>;
+   full-duplex;
+   };
+   };
+   };
+   };
+   };
 
- status = "okay";
- ports {
- 
#address-cells =

[PATCH RFC 0/6] Modify KSZ9477 DSA driver in preparation to add other KSZ switch drivers.

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

This series of patches is to modify the original KSZ9477 DSA driver so that 
other KSZ switch drivers can be added and use the common code.

This patch set is against net-next.

 drivers/net/dsa/microchip/Makefile |2 +-
 drivers/net/dsa/microchip/ksz9477.c| 1317 
 drivers/net/dsa/microchip/ksz_common.c | 1156 +---
 drivers/net/dsa/microchip/ksz_priv.h   |  105 ++-
 drivers/net/dsa/microchip/ksz_spi.c|   13 +-
 net/dsa/tag_ksz.c  |   40 +-
 6 files changed, 1458 insertions(+), 1175 deletions(-)  create mode 100644 
drivers/net/dsa/microchip/ksz9477.c

[PATCH RFC 4/6] The common header ksz_priv.h does not contain chip specific data

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

The header ksz_priv.h is used by all KSZ switch drivers so chip specific data 
are removed and commonly used variables are added.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz_priv.h 
b/drivers/net/dsa/microchip/ksz_priv.h
index 2a98dbd..343b509 100644
--- a/drivers/net/dsa/microchip/ksz_priv.h
+++ b/drivers/net/dsa/microchip/ksz_priv.h
@@ -25,14 +25,56 @@
 #include 
 #include 
 
-#include "ksz_9477_reg.h"
-
 struct ksz_io_ops;
 
 struct vlan_table {
u32 table[3];
 };
 
+enum {
+   PHY_NO_FLOW_CTRL,
+   PHY_FLOW_CTRL,
+   PHY_TX_ONLY,
+   PHY_RX_ONLY
+};
+
+struct ksz_port_mib {
+   u8 cnt_ptr;
+   u8 ctrl;
+   unsigned long time;
+
+   struct ksz_port_mib_info {
+   u64 counter;
+   u8 read_cnt;
+   u8 read_max;
+   } *info;
+};
+
+struct ksz_port {
+   u16 member;
+   u16 vid_member;
+   int stp_state;
+   u16 speed;
+   u8 duplex;
+   u8 advertised;
+   u8 flow_ctrl;
+   u8 link;
+   u8 partner;
+
+   u32 on:1;   /* port is not disabled by hardware */
+   u32 fiber:1;/* port is fiber */
+   u32 sgmii:1;/* port is SGMII */
+   u32 force:1;
+   u32 link_down:1;/* link just goes down */
+   u32 link_up:1;  /* link is up */
+
+   struct ksz_port_mib mib;
+};
+
+#define USE_FEWER_PORTSBIT(18)
+
+#define PTP_TAGBIT(29)
+
 struct ksz_device {
struct dsa_switch *ds;
struct ksz_platform_data *pdata;
@@ -43,6 +85,7 @@ struct ksz_device {
struct mutex alu_mutex; /* ALU access */
struct mutex vlan_mutex;/* vlan access */
const struct ksz_io_ops *ops;
+   const struct ksz_dev_ops *dev_ops;
 
struct device *dev;
 
@@ -56,10 +99,32 @@ struct ksz_device {
int cpu_port;   /* port connected to CPU */
int cpu_ports;  /* port bitmap can be cpu port */
int port_cnt;
+   int mib_cnt;
+   int mib_port_cnt;
+   int last_port;  /* ports after that not used */
+   int interface;
 
struct vlan_table *vlan_cache;
 
-   u64 mib_value[TOTAL_SWITCH_COUNTER_NUM];
+   u8 *txbuf;
+
+   struct ksz_port *ports;
+   struct timer_list mib_read_timer;
+   struct work_struct mib_read;
+   spinlock_t mib_read_lock;   /* use to update read_cnt */
+   unsigned long MIB_READ_INTERVAL;
+   u32 features;
+   u32 overrides;
+   u16 br_member;
+   u16 member;
+   u16 live_ports;
+   u16 on_ports;   /* ports enabled by DSA */
+   u16 rx_ports;
+   u16 tx_ports;
+   u16 mirror_rx;
+   u16 mirror_tx;
+   u16 HOST_MASK;
+   u16 PORT_MASK;
 };
 
 struct ksz_io_ops {
@@ -75,12 +140,30 @@ struct ksz_io_ops {
  u16 *value);
int (*phy_write16)(struct ksz_device *dev, int addr, int reg,
   u16 value);
+   unsigned int (*get)(struct ksz_device *dev, u32 reg, void *data,
+   unsigned int len);
+   unsigned int (*set)(struct ksz_device *dev, u32 reg, void *data,
+   unsigned int len);
+};
+
+struct ksz_dev_ops {
+   u32 (*get_port_addr)(int port, int offset);
+   int (*reset)(struct ksz_device *dev);
+   int (*get_rx_len)(struct ksz_device *dev);
+   int (*get_tx_len)(struct ksz_device *dev);
+   void (*add_tail_tag)(struct ksz_device *dev, struct sk_buff *skb,
+int port);
+   int (*get_tail_tag)(struct ksz_device *dev, struct sk_buff *skb,
+   int *port);
+   int (*detect)(struct ksz_device *dev);
+   int (*init)(struct ksz_device *dev);
+   void (*exit)(struct ksz_device *dev);
 };
 
 struct ksz_device *ksz_switch_alloc(struct device *base,
const struct ksz_io_ops *ops, void *priv); 
-int ksz_switch_detect(struct ksz_device *dev); -int ksz_switch_register(struct 
ksz_device *dev);
+int ksz_switch_register(struct ksz_device *dev,
+   const struct ksz_dev_ops *ops);
 void ksz_switch_remove(struct ksz_device *dev);
 
 static inline int ksz_read8(struct ksz_device *dev, u32 reg, u8 *val) @@ 
-174,37 +257,37 @@ static inline int ksz_write32(struct ksz_device *dev, u32 
reg, u32 value)  static inline void ksz_pread8(struct ksz_device *dev, int 
port, int offset,
  u8 *data)
 {
-   ksz_read8(dev, PORT_CTRL_ADDR(port, offset), data);
+   ksz_read8(dev, dev->dev_ops->get_port_addr(port, offset), data);
 }
 
 static inline void ksz_pread16(struct ksz_device *dev, int port, int offset,
   u16 *data)
 {
-

[PATCH RFC 1/6] The file ksz_common.c will be used by other KSZ switch drivers.

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Break ksz_common.c into 2 files so that the common code can be used by other 
KSZ switch drivers.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/Makefile 
b/drivers/net/dsa/microchip/Makefile
index ed335e2..0961c30 100644
--- a/drivers/net/dsa/microchip/Makefile
+++ b/drivers/net/dsa/microchip/Makefile
@@ -1,2 +1,2 @@
-obj-$(CONFIG_MICROCHIP_KSZ)+= ksz_common.o
+obj-$(CONFIG_MICROCHIP_KSZ)+= ksz9477.o ksz_common.o
 obj-$(CONFIG_MICROCHIP_KSZ_SPI_DRIVER) += ksz_spi.o

[PATCH RFC 5/6] Switch SPI driver calls its own driver switch register function

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

SPI driver calls own specific switch register function.
Shutdown callback function is added to reset switch to default state.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz_spi.c 
b/drivers/net/dsa/microchip/ksz_spi.c
index c519469..d03eb83 100644
--- a/drivers/net/dsa/microchip/ksz_spi.c
+++ b/drivers/net/dsa/microchip/ksz_spi.c
@@ -25,6 +25,8 @@
 
 #include "ksz_priv.h"
 
+int ksz9477_switch_register(struct ksz_device *dev);
+
 /* SPI frame opcodes */
 #define KS_SPIOP_RD3
 #define KS_SPIOP_WR2
@@ -174,7 +176,7 @@ static int ksz_spi_probe(struct spi_device *spi)
if (spi->dev.platform_data)
dev->pdata = spi->dev.platform_data;
 
-   ret = ksz_switch_register(dev);
+   ret = ksz9477_switch_register(dev);
if (ret)
return ret;
 
@@ -193,6 +195,14 @@ static int ksz_spi_remove(struct spi_device *spi)
return 0;
 }
 
+static void ksz_spi_shutdown(struct spi_device *spi) {
+   struct ksz_device *dev = spi_get_drvdata(spi);
+
+   if (dev)
+   dev->dev_ops->reset(dev);
+}
+
 static const struct of_device_id ksz_dt_ids[] = {
{ .compatible = "microchip,ksz9477" },
{},
@@ -207,6 +217,7 @@ static int ksz_spi_remove(struct spi_device *spi)
},
.probe  = ksz_spi_probe,
.remove = ksz_spi_remove,
+   .shutdown = ksz_spi_shutdown,
 };
 
 module_spi_driver(ksz_spi_driver);

[PATCH RFC 3/6] The file ksz_common.c contains common code shared by all KSZ switch drivers

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

The file ksz_common.c only holds common code used by all KSZ switch drivers.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz_common.c 
b/drivers/net/dsa/microchip/ksz_common.c
index 56cd6d3..bebcc65 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -25,1114 +25,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
 #include "ksz_priv.h"
 
-static const struct {
-   int index;
-   char string[ETH_GSTRING_LEN];
-} mib_names[TOTAL_SWITCH_COUNTER_NUM] = {
-   { 0x00, "rx_hi" },
-   { 0x01, "rx_undersize" },
-   { 0x02, "rx_fragments" },
-   { 0x03, "rx_oversize" },
-   { 0x04, "rx_jabbers" },
-   { 0x05, "rx_symbol_err" },
-   { 0x06, "rx_crc_err" },
-   { 0x07, "rx_align_err" },
-   { 0x08, "rx_mac_ctrl" },
-   { 0x09, "rx_pause" },
-   { 0x0A, "rx_bcast" },
-   { 0x0B, "rx_mcast" },
-   { 0x0C, "rx_ucast" },
-   { 0x0D, "rx_64_or_less" },
-   { 0x0E, "rx_65_127" },
-   { 0x0F, "rx_128_255" },
-   { 0x10, "rx_256_511" },
-   { 0x11, "rx_512_1023" },
-   { 0x12, "rx_1024_1522" },
-   { 0x13, "rx_1523_2000" },
-   { 0x14, "rx_2001" },
-   { 0x15, "tx_hi" },
-   { 0x16, "tx_late_col" },
-   { 0x17, "tx_pause" },
-   { 0x18, "tx_bcast" },
-   { 0x19, "tx_mcast" },
-   { 0x1A, "tx_ucast" },
-   { 0x1B, "tx_deferred" },
-   { 0x1C, "tx_total_col" },
-   { 0x1D, "tx_exc_col" },
-   { 0x1E, "tx_single_col" },
-   { 0x1F, "tx_mult_col" },
-   { 0x80, "rx_total" },
-   { 0x81, "tx_total" },
-   { 0x82, "rx_discards" },
-   { 0x83, "tx_discards" },
-};
-
-static void ksz_cfg(struct ksz_device *dev, u32 addr, u8 bits, bool set) -{
-   u8 data;
-
-   ksz_read8(dev, addr, );
-   if (set)
-   data |= bits;
-   else
-   data &= ~bits;
-   ksz_write8(dev, addr, data);
-}
-
-static void ksz_cfg32(struct ksz_device *dev, u32 addr, u32 bits, bool set) -{
-   u32 data;
-
-   ksz_read32(dev, addr, );
-   if (set)
-   data |= bits;
-   else
-   data &= ~bits;
-   ksz_write32(dev, addr, data);
-}
-
-static void ksz_port_cfg(struct ksz_device *dev, int port, int offset, u8 bits,
-bool set)
-{
-   u32 addr;
-   u8 data;
-
-   addr = PORT_CTRL_ADDR(port, offset);
-   ksz_read8(dev, addr, );
-
-   if (set)
-   data |= bits;
-   else
-   data &= ~bits;
-
-   ksz_write8(dev, addr, data);
-}
-
-static void ksz_port_cfg32(struct ksz_device *dev, int port, int offset,
-  u32 bits, bool set)
-{
-   u32 addr;
-   u32 data;
-
-   addr = PORT_CTRL_ADDR(port, offset);
-   ksz_read32(dev, addr, );
-
-   if (set)
-   data |= bits;
-   else
-   data &= ~bits;
-
-   ksz_write32(dev, addr, data);
-}
-
-static int wait_vlan_ctrl_ready(struct ksz_device *dev, u32 waiton, int 
timeout) -{
-   u8 data;
-
-   do {
-   ksz_read8(dev, REG_SW_VLAN_CTRL, );
-   if (!(data & waiton))
-   break;
-   usleep_range(1, 10);
-   } while (timeout-- > 0);
-
-   if (timeout <= 0)
-   return -ETIMEDOUT;
-
-   return 0;
-}
-
-static int get_vlan_table(struct dsa_switch *ds, u16 vid, u32 *vlan_table) -{
-   struct ksz_device *dev = ds->priv;
-   int ret;
-
-   mutex_lock(>vlan_mutex);
-
-   ksz_write16(dev, REG_SW_VLAN_ENTRY_INDEX__2, vid & VLAN_INDEX_M);
-   ksz_write8(dev, REG_SW_VLAN_CTRL, VLAN_READ | VLAN_START);
-
-   /* wait to be cleared */
-   ret = wait_vlan_ctrl_ready(dev, VLAN_START, 1000);
-   if (ret < 0) {
-   dev_dbg(dev->dev, "Failed to read vlan table\n");
-   goto exit;
-   }
-
-   ksz_read32(dev, REG_SW_VLAN_ENTRY__4, _table[0]);
-   ksz_read32(dev, REG_SW_VLAN_ENTRY_UNTAG__4, _table[1]);
-   ksz_read32(dev, REG_SW_VLAN_ENTRY_PORTS__4, _table[2]);
-
-   ksz_write8(dev, REG_SW_VLAN_CTRL, 0);
-
-exit:
-   mutex_unlock(>vlan_mutex);
-
-   return ret;
-}
-
-static int set_vlan_table(struct dsa_switch *ds, u16 vid, u32 *vlan_table) -{
-   struct ksz_device *dev = ds->priv;
-   int ret;
-
-   mutex_lock(>vlan_mutex);
-
-   ksz_write32(dev, REG_SW_VLAN_ENTRY__4, vlan_table[0]);
-   ksz_write32(dev, REG_SW_VLAN_ENTRY_UNTAG__4, vlan_table[1]);
-   ksz_write32(dev, REG_SW_VLAN_ENTRY_PORTS__4, vlan_table[2]);
-
-   ksz_write16(dev, REG_SW_VLAN_ENTRY_INDEX__2, vid & VLAN_INDEX_M);
-   ksz_write8(dev, REG_SW_VLAN_CTRL, VLAN_START | VLAN_WRITE);
-
-   /* wait to be cleared */
-   ret = wait_vlan_ctrl_ready(dev, VLAN_START, 1000);
-   if (ret < 0) {
-   dev_dbg(dev->dev, "Failed to write

[PATCH RFC 6/6] Modify tag_ksz.c to support other KSZ switch drivers

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

The KSZ tail tag code can be used by different KSZ switch drivers.

Signed-off-by: Tristram Ha 
---
diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c index 010ca0a..d5faf14 100644
--- a/net/dsa/tag_ksz.c
+++ b/net/dsa/tag_ksz.c
@@ -13,35 +13,21 @@
 #include 
 #include 
 #include "dsa_priv.h"
-
-/* For Ingress (Host -> KSZ), 2 bytes are added before FCS.
- * ---
- * DA(6bytes)|SA(6bytes)||Data(nbytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
- * ---
- * tag0 : Prioritization (not used now)
- * tag1 : each bit represents port (eg, 0x01=port1, 0x02=port2, 0x10=port5)
- *
- * For Egress (KSZ -> Host), 1 byte is added before FCS.
- * ---
- * DA(6bytes)|SA(6bytes)||Data(nbytes)|tag0(1byte)|FCS(4bytes)
- * ---
- * tag0 : zero-based value represents port
- *   (eg, 0x00=port1, 0x02=port3, 0x06=port7)
- */
-
-#defineKSZ_INGRESS_TAG_LEN 2
-#defineKSZ_EGRESS_TAG_LEN  1
+#include "../../drivers/net/dsa/microchip/ksz_priv.h"
 
 static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)  {
struct dsa_slave_priv *p = netdev_priv(dev);
+   struct dsa_switch *ds = p->dp->ds;
+   struct ksz_device *swdev = ds->priv;
struct sk_buff *nskb;
+   int len;
int padlen;
-   u8 *tag;
 
padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
 
-   if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
+   len = swdev->dev_ops->get_tx_len(swdev);
+   if (skb_tailroom(skb) >= padlen + len) {
/* Let dsa_slave_xmit() free skb */
if (__skb_put_padto(skb, skb->len + padlen, false))
return NULL;
@@ -49,7 +35,7 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct 
net_device *dev)
nskb = skb;
} else {
nskb = alloc_skb(NET_IP_ALIGN + skb->len +
-padlen + KSZ_INGRESS_TAG_LEN, GFP_ATOMIC);
+padlen + len, GFP_ATOMIC);
if (!nskb)
return NULL;
skb_reserve(nskb, NET_IP_ALIGN);
@@ -70,9 +56,7 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct 
net_device *dev)
consume_skb(skb);
}
 
-   tag = skb_put(nskb, KSZ_INGRESS_TAG_LEN);
-   tag[0] = 0;
-   tag[1] = 1 << p->dp->index; /* destination port */
+   swdev->dev_ops->add_tail_tag(swdev, nskb, p->dp->index);
 
return nskb;
 }
@@ -83,16 +67,16 @@ static struct sk_buff *ksz_rcv(struct sk_buff *skb, struct 
net_device *dev,
struct dsa_switch_tree *dst = dev->dsa_ptr;
struct dsa_port *cpu_dp = dsa_get_cpu_port(dst);
struct dsa_switch *ds = cpu_dp->ds;
-   u8 *tag;
+   struct ksz_device *swdev = ds->priv;
+   int len;
int source_port;
 
-   tag = skb_tail_pointer(skb) - KSZ_EGRESS_TAG_LEN;
+   len = swdev->dev_ops->get_tail_tag(swdev, skb, _port);
 
-   source_port = tag[0] & 7;
if (source_port >= ds->num_ports || !ds->ports[source_port].netdev)
return NULL;
 
-   pskb_trim_rcsum(skb, skb->len - KSZ_EGRESS_TAG_LEN);
+   pskb_trim_rcsum(skb, skb->len - len);
 
skb->dev = ds->ports[source_port].netdev;

[PATCH RFC 2/6] Create new file ksz9477.c from KSZ9477 code in ksz_common.c

2017-09-07 Thread Tristram.Ha

From: Tristram Ha 

Create new ksz9477.c file from original ksz_common.c.

Signed-off-by: Tristram Ha 
---
diff --git a/drivers/net/dsa/microchip/ksz9477.c 
b/drivers/net/dsa/microchip/ksz9477.c
new file mode 100644
index 000..bc722b4
--- /dev/null
+++ b/drivers/net/dsa/microchip/ksz9477.c
@@ -0,0 +1,1317 @@
+/*
+ * Microchip switch driver main logic
+ *
+ * Copyright (C) 2017
+ *
+ * Permission to use, copy, modify, and/or distribute this software for 
+any
+ * purpose with or without fee is hereby granted, provided that the 
+above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 
+WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE 
+FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY 
+DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN 
+AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT 
+OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "ksz_priv.h"
+#include "ksz_9477_reg.h"
+
+static const struct {
+   int index;
+   char string[ETH_GSTRING_LEN];
+} mib_names[TOTAL_SWITCH_COUNTER_NUM] = {
+   { 0x00, "rx_hi" },
+   { 0x01, "rx_undersize" },
+   { 0x02, "rx_fragments" },
+   { 0x03, "rx_oversize" },
+   { 0x04, "rx_jabbers" },
+   { 0x05, "rx_symbol_err" },
+   { 0x06, "rx_crc_err" },
+   { 0x07, "rx_align_err" },
+   { 0x08, "rx_mac_ctrl" },
+   { 0x09, "rx_pause" },
+   { 0x0A, "rx_bcast" },
+   { 0x0B, "rx_mcast" },
+   { 0x0C, "rx_ucast" },
+   { 0x0D, "rx_64_or_less" },
+   { 0x0E, "rx_65_127" },
+   { 0x0F, "rx_128_255" },
+   { 0x10, "rx_256_511" },
+   { 0x11, "rx_512_1023" },
+   { 0x12, "rx_1024_1522" },
+   { 0x13, "rx_1523_2000" },
+   { 0x14, "rx_2001" },
+   { 0x15, "tx_hi" },
+   { 0x16, "tx_late_col" },
+   { 0x17, "tx_pause" },
+   { 0x18, "tx_bcast" },
+   { 0x19, "tx_mcast" },
+   { 0x1A, "tx_ucast" },
+   { 0x1B, "tx_deferred" },
+   { 0x1C, "tx_total_col" },
+   { 0x1D, "tx_exc_col" },
+   { 0x1E, "tx_single_col" },
+   { 0x1F, "tx_mult_col" },
+   { 0x80, "rx_total" },
+   { 0x81, "tx_total" },
+   { 0x82, "rx_discards" },
+   { 0x83, "tx_discards" },
+};
+
+static void ksz_cfg(struct ksz_device *dev, u32 addr, u8 bits, bool 
+set) {
+   u8 data;
+
+   ksz_read8(dev, addr, );
+   if (set)
+   data |= bits;
+   else
+   data &= ~bits;
+   ksz_write8(dev, addr, data);
+}
+
+static void ksz_cfg32(struct ksz_device *dev, u32 addr, u32 bits, bool 
+set) {
+   u32 data;
+
+   ksz_read32(dev, addr, );
+   if (set)
+   data |= bits;
+   else
+   data &= ~bits;
+   ksz_write32(dev, addr, data);
+}
+
+static void ksz_port_cfg(struct ksz_device *dev, int port, int offset, u8 bits,
+bool set)
+{
+   u32 addr;
+   u8 data;
+
+   addr = PORT_CTRL_ADDR(port, offset);
+   ksz_read8(dev, addr, );
+
+   if (set)
+   data |= bits;
+   else
+   data &= ~bits;
+
+   ksz_write8(dev, addr, data);
+}
+
+static void ksz_port_cfg32(struct ksz_device *dev, int port, int offset,
+  u32 bits, bool set)
+{
+   u32 addr;
+   u32 data;
+
+   addr = PORT_CTRL_ADDR(port, offset);
+   ksz_read32(dev, addr, );
+
+   if (set)
+   data |= bits;
+   else
+   data &= ~bits;
+
+   ksz_write32(dev, addr, data);
+}
+
+static int wait_vlan_ctrl_ready(struct ksz_device *dev, u32 waiton, int 
+timeout) {
+   u8 data;
+
+   do {
+   ksz_read8(dev, REG_SW_VLAN_CTRL, );
+   if (!(data & waiton))
+   break;
+   usleep_range(1, 10);
+   } while (timeout-- > 0);
+
+   if (timeout <= 0)
+   return -ETIMEDOUT;
+
+   return 0;
+}
+
+static int get_vlan_table(struct dsa_switch *ds, u16 vid, u32 
+*vlan_table) {
+   struct ksz_device *dev = ds->priv;
+   int ret;
+
+   mutex_lock(>vlan_mutex);
+
+   ksz_write16(dev, REG_SW_VLAN_ENTRY_INDEX__2, vid & VLAN_INDEX_M);
+   ksz_write8(dev, REG_SW_VLAN_CTRL, VLAN_READ | VLAN_START);
+
+   /* wait to be cleared */
+   ret = wait_vlan_ctrl_ready(dev, VLAN_START, 1000);
+   if (ret < 0) {
+   dev_dbg(dev->dev, "Failed to read vlan table\n");
+   goto exit;
+   }
+
+   ksz_read32(dev, REG_SW_VLAN_ENTRY__4, _table[0]);
+   ksz_read32(dev, REG_SW_VLAN_ENTRY_UNTAG__4, _table[1]);
+

Re: [Patch net v2 2/2] net_sched: fix all the madness of tc filter chain

2017-09-07 Thread Jiri Pirko

Thu, Sep 07, 2017 at 07:45:49PM CEST, xiyou.wangc...@gmail.com wrote:
>On Wed, Sep 6, 2017 at 11:32 PM, Jiri Pirko  wrote:
>> Thu, Sep 07, 2017 at 06:26:07AM CEST, xiyou.wangc...@gmail.com wrote:
>>>This patch fixes the following madness of tc filter chain:
>>
>> Could you avoid expressive words like "madness" and such?
>> Please be technical.
>>
>
>If the following 2a) 2b) 2c) 2d) are not enough to show the madness,
>I don't know any other to show it. Madness is for code, not for you
>or any other person, so 100% technical.

We hav eto agree to disagree. You want to be expressive, apparently
there is no way to talk you out of that. Sigh...


>
>>
>>>
>>>1) tcf_chain_destroy() is called by both tcf_block_put() and
>>>   tcf_chain_put().  tcf_chain_put() is correctly refcnt'ed and paired
>>>   with tcf_chain_get(), but tcf_block_put() is not, it should be paired
>>>   with tcf_block_get() which means we still need to decrease the refcnt.
>>>   Think it in another way: if we call tcf_bock_put() immediately after
>>>   tcf_block_get(), could we get effectively a nop? This causes a memory
>>>   leak as reported by Jakub.
>>>
>>>2) tp proto should hold a refcnt to the chain too. This significantly
>>>   simplifies the logic:
>>>
>>>2a) Chain 0 is no longer special, it is created and refcnted by tp
>>>like any other chains. All the ugliness in tcf_chain_put() can be
>>>gone!
>>>
>>>2b) No need to handle the flushing oddly, because block still holds
>>>chain 0, it can not be released, this guarantees block is the last
>>>user.
>>>
>>>2c) The race condition with RCU callbacks is easier to handle with just
>>>a rcu_barrier()! Much easier to understand, nothing to hide! Thanks
>>>to the previous patch. Please see also the comments in code.
>>>
>>>2d) Make the code understandable by humans, much less error-prone.
>>>
>>>Fixes: 744a4cf63e52 ("net: sched: fix use after free when tcf_chain_destroy 
>>>is called multiple times")
>>>Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters")
>>>Reported-by: Jakub Kicinski 
>>>Cc: Jiri Pirko 
>>>Signed-off-by: Cong Wang 
>>>---
>>> net/sched/cls_api.c | 38 ++
>>> 1 file changed, 22 insertions(+), 16 deletions(-)
>>>
>>>diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>index 6c5ea84d2682..e9060dc36519 100644
>>>--- a/net/sched/cls_api.c
>>>+++ b/net/sched/cls_api.c
>>>@@ -209,21 +209,20 @@ static void tcf_chain_flush(struct tcf_chain *chain)
>>>   RCU_INIT_POINTER(*chain->p_filter_chain, NULL);
>>>   while ((tp = rtnl_dereference(chain->filter_chain)) != NULL) {
>>>   RCU_INIT_POINTER(chain->filter_chain, tp->next);
>>>+  tcf_chain_put(chain);
>>>   tcf_proto_destroy(tp);
>>>   }
>>> }
>>>
>>> static void tcf_chain_destroy(struct tcf_chain *chain)
>>> {
>>>-  /* May be already removed from the list by the previous call. */
>>>-  if (!list_empty(>list))
>>>-  list_del_init(>list);
>>>+  list_del(>list);
>>>+  kfree(chain);
>>>+}
>>>
>>>-  /* There might still be a reference held when we got here from
>>>-   * tcf_block_put. Wait for the user to drop reference before free.
>>>-   */
>>>-  if (!chain->refcnt)
>>>-  kfree(chain);
>>>+static void tcf_chain_hold(struct tcf_chain *chain)
>>>+{
>>>+  ++chain->refcnt;
>>> }
>>>
>>> struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
>>>@@ -233,7 +232,7 @@ struct tcf_chain *tcf_chain_get(struct tcf_block *block, 
>>>u32 chain_index,
>>>
>>>   list_for_each_entry(chain, >chain_list, list) {
>>>   if (chain->index == chain_index) {
>>>-  chain->refcnt++;
>>>+  tcf_chain_hold(chain);
>>>   return chain;
>>>   }
>>>   }
>>>@@ -246,10 +245,7 @@ EXPORT_SYMBOL(tcf_chain_get);
>>>
>>> void tcf_chain_put(struct tcf_chain *chain)
>>> {
>>>-  /* Destroy unused chain, with exception of chain 0, which is the
>>>-   * default one and has to be always present.
>>>-   */
>>>-  if (--chain->refcnt == 0 && !chain->filter_chain && chain->index != 0)
>>>+  if (--chain->refcnt == 0)
>>
>> Okay, so you take the reference for every goto_chain action and every
>> tp, right? Note that for chain 0, you hold one more reference (due to
>> the creation). That is probably ok as we need chain 0 not to go away
>> even if all tps and goto_chain actions are gone.
>
>Yeah, this is the core of the patch.
>
>>
>>
>>>   tcf_chain_destroy(chain);
>>> }
>>> EXPORT_SYMBOL(tcf_chain_put);
>>>@@ -294,10 +290,18 @@ void tcf_block_put(struct tcf_block *block)
>>>   if (!block)
>>>   return;
>>>
>>>-  list_for_each_entry_safe(chain, tmp, >chain_list, list) {
>>>+  /* Standalone actions are not allowed to jump to any chain,

Re: [RFC PATCH v3 0/6] Configuring traffic classes via new hardware offload mechanism in tc/mqprio

2017-09-07 Thread Nambiar, Amritha

On 9/7/2017 11:34 AM, Florian Fainelli wrote:
> On 09/07/2017 04:00 AM, Amritha Nambiar wrote:
>> The following series introduces a new hardware offload mode in 
>> tc/mqprio where the TCs, the queue configurations and bandwidth 
>> rate limits are offloaded to the hardware. The existing mqprio 
>> framework is extended to configure the queue counts and layout and
>>  also added support for rate limiting. This is achieved through new
>>  netlink attributes for the 'mode' option which takes values such
>> as 'dcb' (default) and 'channel' and a 'shaper' option for QoS 
>> attributes such as bandwidth rate limits in hw mode 1.
> 
> So "dcb" defines a default priorities to queue mapping?

In the default offload implementation, only the basic hw offload was
supported factoring only the 'number of TCs' values. The rest of the
queue configuration was being ignored. This is the legacy behavior with
hw mode set to 1.
Example:
# tc qdisc add dev eth0 root mqprio num_tc 4  map 0 0 0 0 1 1 1 1 queues
4@0 4@4 hw 1
I just named this default behavior to 'dcb' while introducing a new
offload mechanism.

> 
>> Legacy devices can fall back to the existing setup supporting hw 
>> mode 1 without these additional options where only the TCs are 
>> offloaded and then the 'mode' and 'shaper' options defaults to DCB
>>  support.
> 
> That's the last part that confuses me, see below.

As I introduced new options for 'mode' and 'shaper', I set the defaults
for these options to 'dcb' so existing offloaders can continue to work
without supporting these new options. Patch 1 has a detailed description
on how this is done.

> 
>> The i40e driver enables the new mqprio hardware offload mechanism 
>> factoring the TCs, queue configuration and bandwidth rates by 
>> creating HW channel VSIs.
> 
> I am really confused by what you call hw_mode 1, as I understand it 
> there are really 3 different modes:

There are actually 2 modes now with 'hw' option set to 1, legacy/dcb and
channel.
> 
> - legacy: you don't define any traffic class mapping, but you can 
> still chain this scheduler with a match + action (like what 
> Documentation/networking/multiqueue.txt) you can optionally also add 
> "shaper" arguments, but there should not be any default DCB queue 
> mapping either?
> 
> - dcb: a default mapping for traffic classes to queues is defined, 
> optional "shaper" arguments

The legacy mode now becomes the dcb mode. In this mode, although the TC
values, the queue configurations, prio-tc-mapping are all offloaded to
the device, the existing implementation in current drivers support only
a basic hw offload factoring only the TC values.

Examples:
# ... num_tc 2  map 0 0 0 0 1 1 1 1 queues 4@0 4@4 hw 1
# ... num_tc 2  map 0 0 0 0 1 1 1 1 queues 4@0 4@4 hw 1 mode dcb
# ... num_tc 2  map 0 0 0 0 1 1 1 1 queues 4@0 4@4 hw 1 mode dcb\
  shaper dcb

> 
> - channel: (maybe calling that "custom_tc_map" would be clearer?) 
> where you express the exact traffic classes to queue mapping and 
> optional "shaper" arguments

In the channel mode, a full hw offload is supported, the TC values, the
queue configurations and additionally QoS attributes (optional) are all
used in the new implementation.

Examples:
# ... num_tc 2  map 0 0 0 0 1 1 1 1 queues 4@0 4@4 hw 1 mode channel
# ... num_tc 2  map 0 0 0 0 1 1 1 1 queues 4@0 4@4 hw 1 mode channel\
shaper bw_rlimit max_rate 4Gbit 5Gbit

> 
> I think that's what you are doing, but I just got confused by the 
> cover letter.
> 
>> 
>> In this new mode, the priority to traffic class mapping and the 
>> user specified queue ranges are used to configure the traffic class
>> when the 'mode' option is set to 'channel'. This is achieved by
>> creating HW channels(VSI). A new channel is created for each of the
>> traffic class configuration offloaded via mqprio framework except
>> for the first TC (TC0) which is for the main VSI. TC0 for the main
>> VSI is also reconfigured as per user provided queue parameters.
>> Finally, bandwidth rate limits are set on these traffic classes
>> through the shaper attribute by sending these rates in addition to
>> the number of TCs and the queue configurations.
>> 
>> Example: # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1
>> 1 1 1\ queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\
> 
> Do you see a case where you can declare a different number of
> traffic classes say 4 and map them onto just 2 hardware queues? If
> not, it seems a tiny bit redundant to have to specify both the map
> and the queue mapping should be sufficient, right?

This will be subjected to validation of the user input and will be
treated as invalid configuration. The 'map' specifies the mapping
between user priorities and traffic classes while the queue mapping is
the queue layout specifying the queue count and offsets.

> 
>> min_rate 1Gbit 2Gbit max_rate 4Gbit 5Gbit
>> 
>> To dump the bandwidth rates:
>> 
>> # tc qdisc show dev eth0
>> 
>> qdisc mqprio 804a: root  tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0

Re: [RFC net-next 0/5] TSN: Add qdisc-based config interfaces for traffic shapers

2017-09-07 Thread Guedes, Andre

Hi Henrik,

Thanks for your feedback! I'll address some of your comments below.

On Thu, 2017-09-07 at 07:34 +0200, Henrik Austad wrote:
> > As for the shapers config interface:
> > 
> >  * CBS (802.1Qav)
> > 
> >    This patchset is proposing a new qdisc called 'cbs'. Its 'tc' cmd line
> > is:
> >    $ tc qdisc add dev IFACE parent ID cbs locredit N hicredit M sendslope S
> > \
> >  idleslope I
> 
> So this confuses me a bit, why specify sendSlope?
> 
> sendSlope = portTransmitRate - idleSlope
> 
> and portTransmitRate is the speed of the MAC (which you get from the 
> driver). Adding sendSlope here is just redundant I think.

Yes, this was something we've spent quite a few time discussing before this RFC
series. After reading the Annex L from 802.1Q-2014 (operation of CBS algorithm)
so many times, we've came up with the rationale explained below.

The rationale here is that sendSlope is just another parameter from CBS
algorithm like idleSlope, hiCredit and loCredit. As such, its calculation
should be done at the same "layer" as the others parameters (in this case, user
space) in order to keep consistency. Moreover, in this design, the driver layer
is dead simple: all the device driver has to do is applying CBS parameters to
hardware. Having any CBS parameter calculation in the driver layer means all
device drivers must implement that calculation.

> Also, does this mean that when you create the qdisc, you have locked the 
> bandwidth for the scheduler? Meaning, if I later want to add another 
> stream that requires more bandwidth, I have to close all active streams, 
> reconfigure the qdisc and then restart?

If we want to reserve more bandwidth to "accommodate" a new stream, we don't
need to close all active streams. All we have to do is changing the CBS qdisc
and pass the new CBS parameters. Here is what the command-line would look like:

$ tc qdisc change dev enp0s4 parent 8001:5 cbs locredit -1470 hicredit 30
sendslope -98 idleslope 2

No application/stream is interrupted while new CBS parameters are applied.

> >    Note that the parameters for this qdisc are the ones defined by the
> >    802.1Q-2014 spec, so no hardware specific functionality is exposed here.
> 
> You do need to know if the link is brought up as 100 or 1000 though - which 
> the driver already knows.

User space knows that information via ethtool or /sys.

> > Testing this RFC
> > 
> > 
> > For testing the patches of this RFC only, you can refer to the samples and
> > helper script being added to samples/tsn/ and the use the 'mqprio' qdisc to
> > setup the priorities to Tx queues mapping, together with the 'cbs' qdisc to
> > configure the HW shaper of the i210 controller:
> 
> I will test it, feedback will be provided soon! :)

That's great! Please let us know if you find any issue and thanks for you
support.

> > 8) You can also run a Talker for class B (prio 2 here)
> > $ ./talker -i enp3s0 -p 2
> > 
> >  * The bandwidth displayed on the listener output now should increase to
> > very
> >    close to the one configured for class A + class B.
> 
> Because you grab both class A *and* B, or because B will eat what A does 
> not use?

Because the listener application grabs both class A and B traffic.

Regards,

Andre

smime.p7s
Description: S/MIME cryptographic signature

[PATCH] net: tulip: Constify tulip_tbl

2017-09-07 Thread Kees Cook

It looks like all users of tulip_tbl are reads, so mark this table
as read-only.

$ git grep tulip_tbl  # edited to avoid line-wraps...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs&~RxPollInt, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs | TimerInt,
pnic.c:  iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
tulip.h: extern struct tulip_chip_table tulip_tbl[];
tulip_core.c:struct tulip_chip_table tulip_tbl[] = {
tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR5);
tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
tulip_core.c:setup_timer(>timer, tulip_tbl[tp->chip_id].media_timer,
tulip_core.c:const char *chip_name = tulip_tbl[chip_idx].chip_name;
tulip_core.c:if (pci_resource_len (pdev, 0) < tulip_tbl[chip_idx].io_size)
tulip_core.c:ioaddr =  pci_iomap(..., tulip_tbl[chip_idx].io_size);
tulip_core.c:tp->flags = tulip_tbl[chip_idx].flags;
tulip_core.c:setup_timer(>timer, tulip_tbl[tp->chip_id].media_timer,
tulip_core.c:INIT_WORK(>media_work, tulip_tbl[tp->chip_id].media_task);

Cc: "David S. Miller" 
Cc: Jarod Wilson 
Cc: "Gustavo A. R. Silva" 
Cc: netdev@vger.kernel.org
Cc: linux-par...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/ethernet/dec/tulip/tulip.h  | 2 +-
 drivers/net/ethernet/dec/tulip/tulip_core.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/dec/tulip/tulip.h 
b/drivers/net/ethernet/dec/tulip/tulip.h
index 38431a155f09..06660dbc44b7 100644
--- a/drivers/net/ethernet/dec/tulip/tulip.h
+++ b/drivers/net/ethernet/dec/tulip/tulip.h
@@ -515,7 +515,7 @@ void comet_timer(unsigned long data);
 extern int tulip_debug;
 extern const char * const medianame[];
 extern const char tulip_media_cap[];
-extern struct tulip_chip_table tulip_tbl[];
+extern const struct tulip_chip_table tulip_tbl[];
 void oom_timer(unsigned long data);
 extern u8 t21040_csr13[];
 
diff --git a/drivers/net/ethernet/dec/tulip/tulip_core.c 
b/drivers/net/ethernet/dec/tulip/tulip_core.c
index 84394b43c0a1..851b6d1f5a42 100644
--- a/drivers/net/ethernet/dec/tulip/tulip_core.c
+++ b/drivers/net/ethernet/dec/tulip/tulip_core.c
@@ -138,7 +138,7 @@ static void tulip_timer(unsigned long data)
  * It is indexed via the values in 'enum chips'
  */
 
-struct tulip_chip_table tulip_tbl[] = {
+const struct tulip_chip_table tulip_tbl[] = {
   { }, /* placeholder for array, slot unused currently */
   { }, /* placeholder for array, slot unused currently */
 
-- 
2.7.4


-- 
Kees Cook
Pixel Security

Re: [RFC PATCH v3 0/6] Configuring traffic classes via new hardware offload mechanism in tc/mqprio

2017-09-07 Thread Florian Fainelli

On 09/07/2017 04:00 AM, Amritha Nambiar wrote:
> The following series introduces a new hardware offload mode in
> tc/mqprio where the TCs, the queue configurations and
> bandwidth rate limits are offloaded to the hardware. The existing
> mqprio framework is extended to configure the queue counts and
> layout and also added support for rate limiting. This is achieved
> through new netlink attributes for the 'mode' option which takes
> values such as 'dcb' (default) and 'channel' and a 'shaper' option
> for QoS attributes such as bandwidth rate limits in hw mode 1.

So "dcb" defines a default priorities to queue mapping?

> Legacy devices can fall back to the existing setup supporting hw mode
> 1 without these additional options where only the TCs are offloaded
> and then the 'mode' and 'shaper' options defaults to DCB support.

That's the last part that confuses me, see below.

> The i40e driver enables the new mqprio hardware offload mechanism
> factoring the TCs, queue configuration and bandwidth rates by
> creating HW channel VSIs.

I am really confused by what you call hw_mode 1, as I understand it
there are really 3 different modes:

- legacy: you don't define any traffic class mapping, but you can still
chain this scheduler with a match + action (like what
Documentation/networking/multiqueue.txt) you can optionally also add
"shaper" arguments, but there should not be any default DCB queue
mapping either?

- dcb: a default mapping for traffic classes to queues is defined,
optional "shaper" arguments

- channel: (maybe calling that "custom_tc_map" would be clearer?) where
you express the exact traffic classes to queue mapping and optional
"shaper" arguments

I think that's what you are doing, but I just got confused by the cover
letter.

> 
> In this new mode, the priority to traffic class mapping and the
> user specified queue ranges are used to configure the traffic
> class when the 'mode' option is set to 'channel'. This is achieved by
> creating HW channels(VSI). A new channel is created for each of the
> traffic class configuration offloaded via mqprio framework except for
> the first TC (TC0) which is for the main VSI. TC0 for the main VSI is
> also reconfigured as per user provided queue parameters. Finally,
> bandwidth rate limits are set on these traffic classes through the
> shaper attribute by sending these rates in addition to the number of
> TCs and the queue configurations.
> 
> Example:
> # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\
>   queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\

Do you see a case where you can declare a different number of traffic
classes say 4 and map them onto just 2 hardware queues? If not, it seems
a tiny bit redundant to have to specify both the map and the queue
mapping should be sufficient, right?

>   min_rate 1Gbit 2Gbit max_rate 4Gbit 5Gbit
> 
> To dump the bandwidth rates:
> 
> # tc qdisc show dev eth0
> 
> qdisc mqprio 804a: root  tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
>  queues:(0:3) (4:7)
>  mode:channel
>  shaper:bw_rlimit   min_rate:1Gbit 2Gbit   max_rate:4Gbit 
> 5Gbit

I am not well versed into tc, but being able to specify "shaper"
arguments has actually value outside of just the multiq scheduler and it
could probably be an action on its own?

> 
> ---
> 
> Amritha Nambiar (6):
>   mqprio: Introduce new hardware offload mode and shaper in mqprio
>   i40e: Add macro for PF reset bit
>   i40e: Add infrastructure for queue channel support
>   i40e: Enable 'channel' mode in mqprio for TC configs
>   i40e: Refactor VF BW rate limiting
>   i40e: Add support setting TC max bandwidth rates
> 
> 
>  drivers/net/ethernet/intel/i40e/i40e.h |   44 +
>  drivers/net/ethernet/intel/i40e/i40e_debugfs.c |3 
>  drivers/net/ethernet/intel/i40e/i40e_ethtool.c |8 
>  drivers/net/ethernet/intel/i40e/i40e_main.c| 1463 
> +---
>  drivers/net/ethernet/intel/i40e/i40e_txrx.h|2 
>  drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   50 -
>  include/net/pkt_cls.h  |9 
>  include/uapi/linux/pkt_sched.h |   32 
>  net/sched/sch_mqprio.c |  183 ++-
>  9 files changed, 1551 insertions(+), 243 deletions(-)
> 
> --
> 


-- 
Florian

Re: [Patch net v2 2/2] net_sched: fix all the madness of tc filter chain

2017-09-07 Thread Cong Wang

On Wed, Sep 6, 2017 at 11:32 PM, Jiri Pirko  wrote:
> Thu, Sep 07, 2017 at 06:26:07AM CEST, xiyou.wangc...@gmail.com wrote:
>>This patch fixes the following madness of tc filter chain:
>
> Could you avoid expressive words like "madness" and such?
> Please be technical.
>

If the following 2a) 2b) 2c) 2d) are not enough to show the madness,
I don't know any other to show it. Madness is for code, not for you
or any other person, so 100% technical.

>
>>
>>1) tcf_chain_destroy() is called by both tcf_block_put() and
>>   tcf_chain_put().  tcf_chain_put() is correctly refcnt'ed and paired
>>   with tcf_chain_get(), but tcf_block_put() is not, it should be paired
>>   with tcf_block_get() which means we still need to decrease the refcnt.
>>   Think it in another way: if we call tcf_bock_put() immediately after
>>   tcf_block_get(), could we get effectively a nop? This causes a memory
>>   leak as reported by Jakub.
>>
>>2) tp proto should hold a refcnt to the chain too. This significantly
>>   simplifies the logic:
>>
>>2a) Chain 0 is no longer special, it is created and refcnted by tp
>>like any other chains. All the ugliness in tcf_chain_put() can be
>>gone!
>>
>>2b) No need to handle the flushing oddly, because block still holds
>>chain 0, it can not be released, this guarantees block is the last
>>user.
>>
>>2c) The race condition with RCU callbacks is easier to handle with just
>>a rcu_barrier()! Much easier to understand, nothing to hide! Thanks
>>to the previous patch. Please see also the comments in code.
>>
>>2d) Make the code understandable by humans, much less error-prone.
>>
>>Fixes: 744a4cf63e52 ("net: sched: fix use after free when tcf_chain_destroy 
>>is called multiple times")
>>Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters")
>>Reported-by: Jakub Kicinski 
>>Cc: Jiri Pirko 
>>Signed-off-by: Cong Wang 
>>---
>> net/sched/cls_api.c | 38 ++
>> 1 file changed, 22 insertions(+), 16 deletions(-)
>>
>>diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>index 6c5ea84d2682..e9060dc36519 100644
>>--- a/net/sched/cls_api.c
>>+++ b/net/sched/cls_api.c
>>@@ -209,21 +209,20 @@ static void tcf_chain_flush(struct tcf_chain *chain)
>>   RCU_INIT_POINTER(*chain->p_filter_chain, NULL);
>>   while ((tp = rtnl_dereference(chain->filter_chain)) != NULL) {
>>   RCU_INIT_POINTER(chain->filter_chain, tp->next);
>>+  tcf_chain_put(chain);
>>   tcf_proto_destroy(tp);
>>   }
>> }
>>
>> static void tcf_chain_destroy(struct tcf_chain *chain)
>> {
>>-  /* May be already removed from the list by the previous call. */
>>-  if (!list_empty(>list))
>>-  list_del_init(>list);
>>+  list_del(>list);
>>+  kfree(chain);
>>+}
>>
>>-  /* There might still be a reference held when we got here from
>>-   * tcf_block_put. Wait for the user to drop reference before free.
>>-   */
>>-  if (!chain->refcnt)
>>-  kfree(chain);
>>+static void tcf_chain_hold(struct tcf_chain *chain)
>>+{
>>+  ++chain->refcnt;
>> }
>>
>> struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
>>@@ -233,7 +232,7 @@ struct tcf_chain *tcf_chain_get(struct tcf_block *block, 
>>u32 chain_index,
>>
>>   list_for_each_entry(chain, >chain_list, list) {
>>   if (chain->index == chain_index) {
>>-  chain->refcnt++;
>>+  tcf_chain_hold(chain);
>>   return chain;
>>   }
>>   }
>>@@ -246,10 +245,7 @@ EXPORT_SYMBOL(tcf_chain_get);
>>
>> void tcf_chain_put(struct tcf_chain *chain)
>> {
>>-  /* Destroy unused chain, with exception of chain 0, which is the
>>-   * default one and has to be always present.
>>-   */
>>-  if (--chain->refcnt == 0 && !chain->filter_chain && chain->index != 0)
>>+  if (--chain->refcnt == 0)
>
> Okay, so you take the reference for every goto_chain action and every
> tp, right? Note that for chain 0, you hold one more reference (due to
> the creation). That is probably ok as we need chain 0 not to go away
> even if all tps and goto_chain actions are gone.

Yeah, this is the core of the patch.

>
>
>>   tcf_chain_destroy(chain);
>> }
>> EXPORT_SYMBOL(tcf_chain_put);
>>@@ -294,10 +290,18 @@ void tcf_block_put(struct tcf_block *block)
>>   if (!block)
>>   return;
>>
>>-  list_for_each_entry_safe(chain, tmp, >chain_list, list) {
>>+  /* Standalone actions are not allowed to jump to any chain, and
>>+   * bound actions should be all removed after flushing. However,
>>+   * filters are destroyed in RCU callbacks, we have to flush and wait
>>+   * for them before releasing this refcnt, otherwise we race with RCU
>>+   * callbacks!!!
>
> Why the "!!!"? Please avoid that. Not necessary

Re: [Intel-wired-lan] [RFC PATCH v3 0/6] Configuring traffic classes via new hardware offload mechanism in tc/mqprio

2017-09-07 Thread Shannon Nelson


On 9/7/2017 10:22 AM, Nambiar, Amritha wrote:

On 9/7/2017 9:45 AM, Shannon Nelson wrote:

[...]


It would be nice to know what has changed since the last review, either
summarized here or in the individual patch files. 

[...]


For all those patch files that have changed since the last revision, I
have captured all the new changes in the section titled "v3: " of each
patch file. 


Mea culpa.

ugh.
sln

Re: [Intel-wired-lan] [RFC PATCH v3 0/6] Configuring traffic classes via new hardware offload mechanism in tc/mqprio

2017-09-07 Thread Nambiar, Amritha

On 9/7/2017 9:45 AM, Shannon Nelson wrote:
> On 9/7/2017 4:00 AM, Amritha Nambiar wrote:
>> The following series introduces a new hardware offload mode in
>> tc/mqprio where the TCs, the queue configurations and
>> bandwidth rate limits are offloaded to the hardware. The existing
>> mqprio framework is extended to configure the queue counts and
>> layout and also added support for rate limiting. This is achieved
>> through new netlink attributes for the 'mode' option which takes
>> values such as 'dcb' (default) and 'channel' and a 'shaper' option
>> for QoS attributes such as bandwidth rate limits in hw mode 1.
>> Legacy devices can fall back to the existing setup supporting hw mode
>> 1 without these additional options where only the TCs are offloaded
>> and then the 'mode' and 'shaper' options defaults to DCB support.
>> The i40e driver enables the new mqprio hardware offload mechanism
>> factoring the TCs, queue configuration and bandwidth rates by
>> creating HW channel VSIs.
>>
>> In this new mode, the priority to traffic class mapping and the
>> user specified queue ranges are used to configure the traffic
>> class when the 'mode' option is set to 'channel'. This is achieved by
>> creating HW channels(VSI). A new channel is created for each of the
>> traffic class configuration offloaded via mqprio framework except for
>> the first TC (TC0) which is for the main VSI. TC0 for the main VSI is
>> also reconfigured as per user provided queue parameters. Finally,
>> bandwidth rate limits are set on these traffic classes through the
>> shaper attribute by sending these rates in addition to the number of
>> TCs and the queue configurations.
>>
>> Example:
>>  # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\
>>queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\
>>min_rate 1Gbit 2Gbit max_rate 4Gbit 5Gbit
>>
>>  To dump the bandwidth rates:
>>
>>  # tc qdisc show dev eth0
>>
>>  qdisc mqprio 804a: root  tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
>>   queues:(0:3) (4:7)
>>   mode:channel
>>   shaper:bw_rlimit   min_rate:1Gbit 2Gbit   max_rate:4Gbit 
>> 5Gbit
>>
>> ---
>>
>> Amritha Nambiar (6):
>>mqprio: Introduce new hardware offload mode and shaper in mqprio
>>i40e: Add macro for PF reset bit
>>i40e: Add infrastructure for queue channel support
>>i40e: Enable 'channel' mode in mqprio for TC configs
>>i40e: Refactor VF BW rate limiting
>>i40e: Add support setting TC max bandwidth rates
>>
> 
> It would be nice to know what has changed since the last review, either 
> summarized here or in the individual patch files.  This helps in knowing 
> how much attention should be given to this new set of patches, and 
> encourages further review.  I don't remember seeing any responses to my 
> previous comments, and it looks like not all of them were acted upon.

For all those patch files that have changed since the last revision, I
have captured all the new changes in the section titled "v3: " of each
patch file. Also, I had replied to most of your comments that they'll be
fixed in v3 and the one in the mqprio patch that error handling was
already being done and these responses can be verified in patchwork. I
missed to respond to your comment regarding supporting macvlan offloads
through these channels, that will not be a part of this series and will
be looked upon at a later time in a subsequent patch series.

> 
> sln
> 
>>
>>   drivers/net/ethernet/intel/i40e/i40e.h |   44 +
>>   drivers/net/ethernet/intel/i40e/i40e_debugfs.c |3
>>   drivers/net/ethernet/intel/i40e/i40e_ethtool.c |8
>>   drivers/net/ethernet/intel/i40e/i40e_main.c| 1463 
>> +---
>>   drivers/net/ethernet/intel/i40e/i40e_txrx.h|2
>>   drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   50 -
>>   include/net/pkt_cls.h  |9
>>   include/uapi/linux/pkt_sched.h |   32
>>   net/sched/sch_mqprio.c |  183 ++-
>>   9 files changed, 1551 insertions(+), 243 deletions(-)
>>
>> --
>> ___
>> Intel-wired-lan mailing list
>> intel-wired-...@osuosl.org
>> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>>

Re: Bad escapes in ip -online

2017-09-07 Thread Stephen Hemminger

On Thu, 7 Sep 2017 12:08:56 -0400
John Kodis  wrote:

> The -online option to the 'ip link', 'ip addr', and perhaps others is 
> putting out a backslash in places where a newline character should go, 
> as so:
> 
> 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN mode 
> DEFAULT group default qlen 1000\link/loopback 00:00:00:00:00:00 brd 
> 00:00:00:00:00:00
> 
> -- John.
> 

The backslash is the virtual line separator in the one line format.
It is has always been that way, and programs parse the oneline output.

Re: nfp bpf offload add/replace

2017-09-07 Thread Jakub Kicinski

On Thu, 7 Sep 2017 16:05:03 +0200, Jiri Pirko wrote:
> Thu, Sep 07, 2017 at 03:44:12PM CEST, kubak...@wp.pl wrote:
> >On Thu, 7 Sep 2017 11:10:33 +0200, Jiri Pirko wrote:  
> >> Hi Kuba.
> >> 
> >> I'm looking into cls_bpf code and nfp_net_bpf_offload function in your
> >> driver. Why do you need TC_CLSBPF_ADD? Seems like TC_CLSBPF_REPLACE
> >> should be enough. It would make the cls_bpf code easier.
> >>
> >> Note that other cls just have replace/destroy (u32 too, as drivers
> >> handle NEW/REPLACE in one switch-case - will patch this).  
> >
> >Could we clarify what the REPLACE is actually supposed to do?  :)
> >
> >In the flower code and the REPLACE looks a lot like ADD on the
> >surface...  If change is called it will invoke REPLACE with the new
> >filter and then if there was an old filter, it will do DELETE.  Is my
> >understanding correct?  
> 
> Yes, correct.
> 
> >
> >If so I found this model of operation somehow confusing.  Plus the
> >management of flows may get slightly tricky if there is a possibility of
> >"replacing" a flow with an identical one.  Flower may make calls like
> >these:
> >
> >add flower vlan_id 100 action ...
> ># REPLACE vid 100 ...
> >change ... flower vlan_id 100 action ...
> ># REPLACE vid 100 ...
> ># DELETE  vid 100 ...  
> 
> Yes, that is the flow.
> 
> >
> >Doesn't this force driver/HW to implement refcounting on the rules?  
> 
> Why do you think so? There is a cookie that is passed from flower down
> and driver uses it to remove the entry.

Right, the key/mask combination doesn't have to be unique anyway...

> >On why I need the replace - BPF unlike other classifiers usually
> >installs a single program, I think offloading multiple TC filters is
> >questionable (people will use tailcalls instead most likely).  I want to
> >be able to implement atomic replace of that single program (i.e. not ADD
> >followed by DELETE) because that simplifies the driver quite a bit.  
> 
> Understood. So, looks like the REPLACE/DESTROY would be sufficient for
> bpf. ADD is not needed as it can be done by REPLACE-NULL, right?

Yes, or you could take it to the extreme ;)
 DESTROY == offload(old, NULL)
 ADD == offload(NULL, new)
 REPLACE == offload(obj, new)

> On the other hand, the rest of the cls, namely flower, u32 and matchall
> need ADD/DESTROY as they don't really do no replacing.
> 
> Makes sense?

Ack, if you're unifying things, I don't mind how things are muxed as
long as atomic replace is possible.

FWIW cls_bpf doesn't pass old prog in REPLACE right now, but I have
patch to add it anyway since it simplifies the driver when maps are
involved.  I should probably stop looking at the .command completely,
just rely on new/old programs being populated.

Re: [Intel-wired-lan] [RFC PATCH v3 0/6] Configuring traffic classes via new hardware offload mechanism in tc/mqprio

2017-09-07 Thread Shannon Nelson


On 9/7/2017 4:00 AM, Amritha Nambiar wrote:

The following series introduces a new hardware offload mode in
tc/mqprio where the TCs, the queue configurations and
bandwidth rate limits are offloaded to the hardware. The existing
mqprio framework is extended to configure the queue counts and
layout and also added support for rate limiting. This is achieved
through new netlink attributes for the 'mode' option which takes
values such as 'dcb' (default) and 'channel' and a 'shaper' option
for QoS attributes such as bandwidth rate limits in hw mode 1.
Legacy devices can fall back to the existing setup supporting hw mode
1 without these additional options where only the TCs are offloaded
and then the 'mode' and 'shaper' options defaults to DCB support.
The i40e driver enables the new mqprio hardware offload mechanism
factoring the TCs, queue configuration and bandwidth rates by
creating HW channel VSIs.

In this new mode, the priority to traffic class mapping and the
user specified queue ranges are used to configure the traffic
class when the 'mode' option is set to 'channel'. This is achieved by
creating HW channels(VSI). A new channel is created for each of the
traffic class configuration offloaded via mqprio framework except for
the first TC (TC0) which is for the main VSI. TC0 for the main VSI is
also reconfigured as per user provided queue parameters. Finally,
bandwidth rate limits are set on these traffic classes through the
shaper attribute by sending these rates in addition to the number of
TCs and the queue configurations.

Example:
 # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\
   queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\
   min_rate 1Gbit 2Gbit max_rate 4Gbit 5Gbit

 To dump the bandwidth rates:

 # tc qdisc show dev eth0

 qdisc mqprio 804a: root  tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
  queues:(0:3) (4:7)
  mode:channel
  shaper:bw_rlimit   min_rate:1Gbit 2Gbit   max_rate:4Gbit 5Gbit

---

Amritha Nambiar (6):
   mqprio: Introduce new hardware offload mode and shaper in mqprio
   i40e: Add macro for PF reset bit
   i40e: Add infrastructure for queue channel support
   i40e: Enable 'channel' mode in mqprio for TC configs
   i40e: Refactor VF BW rate limiting
   i40e: Add support setting TC max bandwidth rates



It would be nice to know what has changed since the last review, either 
summarized here or in the individual patch files.  This helps in knowing 
how much attention should be given to this new set of patches, and 
encourages further review.  I don't remember seeing any responses to my 
previous comments, and it looks like not all of them were acted upon.


sln



  drivers/net/ethernet/intel/i40e/i40e.h |   44 +
  drivers/net/ethernet/intel/i40e/i40e_debugfs.c |3
  drivers/net/ethernet/intel/i40e/i40e_ethtool.c |8
  drivers/net/ethernet/intel/i40e/i40e_main.c| 1463 +---
  drivers/net/ethernet/intel/i40e/i40e_txrx.h|2
  drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   50 -
  include/net/pkt_cls.h  |9
  include/uapi/linux/pkt_sched.h |   32
  net/sched/sch_mqprio.c |  183 ++-
  9 files changed, 1551 insertions(+), 243 deletions(-)

--
___
Intel-wired-lan mailing list
intel-wired-...@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

Re: iwlwifi: mvm: only send LEDS_CMD when the FW supports it

2017-09-07 Thread Kalle Valo

Luciano Coelho  wrote:

> From: Luca Coelho 
> 
> The LEDS_CMD command is only supported in some newer FW versions
> (e.g. iwlwifi-8000C-31.ucode), so we can't send it to older versions
> (such as iwlwifi-8000C-27.ucode).
> 
> To fix this, check for a new bit in the FW capabilities TLV that tells
> when the command is supported.
> 
> Note that the current version of -31.ucode in linux-firmware.git
> (31.532993.0) does not have this capability bit set, so the LED won't
> work, even though this version should support it.  But we will update
> this firmware soon, so it won't be a problem anymore.
> 
> Fixes: 7089ae634c50 ("iwlwifi: mvm: use firmware LED command where 
> applicable")
> Reported-by: Linus Torvalds 
> Signed-off-by: Luca Coelho 

Patch applied to wireless-drivers.git, thanks.

2eabc84d2f8e iwlwifi: mvm: only send LEDS_CMD when the FW supports it

-- 
https://patchwork.kernel.org/patch/9941719/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: pull-request: mac80211 2017-09-07

2017-09-07 Thread David Miller

From: Johannes Berg 
Date: Thu,  7 Sep 2017 09:09:38 +0200

> During my long absence some things have accumulated, but there wasn't
> actually all that much that could've gone into the last cycle, and a
> fix or two was taken care of by others.
> 
> The most important thing here is probably the deadlock fix that a few
> people have run into on 4.13, but that was only identified now, and
> perhaps the 40 MHz fix from Emmanuel that helps avoid iwlwifi firmware
> crashes.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks.

Re: [PATCH] iwlwifi: mvm: only send LEDS_CMD when the FW supports it

2017-09-07 Thread Kalle Valo

Linus Torvalds  writes:

> On Thu, Sep 7, 2017 at 5:39 AM, Kalle Valo  wrote:
>>
>> Linus, do you want to apply this directly or should we take it via the
>> normal route (wireless-drivers -> net)? If your prefer the latter when
>> I'm planning to submit this to Dave in a day or two and expecting it to
>> get to your tree in about a week, depending of course what is Dave's
>> schedule.
>
> Since we have a workaround for the problem, let's just go through the
> regular channels. As long as I get the fix through David before the
> merge window closes, I'm happy.

Ok, I'll aim to send the pull request to Dave tomorrow.

-- 
Kalle Valo

Re: [RFC net-next 0/5] TSN: Add qdisc-based config interfaces for traffic shapers

2017-09-07 Thread Henrik Austad

On Thu, Sep 07, 2017 at 05:53:15PM +0200, Richard Cochran wrote:
> On Thu, Sep 07, 2017 at 05:27:51PM +0200, Henrik Austad wrote:
> > On Thu, Sep 07, 2017 at 02:40:18PM +0200, Richard Cochran wrote:
> > And if you want to this driver to act as a bridge, how do you accomodate 
> > change in network requirements? (i.e. how does this work with switchdev?)
> 
> To my understanding, this Qdisc idea provides QoS for the host's
> transmitted traffic, and nothing more.

Ok, then we're on the same page.

> > - Or am I overthinking this?
> 
> Being able to configure the external ports of a switchdev is probably
> a nice feature, but that is another story.  (But maybe I misunderstood
> the authors' intent!)

ok, chalk that one up for later perhaps

> > If you have more than 1 application in userspace that wants to send data 
> > using this scheduler, how do you ensure fair transmission of frames? (both 
> > how much bandwidth they use,
> 
> There are many ways to handle this, and we shouldn't put any of that
> policy into the kernel.  For example, there might be a monolithic
> application with configurable threads, or an allocation server that
> grants bandwidth to applications via IPC, or a multiplexing stream
> server like jack, pulse, etc, and so on...

true

> > but also ordering of frames from each application)
> 
> Not sure what you mean by this.

Fair enough, I'm not that good at making myself clear :)

Let's see if I can make a better attempt:

If you have 2 separate applications that have their own streams going to 
different endpoints - but both are in the same class, then they will 
share the qdisc bandwidth.

So application 
- A sends frame A1, A2, A3, .. An
- B sends B1, B2, .. Bn

What I was trying to describe was: if application A send 2 frames, and B 
sends 2 frames at the same time, then you would hope that the order would 
be A1, B1, A2, B2, and not A1, A2, B1, B2.

None of this would be a problem if you expect a *single* user, like the 
allocation server you described above. Again, I think this is just me 
overthinking the problem right now :)

> > Do you expect all of this to be handled in userspace?
> 
> Yes, I do.

ok, fair enough

Thanks for answering my questions!

-- 
Henrik Austad


signature.asc
Description: PGP signature

Bad escapes in ip -online

2017-09-07 Thread John Kodis

The -online option to the 'ip link', 'ip addr', and perhaps others is 
putting out a backslash in places where a newline character should go, 
as so:


1: lo:  mtu 65536 qdisc noqueue state UNKNOWN mode 
DEFAULT group default qlen 1000\link/loopback 00:00:00:00:00:00 brd 
00:00:00:00:00:00


-- John.

Re: [PATCH] rtlwifi: btcoex: 23b 1ant: fix duplicated code for different branches

2017-09-07 Thread Gustavo A. R. Silva


Hi Larry,

On 08/30/2017 11:48 PM, Larry Finger wrote:

On 08/30/2017 08:42 AM, Gustavo A. R. Silva wrote:

Refactor code in order to avoid identical code for different branches.

This issue was detected with the help of Coccinelle.

Addresses-Coverity-ID: 1226788
Signed-off-by: Gustavo A. R. Silva 
---
This issue was reported by Coverity and it was tested by compilation
only.
I'm suspicious this may be a copy/paste error. Please, verify.

  .../net/wireless/realtek/rtlwifi/btcoexist/halbtc8723b1ant.c   | 10
++
  1 file changed, 2 insertions(+), 8 deletions(-)


This change is not correct. When bt_link_info->sco_exist is true, the
call should be

halbtc8723b1ant_limited_rx(btcoexist,
   NORMAL_EXEC, true,
   false, 0x5);

NACK

I will push the correct patch.



Great. Good to know.

Thanks
--
Gustavo A. R. Silva

Re: [PATCH] iwlwifi: mvm: only send LEDS_CMD when the FW supports it

2017-09-07 Thread Linus Torvalds

On Thu, Sep 7, 2017 at 5:39 AM, Kalle Valo  wrote:
>
> Linus, do you want to apply this directly or should we take it via the
> normal route (wireless-drivers -> net)? If your prefer the latter when
> I'm planning to submit this to Dave in a day or two and expecting it to
> get to your tree in about a week, depending of course what is Dave's
> schedule.

Since we have a workaround for the problem, let's just go through the
regular channels. As long as I get the fix through David before the
merge window closes, I'm happy.

  Linus

Re: [RFC net-next 0/5] TSN: Add qdisc-based config interfaces for traffic shapers

2017-09-07 Thread Richard Cochran

On Thu, Sep 07, 2017 at 05:27:51PM +0200, Henrik Austad wrote:
> On Thu, Sep 07, 2017 at 02:40:18PM +0200, Richard Cochran wrote:
> And if you want to this driver to act as a bridge, how do you accomodate 
> change in network requirements? (i.e. how does this work with switchdev?)

To my understanding, this Qdisc idea provides QoS for the host's
transmitted traffic, and nothing more.

> - Or am I overthinking this?

Being able to configure the external ports of a switchdev is probably
a nice feature, but that is another story.  (But maybe I misunderstood
the authors' intent!)

> If you have more than 1 application in userspace that wants to send data 
> using this scheduler, how do you ensure fair transmission of frames? (both 
> how much bandwidth they use,

There are many ways to handle this, and we shouldn't put any of that
policy into the kernel.  For example, there might be a monolithic
application with configurable threads, or an allocation server that
grants bandwidth to applications via IPC, or a multiplexing stream
server like jack, pulse, etc, and so on...

> but also ordering of frames from each application)

Not sure what you mean by this.

> Do you expect all of this to be handled in userspace?

Yes, I do.

Thanks,
Richard

[PATCH] net: ethernet: ti: netcp_core: no need in netif_napi_del

2017-09-07 Thread Ivan Khoronzhuk

Don't remove rx_napi specifically just before free_netdev(),
it's supposed to be done in it and is confusing w/o tx_napi deletion.

Signed-off-by: Ivan Khoronzhuk 
---
Based on net-next/master

 drivers/net/ethernet/ti/netcp_core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/netcp_core.c 
b/drivers/net/ethernet/ti/netcp_core.c
index eb96a69..437d362 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -2145,7 +2145,6 @@ static void netcp_delete_interface(struct netcp_device 
*netcp_device,
 
of_node_put(netcp->node_interface);
unregister_netdev(ndev);
-   netif_napi_del(>rx_napi);
free_netdev(ndev);
 }
 
-- 
2.7.4

Re: [RFC net-next 0/5] TSN: Add qdisc-based config interfaces for traffic shapers

2017-09-07 Thread Henrik Austad

On Thu, Sep 07, 2017 at 02:40:18PM +0200, Richard Cochran wrote:
> On Thu, Sep 07, 2017 at 07:34:11AM +0200, Henrik Austad wrote:
> > Also, does this mean that when you create the qdisc, you have locked the 
> > bandwidth for the scheduler? Meaning, if I later want to add another 
> > stream that requires more bandwidth, I have to close all active streams, 
> > reconfigure the qdisc and then restart?
> 
> No, just allocate enough bandwidth to accomodate all of the expected
> streams.  The streams can start and stop at will.

Sure, that'll work.

And if you want to this driver to act as a bridge, how do you accomodate 
change in network requirements? (i.e. how does this work with switchdev?)
- Or am I overthinking this?

> > So my understanding of all of this is that you configure the *total* 
> > bandwith for each class when you load the qdisc and then let userspace 
> > handle the rest. Is this correct?
> 
> Nothing wrong with that.

Didn't mean to say it was wrong, just making sure I've understood the 
concept.

> > In my view, it would be nice if the qdisc had some notion about streams so 
> > that you could create a stream, feed frames to it and let the driver pace 
> > them out. (The fewer you queue, the shorter the delay). This will also 
> > allow you to enforce per-stream bandwidth restrictions. I don't see how you 
> > can do this here unless you want to do this in userspace.
> > 
> > Do you have any plans for adding support for multiplexing streams? If you 
> > have multiple streams, how do you enforce that one stream does not eat into 
> > the bandwidth of another stream? AFAIK, this is something the network must 
> > enforce, but I see no option of doing som here.
> 
> Please, lets keep this simple. 

Simple is always good

> Today we have exactly zero user space
> applications using this kind of bandwidth reservation.  The case of
> wanting the kernel to police individual stream usage does not exist,
> and probably never will.

That we have *zero* userspace applications today is probably related to the 
fact that we have exacatly *zero* drivers in the kernel that talks TSN :)

To rephrase a bit, what I'm worried about:

If you have more than 1 application in userspace that wants to send data 
using this scheduler, how do you ensure fair transmission of frames? (both 
how much bandwidth they use, but also ordering of frames from each 
application) Do you expect all of this to be handled in userspace?

> For serious TSN use cases, the bandwidth needed by each system and
> indeed the entire network will be engineered, and we can reasonably
> expect applications to cooperate in this regard.

yes.. that'll happen ;)

> Thanks,
> Richard

Don't get me wrong, I think it is great that others are working on this!
I'm just trying to fully understand the thought that have gone into this 
and how it is inteded to be used.

I'll get busy testing the code and wrapping my head around the different 
parameters.

-- 
Henrik Austad


signature.asc
Description: PGP signature

Re: [PATCH 0/2] 9p: Fixes for hard-to-hit bugs

2017-09-07 Thread Latchesar Ionkov

Acked-by: Latchesar Ionkov 

On Wed, Sep 6, 2017 at 8:59 AM, Tuomas Tynkkynen  wrote:
> These two patches fix two hard-to-hit (but really annoying) bugs in 9p.
> The first one was posted earlier in February (with one R-b), the second
> is a new one.
>
> Both of these have had soaking in NixOS distribution kernels for
> a couple of months with no ill effects.
>
> Tuomas Tynkkynen (2):
>   fs/9p: Compare qid.path in v9fs_test_inode
>   net/9p: Switch to wait_event_killable()
>
>  fs/9p/vfs_inode.c  |  3 +++
>  fs/9p/vfs_inode_dotl.c |  3 +++
>  net/9p/client.c|  3 +--
>  net/9p/trans_virtio.c  | 13 ++---
>  net/9p/trans_xen.c |  4 ++--
>  5 files changed, 15 insertions(+), 11 deletions(-)
>
> --
> 2.13.0
>

Re: [V2 PATCH net-next 2/2] xdp: catch invalid XDP_REDIRECT API usage

2017-09-07 Thread Daniel Borkmann


On 09/07/2017 04:13 PM, Daniel Borkmann wrote:
[...]

+uint64_t addr = (unsigned long)prog;


And of course: s/uint64_t/u64/. Lack of context switch. ;-)

1 2 >

1 - 100 of 169 matches

Mail list logo