Re: rsi: Delete unnecessary variable initialisations in rsi_send_mgmt_pkt()

2016-01-05 Thread SF Markus Elfring
> That said, if you figure out some change that produces significant
> reductions in code or binary size on multiple architectures without
> making things more complicated, less readable or making the code or
> binary size larger, then by all means propose it.

Are you looking also for "a proof" that such changes are worthwhile?


> "This makes things smaller" carries much more weight than
> "I think this is better".

Can the discussed implementation of a function like "rsi_send_mgmt_pkt"
become a bit smaller by the deletion of extra variable initialisations?


> Almost all of the changes you've proposed that have seen any
> discussion whatsoever fall into the latter category.

Thanks for your interesting feedback.

Can a further constructive dialogue evolve from the presented information?

Regards,
Markus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH (net-next.git) 04/18] stmmac: remove modulo in stmmac_xmit()

2016-01-05 Thread Giuseppe CAVALLARO

Hi David

On 1/5/2016 4:34 AM, David Miller wrote:

From: Giuseppe Cavallaro 
Date: Mon, 4 Jan 2016 14:06:49 +0100


@@ -2056,7 +2068,10 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, 
struct net_device *dev)
priv->hw->desc->set_tx_owner(first);
wmb();

-   priv->cur_tx++;
+   if (++entry >= txsize)
+   entry = 0;


You are doing this over and over again, encapsulate it into a helper
like "NEXT_TX(x)" or similar.

Also, this is just fundamentally completely stupid.  Enforce the ring


This is not entirely gentle, but I agree with the final advice and I will
fix that asap.


thanks for the review.

peppe


size to be a power-of-2, then you can just go "x + 1 & (size - 1)" and
not even have the conditional statement.

Thanks.
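
The power-of-2 suggestion above can be sketched as follows. This is a minimal userspace illustration, not the stmmac code: TX_RING_SIZE, next_tx_cond() and ring_size_ok() are hypothetical names, and NEXT_TX() is only the helper name proposed in the review.

```c
/* Hypothetical ring size; the mask trick requires a power of two. */
#define TX_RING_SIZE 256u

/* Conditional wrap, as in the posted patch; works for any ring size. */
static inline unsigned int next_tx_cond(unsigned int entry, unsigned int size)
{
	if (++entry >= size)
		entry = 0;
	return entry;
}

/* Mask-based wrap; only correct when size is a power of two. */
#define NEXT_TX(x, size) (((x) + 1) & ((size) - 1))

/* Check that a ring size is usable with NEXT_TX(). */
static inline int ring_size_ok(unsigned int size)
{
	return size != 0 && (size & (size - 1)) == 0;
}
```

In C, "x + 1 & (size - 1)" already parses as "(x + 1) & (size - 1)" because + binds tighter than &; the explicit parentheses in the macro only make that visible.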





Re: [PATCH (net-next.git) 09/18] stmmac: optimize tx desc management

2016-01-05 Thread Giuseppe CAVALLARO

On 1/5/2016 4:39 AM, David Miller wrote:

From: Giuseppe Cavallaro 
Date: Mon, 4 Jan 2016 14:06:54 +0100


@@ -334,12 +334,11 @@ struct stmmac_desc_ops {

/* Invoked by the xmit function to prepare the tx descriptor */
void (*prepare_tx_desc) (struct dma_desc *p, int is_fs, int len,
-int csum_flag, int mode);
+int csum_flag, int mode, int tx_own,
+bool ls_ic);


I don't understand how you can get the type right and use 'bool' for
'ls_ic' but then incorrectly use an int for 'tx_own'.



sure I will fix it too.

thanks


Re: [PATCH (net-next.git) 01/18] stmmac: share reset function between dwmac100 and dwmac1000

2016-01-05 Thread Giuseppe CAVALLARO

On 1/5/2016 4:25 AM, David Miller wrote:

From: Giuseppe Cavallaro 
Date: Mon, 4 Jan 2016 14:06:46 +0100


@@ -376,7 +376,8 @@ extern const struct stmmac_desc_ops ndesc_ops;
  /* Specific DMA helpers */
  struct stmmac_dma_ops {
/* DMA core initialization */
-   int (*init) (void __iomem *ioaddr, int pbl, int fb, int mb,
+   int (*reset)(void __iomem *ioaddr);
+   void (*init)(void __iomem *ioaddr, int pbl, int fb, int mb,
 int burst_len, u32 dma_tx, u32 dma_rx, int atds);


Since you change the return type of the 'init' method, and this
changes the column of the opening parenthesis, you have to fix the
indentation of the argument list on the next line.



hmm, lines are well aligned.

I will check again, in case I introduced some indentation problem.

peppe



[PATCH 1/1 v2] include/uapi/linux/sockios.h: mark SIOCRTMSG unused

2016-01-05 Thread Heinrich Schuchardt
IOCTL SIOCRTMSG does nothing but return EINVAL.

So comment it as unused.

SIOCRTMSG is only used in:
* net/ipv4/af_inet.c
* include/uapi/linux/sockios.h

inet_ioctl calls ip_rt_ioctl.
ip_rt_ioctl only handles SIOCADDRT and SIOCDELRT and returns -EINVAL
otherwise.

Signed-off-by: Heinrich Schuchardt 
---
 include/uapi/linux/sockios.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/sockios.h b/include/uapi/linux/sockios.h
index e888b1a..8e7890b 100644
--- a/include/uapi/linux/sockios.h
+++ b/include/uapi/linux/sockios.h
@@ -27,7 +27,7 @@
 /* Routing table calls. */
 #define SIOCADDRT  0x890B  /* add routing table entry  */
 #define SIOCDELRT  0x890C  /* delete routing table entry   */
-#define SIOCRTMSG  0x890D  /* call to routing system   */
+#define SIOCRTMSG  0x890D  /* unused   */
 
 /* Socket configuration controls. */
 #define SIOCGIFNAME0x8910  /* get iface name   */
-- 
2.1.4
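
The dispatch described in the commit message can be modelled as below. This is a userspace sketch under assumptions, not the kernel source: model_ip_rt_ioctl() and model_route_update() are hypothetical stand-ins, and only the three ioctl numbers are taken from the hunk above.

```c
#include <errno.h>

/* Values copied from the sockios.h hunk above. */
#define SIOCADDRT 0x890B
#define SIOCDELRT 0x890C
#define SIOCRTMSG 0x890D

/* Hypothetical stand-in for the real routing-table update. */
static int model_route_update(unsigned int cmd)
{
	(void)cmd;
	return 0;
}

/* Model of the dispatch: only add/delete are handled. */
static int model_ip_rt_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case SIOCADDRT:
	case SIOCDELRT:
		return model_route_update(cmd);
	}
	/* Everything else, including SIOCRTMSG, is rejected. */
	return -EINVAL;
}
```

With this shape SIOCRTMSG never reaches a handler, which is why the patch only changes the comment and keeps the number reserved.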



cobalt ring

2016-01-05 Thread Tom
Dear Sir or Madam

We have some items in stock. If you are a retailer, this may suit you: no
MOQ requirement and prompt shipment.

If you are interested, please feel free to contact us.

Best wishes,

Tom

Re: [PATCH (net-next.git) 02/18] stmmac: rework DMA bus setting and introduce new platform AXI structure

2016-01-05 Thread Giuseppe CAVALLARO

On 1/5/2016 4:29 AM, David Miller wrote:

From: Giuseppe Cavallaro 
Date: Mon, 4 Jan 2016 14:06:47 +0100


@@ -81,7 +81,7 @@ static void stmmac_default_data(struct plat_stmmacenet_data 
*plat)
plat->mdio_bus_data->phy_mask = 0;

plat->dma_cfg->pbl = 32;
-   plat->dma_cfg->burst_len = DMA_AXI_BLEN_256;
+   /* TODO: AXI */

/* Set default value for multicast hash bins */
plat->multicast_filter_bins = HASH_TABLE_SIZE;
@@ -115,8 +115,8 @@ static int quark_default_data(struct plat_stmmacenet_data 
*plat,
plat->mdio_bus_data->phy_mask = 0;

plat->dma_cfg->pbl = 16;
-   plat->dma_cfg->burst_len = DMA_AXI_BLEN_256;
plat->dma_cfg->fixed_burst = 1;
+   /* AXI (TODO) */

/* Set default value for multicast hash bins */
plat->multicast_filter_bins = HASH_TABLE_SIZE;


Isn't this going to cause a regression for some things?


Trying to reconstruct the history of this setting, I understand it
was added to align a configuration, not to fix some
known problem. Indeed, I do not see any issue on my side with
the patch applied. My understanding is that, when we adopt "fixed burst
length", it is safer to use the default burst length instead of
tuning it to the maximum value. I saw the same on the platform driver,
where playing with the AXI parameters helped performance only in
some cases.

Certainly, if somebody sees different behavior I can quickly rearrange the
code to keep the previous setting or complete the AXI management for the PCI
driver (providing a default setup).

Let me know what you think.

peppe


Re: [PATCH v4 net-next 3/4] soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF

2016-01-05 Thread Daniel Borkmann

On 01/04/2016 11:41 PM, Craig Gallek wrote:

From: Craig Gallek 

Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group.  These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.

This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.

Signed-off-by: Craig Gallek 

[...]

+static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
+   struct bpf_prog *prog, struct sk_buff *skb,
+   int hdr_len)
+{
+   struct sk_buff *nskb = NULL;
+   u32 index;
+
+   if (skb_shared(skb)) {
+   nskb = skb_clone(skb, GFP_ATOMIC);
+   if (!nskb)
+   return NULL;
+   skb = nskb;
+   }
+
+   /* temporarily advance data past protocol header */
+   if (!pskb_pull(skb, hdr_len)) {
+   consume_skb(nskb);


Btw, this one could still be made kfree_skb() to indicate an error condition here.


+   return NULL;
+   }
+   index = bpf_prog_run_save_cb(prog, skb);
+   __skb_push(skb, hdr_len);
+
+   consume_skb(nskb);
+
+   if (index >= socks)
+   return NULL;
+
+   return reuse->socks[index];
+}



[PATCH net] bridge: Only call /sbin/bridge-stp for the initial network namespace

2016-01-05 Thread Hannes Frederic Sowa
[I stole this patch from Eric Biederman. He wrote:]

> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.

[Hannes: changed patch using netns_eq]

Cc: Eric W. Biederman 
Signed-off-by: Eric W. Biederman 
Signed-off-by: Hannes Frederic Sowa 
---
 net/bridge/br_stp_if.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 905cedd37a0539..a31ac6ad76a223 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -143,7 +143,10 @@ static void br_stp_start(struct net_bridge *br)
char *envp[] = { NULL };
struct net_bridge_port *p;
 
-   r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
+   if (net_eq(dev_net(br->dev), &init_net))
+   r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
+   else
+   r = -ENOENT;
 
spin_lock_bh(&br->lock);
 
-- 
2.5.0



Re: rsi: Delete unnecessary variable initialisations in rsi_send_mgmt_pkt()

2016-01-05 Thread Julian Calaby
Hi Markus,

On Tue, Jan 5, 2016 at 7:29 PM, SF Markus Elfring
 wrote:
>> That said, if you figure out some change that produces significant
>> reductions in code or binary size on multiple architectures without
>> making things more complicated, less readable or making the code or
>> binary size larger, then by all means propose it.
>
> Are you looking also for "a proof" that such changes are worthwhile?

It'd be better than "I think doing things this way is better", which
is the hallmark of most of your patch sets. (Admittedly not this one,
but this one is where the discussion is now, so that's where we're
discussing it.)

>> "This makes things smaller" carries much more weight than
>> "I think this is better".
>
> Can the discussed implementation of a function like "rsi_send_mgmt_pkt"
> become a bit smaller by the deletion of extra variable initialisations?

I'm talking in general.

In this case you're asking people to review a patch which requires a
lot of careful review for a fairly minor improvement. I must also note
that you haven't CC'd the people who wrote this driver, so it's
possible that the only people who have reviewed it aren't experts in
the code.

The patches you sent recently which moved labels into if statements
were a clear case of "I think this is better" where any actual benefit
from the changes was eclipsed by the style and readability issues they
introduced.

>> Almost all of the changes you've proposed that have seen any
>> discussion whatsoever fall into the latter category.
>
> Thanks for your interesting feedback.

No problem.

> Can a further constructive dialogue evolve from the presented information?

Part of the issue here is that you don't seem to be listening to the
discussion of your patches, or if you are, you're not significantly
changing your approach or attitude in response.

Every time you send a set of patches, there are legitimate issues
which people raise, and every time they are discussed, you assert that
your patches improve things and seem to ignore the concerns people
raise.

I've seen this same pattern of discussion here with these patches,
with your patches to move labels into if statements, with the patches
you sent late June last year, your patches to remove conditions before
kfree() and friends, etc.

You need to change your attitude: just because you can see some benefit
from your patches doesn't mean others do and it doesn't mean that
they're willing to accept them.

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/


Re: [PATCH V5 9/9] hvsock: introduce Hyper-V VM Sockets feature

2016-01-05 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

Just some minor nitpicks below -- I have to admit I didn't test the feature.

[..skip..] 

> +
> + if (sk->sk_err) {
> + ret = -sk->sk_err;
> + goto out_wait_error;
> + } else {
> + ret = 0;
> + }
> +
> +out_wait:
> + finish_wait(sk_sleep(sk), &wait);
> +out:
> + release_sock(sk);
> + return ret;
> +
> +out_wait_error:
> + sk->sk_state = SS_UNCONNECTED;
> + sock->state = SS_UNCONNECTED;
> + goto out_wait;
> +}

Why not just place out_wait_error label before out_wait (and do 'goto
out_wait' in ret = 0 case instead of 'goto out_wait_error' in the error
case)?
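
The label ordering suggested above could look roughly like this. It is a userspace sketch only: struct model_sock, its fields, the timed_out path and the MODEL_SS_* values are hypothetical stand-ins for the skipped parts of the patch, and the comments mark where finish_wait() and release_sock() would sit.

```c
#include <errno.h>

/* Hypothetical stand-ins for the socket states used in the quoted code. */
enum { MODEL_SS_UNCONNECTED = 1, MODEL_SS_CONNECTED = 3 };

struct model_sock {
	int timed_out;	/* hypothetical earlier failure */
	int sk_err;	/* pending error as a positive errno value */
	int state;
	int released;	/* set by the common exit path */
};

static int model_connect_tail(struct model_sock *sk)
{
	int ret;

	if (sk->timed_out) {
		ret = -ETIMEDOUT;
		goto out_wait_error;
	}

	if (!sk->sk_err) {
		ret = 0;
		goto out_wait;	/* success skips the error fixup */
	}
	ret = -sk->sk_err;	/* error falls through to the fixup */

out_wait_error:
	sk->state = MODEL_SS_UNCONNECTED;
out_wait:
	/* finish_wait() and release_sock() in the real function */
	sk->released = 1;
	return ret;
}
```

The error fixup now runs before the common exit, so there is no backward goto from the end of the function.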

[..skip..]

> +
> +static int __init hvsock_init(void)
> +{
> + int ret;
> +
> + /* Hyper-V socket requires at least VMBus 4.0 */
> + if ((vmbus_proto_version >> 16) < 4) {
> + pr_err("failed to load: VMBus 4 or later is required\n");
> + return -ENODEV;

(Let me pretend I'm Dan :-) So here we return ...

> + }
> +
> + ret = vmbus_driver_register(&hvsock_drv);
> + if (ret) {
> + pr_err("failed to register hv_sock driver\n");
> + goto out;

... and here we goto where we just return. I suggest we bring some
consistency by directly returning ret here and eliminating the 'out' label.

> + }
> +
> + ret = proto_register(&hvsock_proto, 0);
> + if (ret) {
> + pr_err("failed to register protocol\n");
> + goto unreg_hvsock_drv;
> + }
> +
> + ret = sock_register(&hvsock_family_ops);
> + if (ret) {
> + pr_err("failed to register address family\n");
> + goto unreg_proto;
> + }
> +
> + return 0;
> +
> +unreg_proto:
> + proto_unregister(&hvsock_proto);
> +unreg_hvsock_drv:
> + vmbus_driver_unregister(&hvsock_drv);
> +out:
> + return ret;
> +}
> +
> +static void __exit hvsock_exit(void)
> +{
> + sock_unregister(AF_HYPERV);
> + proto_unregister(&hvsock_proto);
> + vmbus_driver_unregister(&hvsock_drv);
> +}
> +
> +module_init(hvsock_init);
> +module_exit(hvsock_exit);
> +
> +MODULE_DESCRIPTION("Microsoft Hyper-V Virtual Socket Family");
> +MODULE_VERSION("0.1");

Do we really need it? When the driver is committed we probably won't be
updating it with v0.2 as a whole, we'll be sending patches addressing
issues and there always will be a question when to switch to 0.2, 0.3,
... And we don't have MODULE_VERSION for other Hyper-V drivers.

> +MODULE_LICENSE("Dual BSD/GPL");

-- 
  Vitaly
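
The init-path suggestion (return directly where nothing needs unwinding, drop the 'out' label) might be sketched like this. It is a userspace model: model_register(), model_unregister(), fail_step and registered are hypothetical test scaffolding, and the comments only indicate which kernel call each step stands for.

```c
#include <errno.h>

static int fail_step;	/* hypothetical knob: step that fails, 0 = none */
static int registered;	/* bitmask of steps currently registered */

/* Hypothetical stub: returns 0 on success or a negative errno. */
static int model_register(int step)
{
	if (fail_step == step)
		return -ENOMEM;
	registered |= 1 << step;
	return 0;
}

static void model_unregister(int step)
{
	registered &= ~(1 << step);
}

static int model_init(int proto_major)
{
	int ret;

	if (proto_major < 4)
		return -ENODEV;

	ret = model_register(1);	/* vmbus_driver_register() */
	if (ret)
		return ret;		/* nothing to unwind yet */

	ret = model_register(2);	/* proto_register() */
	if (ret)
		goto unreg_drv;

	ret = model_register(3);	/* sock_register() */
	if (ret)
		goto unreg_proto;

	return 0;

unreg_proto:
	model_unregister(2);
unreg_drv:
	model_unregister(1);
	return ret;
}
```

Only the paths that have something to undo use a goto, so both early failures read the same way.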


[PATCH net-next v1 1/2] tipc: move netlink policies to netlink.h

2016-01-05 Thread Richard Alpe
Make the c files less cluttered and enable netlink attributes to be
shared between files. This will prove useful in a future patch where a
node message will contain a nested network.

Signed-off-by: Richard Alpe 
Acked-by: Jon Maloy 
---
 net/tipc/bearer.c | 19 +---
 net/tipc/link.c   |  8 -
 net/tipc/name_table.c |  7 +
 net/tipc/net.c|  6 +---
 net/tipc/netlink.c| 13 +---
 net/tipc/netlink.h| 85 +++
 net/tipc/node.c   | 23 +-
 net/tipc/socket.c |  9 +-
 net/tipc/udp_media.c  |  9 +-
 9 files changed, 92 insertions(+), 87 deletions(-)

diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index 802ffad..d654cfe 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -40,6 +40,7 @@
 #include "link.h"
 #include "discover.h"
 #include "bcast.h"
+#include "netlink.h"
 
 #define MAX_ADDR_STR 60
 
@@ -53,24 +54,6 @@ static struct tipc_media * const media_info_array[] = {
 #endif
NULL
 };
-
-static const struct nla_policy
-tipc_nl_bearer_policy[TIPC_NLA_BEARER_MAX + 1] = {
-   [TIPC_NLA_BEARER_UNSPEC]= { .type = NLA_UNSPEC },
-   [TIPC_NLA_BEARER_NAME] = {
-   .type = NLA_STRING,
-   .len = TIPC_MAX_BEARER_NAME
-   },
-   [TIPC_NLA_BEARER_PROP]  = { .type = NLA_NESTED },
-   [TIPC_NLA_BEARER_DOMAIN]= { .type = NLA_U32 }
-};
-
-static const struct nla_policy tipc_nl_media_policy[TIPC_NLA_MEDIA_MAX + 1] = {
-   [TIPC_NLA_MEDIA_UNSPEC] = { .type = NLA_UNSPEC },
-   [TIPC_NLA_MEDIA_NAME]   = { .type = NLA_STRING },
-   [TIPC_NLA_MEDIA_PROP]   = { .type = NLA_NESTED }
-};
-
 static void bearer_disable(struct net *net, struct tipc_bearer *b);
 
 /**
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 0c2944f..cb807be 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -196,14 +196,6 @@ struct tipc_link {
 static const char *link_co_err = "Link tunneling error, ";
 static const char *link_rst_msg = "Resetting link ";
 
-/* Properties valid for media, bearar and link */
-static const struct nla_policy tipc_nl_prop_policy[TIPC_NLA_PROP_MAX + 1] = {
-   [TIPC_NLA_PROP_UNSPEC]  = { .type = NLA_UNSPEC },
-   [TIPC_NLA_PROP_PRIO]= { .type = NLA_U32 },
-   [TIPC_NLA_PROP_TOL] = { .type = NLA_U32 },
-   [TIPC_NLA_PROP_WIN] = { .type = NLA_U32 }
-};
-
 /* Send states for broadcast NACKs
  */
 enum {
diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index 91fce70..75992b5 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -44,15 +44,10 @@
 #include "addr.h"
 #include "node.h"
 #include 
+#include "netlink.h"
 
 #define TIPC_NAMETBL_SIZE 1024 /* must be a power of 2 */
 
-static const struct nla_policy
-tipc_nl_name_table_policy[TIPC_NLA_NAME_TABLE_MAX + 1] = {
-   [TIPC_NLA_NAME_TABLE_UNSPEC]= { .type = NLA_UNSPEC },
-   [TIPC_NLA_NAME_TABLE_PUBL]  = { .type = NLA_NESTED }
-};
-
 /**
  * struct name_info - name sequence publication info
  * @node_list: circular list of publications made by own node
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 77bf911..d7a5c11 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -41,11 +41,7 @@
 #include "socket.h"
 #include "node.h"
 #include "bcast.h"
-
-static const struct nla_policy tipc_nl_net_policy[TIPC_NLA_NET_MAX + 1] = {
-   [TIPC_NLA_NET_UNSPEC]   = { .type = NLA_UNSPEC },
-   [TIPC_NLA_NET_ID]   = { .type = NLA_U32 }
-};
+#include "netlink.h"
 
 /*
  * The TIPC locking policy is designed to ensure a very fine locking
diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c
index 8975b01..234cb93 100644
--- a/net/tipc/netlink.c
+++ b/net/tipc/netlink.c
@@ -42,18 +42,7 @@
 #include "node.h"
 #include "net.h"
 #include 
-
-static const struct nla_policy tipc_nl_policy[TIPC_NLA_MAX + 1] = {
-   [TIPC_NLA_UNSPEC]   = { .type = NLA_UNSPEC, },
-   [TIPC_NLA_BEARER]   = { .type = NLA_NESTED, },
-   [TIPC_NLA_SOCK] = { .type = NLA_NESTED, },
-   [TIPC_NLA_PUBL] = { .type = NLA_NESTED, },
-   [TIPC_NLA_LINK] = { .type = NLA_NESTED, },
-   [TIPC_NLA_MEDIA]= { .type = NLA_NESTED, },
-   [TIPC_NLA_NODE] = { .type = NLA_NESTED, },
-   [TIPC_NLA_NET]  = { .type = NLA_NESTED, },
-   [TIPC_NLA_NAME_TABLE]   = { .type = NLA_NESTED, }
-};
+#include "netlink.h"
 
 /* Users of the legacy API (tipc-config) can't handle that we add operations,
  * so we have a separate genl handling for the new API.
diff --git a/net/tipc/netlink.h b/net/tipc/netlink.h
index 08a1db6..ff7a39da 100644
--- a/net/tipc/netlink.h
+++ b/net/tipc/netlink.h
@@ -36,7 +36,92 @@
 #ifndef _TIPC_NETLINK_H
 #define _TIPC_NETLINK_H
 
+#include 
+
 extern struct genl_family tipc_genl_family;
+
+static const struct nla_policy tipc_nl_policy[TIPC_NLA_MAX + 1] = {
+   [TIPC_NLA_

[PATCH net-next v1 2/2] tipc: add peer removal functionality

2016-01-05 Thread Richard Alpe
Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
an offline peer node from the internal data structures.

This will be supported by the tipc user space tool in iproute2.

Signed-off-by: Richard Alpe 
Reviewed-by: Jon Maloy 
Acked-by: Ying Xue 
---
 include/uapi/linux/tipc_netlink.h |  1 +
 net/tipc/net.c|  2 +-
 net/tipc/netlink.c|  5 
 net/tipc/node.c   | 60 +++
 net/tipc/node.h   |  3 +-
 5 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/tipc_netlink.h 
b/include/uapi/linux/tipc_netlink.h
index d4c8f14..25eb645 100644
--- a/include/uapi/linux/tipc_netlink.h
+++ b/include/uapi/linux/tipc_netlink.h
@@ -56,6 +56,7 @@ enum {
TIPC_NL_NET_GET,
TIPC_NL_NET_SET,
TIPC_NL_NAME_TABLE_GET,
+   TIPC_NL_PEER_REMOVE,
 
__TIPC_NL_CMD_MAX,
TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1
diff --git a/net/tipc/net.c b/net/tipc/net.c
index d7a5c11..bc6b4a6 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -135,7 +135,7 @@ void tipc_net_stop(struct net *net)
  tn->own_addr);
rtnl_lock();
tipc_bearer_stop(net);
-   tipc_node_stop(net);
+   tipc_node_stop_net(net);
rtnl_unlock();
 
pr_info("Left network mode\n");
diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c
index 234cb93..87c3ffa 100644
--- a/net/tipc/netlink.c
+++ b/net/tipc/netlink.c
@@ -134,6 +134,11 @@ static const struct genl_ops tipc_genl_v2_ops[] = {
.cmd= TIPC_NL_NAME_TABLE_GET,
.dumpit = tipc_nl_name_table_dump,
.policy = tipc_nl_policy,
+   },
+   {
+   .cmd= TIPC_NL_PEER_REMOVE,
+   .doit   = tipc_nl_peer_rm,
+   .policy = tipc_nl_policy,
}
 };
 
diff --git a/net/tipc/node.c b/net/tipc/node.c
index ee6f93c..7334547 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -378,17 +378,21 @@ static void tipc_node_delete(struct tipc_node *node)
kfree_rcu(node, rcu);
 }
 
-void tipc_node_stop(struct net *net)
+static void tipc_node_stop(struct tipc_node *node)
+{
+   if (del_timer(&node->timer))
+   tipc_node_put(node);
+   tipc_node_put(node);
+}
+
+void tipc_node_stop_net(struct net *net)
 {
struct tipc_net *tn = net_generic(net, tipc_net_id);
struct tipc_node *node, *t_node;
 
spin_lock_bh(&tn->node_list_lock);
-   list_for_each_entry_safe(node, t_node, &tn->node_list, list) {
-   if (del_timer(&node->timer))
-   tipc_node_put(node);
-   tipc_node_put(node);
-   }
+   list_for_each_entry_safe(node, t_node, &tn->node_list, list)
+   tipc_node_stop(node);
spin_unlock_bh(&tn->node_list_lock);
 }
 
@@ -1508,6 +1512,50 @@ discard:
kfree_skb(skb);
 }
 
+int tipc_nl_peer_rm(struct sk_buff *skb, struct genl_info *info)
+{
+   int err;
+   u32 addr;
+   struct net *net = sock_net(skb->sk);
+   struct nlattr *attrs[TIPC_NLA_NET_MAX + 1];
+   struct tipc_net *tn = net_generic(net, tipc_net_id);
+   struct tipc_node *peer;
+
+   /* We identify the peer by its net */
+   if (!info->attrs[TIPC_NLA_NET])
+   return -EINVAL;
+
+   err = nla_parse_nested(attrs, TIPC_NLA_NET_MAX,
+  info->attrs[TIPC_NLA_NET],
+  tipc_nl_net_policy);
+   if (err)
+   return err;
+
+   if (!attrs[TIPC_NLA_NET_ADDR])
+   return -EINVAL;
+
+   addr = nla_get_u32(attrs[TIPC_NLA_NET_ADDR]);
+
+   spin_lock_bh(&tn->node_list_lock);
+   list_for_each_entry_rcu(peer, &tn->node_list, list) {
+   if (peer->addr != addr)
+   continue;
+
+   if (peer->state == SELF_DOWN_PEER_DOWN ||
+   peer->state == SELF_DOWN_PEER_LEAVING) {
+   tipc_node_stop(peer);
+
+   spin_unlock_bh(&tn->node_list_lock);
+   return 0;
+   }
+   spin_unlock_bh(&tn->node_list_lock);
+   return -EBUSY;
+   }
+   spin_unlock_bh(&tn->node_list_lock);
+
+   return -ENXIO;
+}
+
 int tipc_nl_node_dump(struct sk_buff *skb, struct netlink_callback *cb)
 {
int err;
diff --git a/net/tipc/node.h b/net/tipc/node.h
index f39d9d0..8dfb6ba 100644
--- a/net/tipc/node.h
+++ b/net/tipc/node.h
@@ -51,7 +51,7 @@ enum {
 #define TIPC_NODE_CAPABILITIES TIPC_BCAST_SYNCH
 #define INVALID_BEARER_ID -1
 
-void tipc_node_stop(struct net *net);
+void tipc_node_stop_net(struct net *net);
 void tipc_node_check_dest(struct net *net, u32 onode,
  struct tipc_bearer *bearer,
  u16 capabilities, u32 signature,
@@ -75,5 +75,6 @@ int tipc_nl_node_dump_link(struct sk_buff *skb, struct 
netlink_callback *cb);
 in

[PATCH iproute2 1/2] tipc: fix help text spelling error in node.c

2016-01-05 Thread Richard Alpe
---
 tipc/node.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tipc/node.c b/tipc/node.c
index 163fb74..201fe1a 100644
--- a/tipc/node.c
+++ b/tipc/node.c
@@ -245,7 +245,7 @@ static int cmd_node_get(struct nlmsghdr *nlh, const struct 
cmd *cmd,
 void cmd_node_help(struct cmdl *cmdl)
 {
fprintf(stderr,
-   "Usage: %s media COMMAND [ARGS] ...\n\n"
+   "Usage: %s node COMMAND [ARGS] ...\n\n"
"COMMANDS\n"
" list  - List remote nodes\n"
" get   - Get local node parameters\n"
-- 
2.1.4



Re: [RFC PATCH net-next 3/3] macsec: introduce IEEE 802.1AE driver

2016-01-05 Thread Paolo Abeni
On Mon, 2015-12-28 at 13:38 +0100, Sabrina Dubroca wrote:
> +#define MACSEC_SCI_LEN 8
> +
> +/* SecTAG length = macsec_eth_header without the optional SCI */
> +#define MACSEC_TAG_LEN 6
> +
> +struct macsec_eth_header {
> + struct ethhdr eth;
> + /* SecTAG */
> + u8  tci_an;
> +#if defined(__LITTLE_ENDIAN_BITFIELD)
> + u8  short_length:6,
> +   unused:2;
> +#elif defined(__BIG_ENDIAN_BITFIELD)
> + u8unused:2,
> + short_length:6;
> +#else
> +#error   "Please fix "
> +#endif
> + __be32 packet_number;
> + u8 secure_channel_id[8]; /* optional */
> +} __packed;

> +#define MACSEC_NEEDED_HEADROOM sizeof(struct macsec_eth_header)

The needed_headroom field does not need to include the hard_header_len,
which, for macsec devices, is set to ETH_HLEN by ether_setup().

Since on the xmit path you can push up to MACSEC_TAG_LEN + MACSEC_SCI_LEN +
sizeof(__be16) bytes on the skb head, that would possibly be a better
MACSEC_NEEDED_HEADROOM value.
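
The two candidate headroom values can be compared with a small userspace calculation. This is a sketch under assumptions: the model_* structs mirror the quoted layout with fixed-width types and the bitfield collapsed into one byte, and HEADROOM_AS_POSTED and HEADROOM_SUGGESTED are hypothetical names.

```c
#include <stdint.h>

#define MACSEC_SCI_LEN 8
#define MACSEC_TAG_LEN 6

/* Userspace mirror of struct ethhdr: two MAC addresses plus ethertype. */
struct model_ethhdr {
	uint8_t  h_dest[6];
	uint8_t  h_source[6];
	uint16_t h_proto;
} __attribute__((packed));

/* Mirror of the quoted header; the 6+2 bit bitfield occupies one byte. */
struct model_macsec_eth_header {
	struct model_ethhdr eth;
	uint8_t  tci_an;
	uint8_t  short_length;
	uint32_t packet_number;
	uint8_t  secure_channel_id[MACSEC_SCI_LEN];	/* optional */
} __attribute__((packed));

/* Value from the posted patch: the whole header, Ethernet part included. */
#define HEADROOM_AS_POSTED sizeof(struct model_macsec_eth_header)

/* Value suggested in the review: SecTAG + SCI + ethertype. */
#define HEADROOM_SUGGESTED \
	(MACSEC_TAG_LEN + MACSEC_SCI_LEN + sizeof(uint16_t))
```

With these definitions the posted value is 28 bytes and the suggested one is 16 bytes; the 12-byte difference is the two MAC addresses.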

> +static void macsec_count_tx(struct sk_buff *skb, struct macsec_tx_sc *tx_sc,
> + struct macsec_tx_sa *tx_sa)
> +{
> + struct pcpu_tx_sc_stats *txsc_stats = this_cpu_ptr(tx_sc->stats);
> +
> + u64_stats_update_begin(&txsc_stats->syncp);
> + if (tx_sc->encrypt) {
> + txsc_stats->stats.OutOctetsEncrypted += skb->len;
> + txsc_stats->stats.OutPktsEncrypted++;
> + this_cpu_inc(tx_sa->stats->OutPktsEncrypted);

The last line is probably a duplicate

> + } else {
> + txsc_stats->stats.OutOctetsProtected += skb->len;
> + txsc_stats->stats.OutPktsProtected++;
> + this_cpu_inc(tx_sa->stats->OutPktsProtected);

Same as above.

> + struct pcpu_secy_stats *secy_stats = 
> this_cpu_ptr(macsec->stats);
> +
> + if (macsec->secy.validate_frames == MACSEC_VALIDATE_STRICT) {
> + u64_stats_update_begin(&secy_stats->syncp);
> + secy_stats->stats.InPktsNoTag++;
> + u64_stats_update_end(&secy_stats->syncp);
> + continue;
> + }

Can the u64_stats_update block be replaced by a single:
this_cpu_inc(macsec->stats->InPktsNoTag) ?

There are a few other similar blocks below.

Thanks,

Paolo



Re: [PATCH net-next 1/5] sctp: add the rhashtable apis for sctp global transport hashtable

2016-01-05 Thread Xin Long
On Thu, Dec 31, 2015 at 1:41 AM, Marcelo Ricardo Leitner
 wrote:
> On Wed, Dec 30, 2015 at 11:50:46PM +0800, Xin Long wrote:
> ...
>> +void sctp_hash_transport(struct sctp_transport *t)
>> +{
>> + struct sctp_sockaddr_entry *addr;
>> + struct sctp_hash_cmp_arg arg;
>> +
>> + addr = list_entry(t->asoc->base.bind_addr.address_list.next,
>> +   struct sctp_sockaddr_entry, list);
>> + arg.laddr = &addr->a;
>> + arg.paddr = &t->ipaddr;
>> + arg.net   = sock_net(t->asoc->base.sk);
>> +
>> +reinsert:
>> + if (rhashtable_lookup_insert_key(&sctp_transport_hashtable, &arg,
>> +  &t->node, sctp_hash_params) == -EBUSY)
>> + goto reinsert;
>> +}
>
> This is the nasty situation I mentioned in previous email. It seems that
> a stress test can trigger a double rehash and cause an entry to not be
> added.
>
> This is in fact very near some bugs you caught on rhashtable in the past
> few days/couple of weeks tops.
>
> I'm actually against this loop as is. I may have not been clear with Xin
> about not adding my signature to the patchset due to this.
>
> Please take a look at Xin's emails on thread 'rhashtable: Prevent
> spurious EBUSY errors on insertion' about this particular situation.
> Cc'ing Herbert as he wanted to see the patches for that issue.
>
>   Marcelo
>
Without this 'reinsert' loop, we can reproduce the issue as follows:

1. download the attachment
2. cp sctperf.tar.gz to the server and client hosts
3. on each host:
#make
4. on the server:
#sh saddr.sh $ethx
#./ss
5. on the client:
#sh caddr.sh $ethx
#ulimit -n 2
#./cc

When the number of connections reaches about 1600, the issue is triggered.


sctperf.tar.gz
Description: GNU Zip compressed data


[PATCH iproute2 2/2] tipc: add peer remove functionality

2016-01-05 Thread Richard Alpe
This enables a user to remove an offline peer from the kernel data
structures. This could for example be useful when deliberately scaling
in peer nodes in a cloud environment.

Signed-off-by: Richard Alpe 
Reviewed-by: Jon Maloy 
Reviewed-by: Ying Xue 
---
 include/linux/tipc_netlink.h |  1 +
 man/man8/tipc-bearer.8   |  1 +
 man/man8/tipc-link.8 |  1 +
 man/man8/tipc-media.8|  1 +
 man/man8/tipc-nametable.8|  1 +
 man/man8/tipc-node.8 |  1 +
 man/man8/tipc-peer.8 | 52 +
 man/man8/tipc.8  |  1 +
 tipc/Makefile|  2 +-
 tipc/peer.c  | 93 
 tipc/peer.h  | 21 ++
 tipc/tipc.c  |  3 ++
 12 files changed, 177 insertions(+), 1 deletion(-)
 create mode 100644 man/man8/tipc-peer.8
 create mode 100644 tipc/peer.c
 create mode 100644 tipc/peer.h

diff --git a/include/linux/tipc_netlink.h b/include/linux/tipc_netlink.h
index d4c8f14..25eb645 100644
--- a/include/linux/tipc_netlink.h
+++ b/include/linux/tipc_netlink.h
@@ -56,6 +56,7 @@ enum {
TIPC_NL_NET_GET,
TIPC_NL_NET_SET,
TIPC_NL_NAME_TABLE_GET,
+   TIPC_NL_PEER_REMOVE,
 
__TIPC_NL_CMD_MAX,
TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1
diff --git a/man/man8/tipc-bearer.8 b/man/man8/tipc-bearer.8
index 50a1ed2..565ee01 100644
--- a/man/man8/tipc-bearer.8
+++ b/man/man8/tipc-bearer.8
@@ -218,6 +218,7 @@ Exit status is 0 if command was successful or a positive 
integer upon failure.
 .BR tipc-media (8),
 .BR tipc-nametable (8),
 .BR tipc-node (8),
+.BR tipc-peer (8),
 .BR tipc-socket (8)
 .br
 .SH REPORTING BUGS
diff --git a/man/man8/tipc-link.8 b/man/man8/tipc-link.8
index 3be8c9a..2ee03a0 100644
--- a/man/man8/tipc-link.8
+++ b/man/man8/tipc-link.8
@@ -213,6 +213,7 @@ Exit status is 0 if command was successful or a positive 
integer upon failure.
 .BR tipc-bearer (8),
 .BR tipc-nametable (8),
 .BR tipc-node (8),
+.BR tipc-peer (8),
 .BR tipc-socket (8)
 .br
 .SH REPORTING BUGS
diff --git a/man/man8/tipc-media.8 b/man/man8/tipc-media.8
index 6c6e2b1..4689cb3 100644
--- a/man/man8/tipc-media.8
+++ b/man/man8/tipc-media.8
@@ -74,6 +74,7 @@ Exit status is 0 if command was successful or a positive 
integer upon failure.
 .BR tipc-link (8),
 .BR tipc-nametable (8),
 .BR tipc-node (8),
+.BR tipc-peer (8),
 .BR tipc-socket (8)
 .br
 .SH REPORTING BUGS
diff --git a/man/man8/tipc-nametable.8 b/man/man8/tipc-nametable.8
index d3397f9..4bcefe4 100644
--- a/man/man8/tipc-nametable.8
+++ b/man/man8/tipc-nametable.8
@@ -87,6 +87,7 @@ Exit status is 0 if command was successful or a positive 
integer upon failure.
 .BR tipc-link (8),
 .BR tipc-media (8),
 .BR tipc-node (8),
+.BR tipc-peer (8),
 .BR tipc-socket (8)
 .br
 .SH REPORTING BUGS
diff --git a/man/man8/tipc-node.8 b/man/man8/tipc-node.8
index ef32ec7..a72a409 100644
--- a/man/man8/tipc-node.8
+++ b/man/man8/tipc-node.8
@@ -59,6 +59,7 @@ Exit status is 0 if command was successful or a positive 
integer upon failure.
 .BR tipc-link (8),
 .BR tipc-media (8),
 .BR tipc-nametable (8),
+.BR tipc-peer (8),
 .BR tipc-socket (8)
 .br
 .SH REPORTING BUGS
diff --git a/man/man8/tipc-peer.8 b/man/man8/tipc-peer.8
new file mode 100644
index 000..430651f
--- /dev/null
+++ b/man/man8/tipc-peer.8
@@ -0,0 +1,52 @@
+.TH TIPC-PEER 8 "04 Dec 2015" "iproute2" "Linux"
+
+.\" For consistency, please keep padding right aligned.
+.\" For example '.B "foo " bar' and not '.B foo " bar"'
+
+.SH NAME
+tipc-peer \- modify peer information
+
+.SH SYNOPSIS
+.ad l
+.in +8
+
+.ti -8
+.B tipc peer remove address
+.IR ADDRESS
+
+.SH OPTIONS
+Options (flags) that can be passed anywhere in the command chain.
+.TP
+.BR "\-h" , " --help"
+Show help about last valid command. For example
+.B tipc peer --help
+will show peer help and
+.B tipc --help
+will show general help. The position of the option in the string is irrelevant.
+.SH DESCRIPTION
+
+.SS Peer remove
+Remove an offline peer node from the local data structures. The peer is
+identified by its
+.B address
+
+.SH EXIT STATUS
+Exit status is 0 if command was successful or a positive integer upon failure.
+
+.SH SEE ALSO
+.BR tipc (8),
+.BR tipc-bearer (8),
+.BR tipc-link (8),
+.BR tipc-media (8),
+.BR tipc-nametable (8),
+.BR tipc-node (8),
+.BR tipc-socket (8)
+.br
+.SH REPORTING BUGS
+Report any bugs to the Network Developers mailing list
+.B 
+where the development and maintenance is primarily done.
+You do not have to be subscribed to the list to send a message there.
+
+.SH AUTHOR
+Richard Alpe 
diff --git a/man/man8/tipc.8 b/man/man8/tipc.8
index c116552..32943fa 100644
--- a/man/man8/tipc.8
+++ b/man/man8/tipc.8
@@ -87,6 +87,7 @@ Exit status is 0 if command was successful or a positive 
integer upon failure.
 .BR tipc-media (8),
 .BR tipc-nametable (8),
 .BR tipc-node (8),
+.BR tipc-peer (8),
 .BR tipc-socket (8)
 .br
 .SH REPORTING BUGS
diff --git a/tipc/Makefile b/tipc/Ma

Re: [PATCH 1/1] net/ipv6: add sysctl option accept_ra_min_hop_limit

2016-01-05 Thread YOSHIFUJI Hideaki
Hi, Machida-san.

Yuki Machida wrote:
> Please apply the following patch to v4.1.x.
> 
> By ommit 6fd99094de2b ("ipv6: Don't reduce hop limit for an interface")

s/ommit/commit/

Further comment below.

:
> Signed-off-by: Hangbin Liu 
> Acked-by: YOSHIFUJI Hideaki 
> Signed-off-by: David S. Miller 
> ---
>  Documentation/networking/ip-sysctl.txt |  8 
>  include/linux/ipv6.h   |  1 +
>  include/uapi/linux/ipv6.h  |  1 +
>  net/ipv6/addrconf.c| 10 ++
>  net/ipv6/ndisc.c   | 16 +++-
>  5 files changed, 27 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/networking/ip-sysctl.txt 
> b/Documentation/networking/ip-sysctl.txt
> index 071fb18..07fad3d 100644
:
> --- a/include/uapi/linux/ipv6.h
> +++ b/include/uapi/linux/ipv6.h
> @@ -171,6 +171,7 @@ enum {
>   DEVCONF_USE_OPTIMISTIC,
>   DEVCONF_ACCEPT_RA_MTU,
>   DEVCONF_STABLE_SECRET,

You have to add a hole for DEVCONF_USE_OIF_ADDRS_ONLY.

--yoshfuji

> + DEVCONF_ACCEPT_RA_MIN_HOP_LIMIT,
>   DEVCONF_MAX
>  };
>  
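
A minimal sketch of what "adding a hole" means here, assuming the concern is that these enum values are visible to userspace and must keep the same numbers as in mainline. The enum and constant names below are hypothetical stand-ins, not the real uapi header, and a placeholder entry is only one possible way to reserve the slot:

```c
/* Illustrative only: hypothetical names, not include/uapi/linux/ipv6.h. */

/* Ordering as assumed for mainline, where the extra entry exists. */
enum mainline_conf {
	ML_STABLE_SECRET,
	ML_USE_OIF_ADDRS_ONLY,
	ML_ACCEPT_RA_MIN_HOP_LIMIT,
	ML_MAX
};

/*
 * Backport to a tree that lacks USE_OIF_ADDRS_ONLY: keep a reserved
 * slot so that the new entry gets the same numeric value as above.
 */
enum backport_conf {
	BP_STABLE_SECRET,
	BP_RESERVED_USE_OIF_ADDRS_ONLY,	/* the "hole" */
	BP_ACCEPT_RA_MIN_HOP_LIMIT,
	BP_MAX
};

/* Compile-time check that the shared entry has the same value in both. */
_Static_assert((int)ML_ACCEPT_RA_MIN_HOP_LIMIT ==
	       (int)BP_ACCEPT_RA_MIN_HOP_LIMIT,
	       "backported value must match mainline");
```

Without the reserved slot, the backported entry would be numbered one lower than in the mainline-style enum above.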

-- 
Hideaki Yoshifuji 
Technical Division, MIRACLE LINUX CORPORATION


Re: [PATCH] rt2x00pci: Disable memory-write-invalidate when the driver exits

2016-01-05 Thread Helmut Schaa
On Tue, Jan 5, 2016 at 2:27 AM, Jia-Ju Bai  wrote:
> On 01/05/2016 12:50 AM, Helmut Schaa wrote:
>>
>> On Mon, Jan 4, 2016 at 8:55 AM, Jia-Ju Bai  wrote:
>>>
>>> The driver calls pci_set_mwi to enable memory-write-invalidate when it
>>> is initialized, but does not call pci_clear_mwi when it is removed. Many
>>> other drivers call pci_clear_mwi when pci_set_mwi is called, such as
>>> r8169, 8139cp and e1000.
>>>
>>> This patch adds pci_clear_mwi in error handling and removal procedure,
>>> which can fix the problem.
>>>
>>> Signed-off-by: Jia-Ju Bai
>>
>> Looks good to me.
>> Does this fix any actual issue?
>> If yes, it might be worth mentioning it in the commit message.
>> Helmut
>>
>
> Lacking pci_clear_mwi may cause a resource-release omission,
> but this omission may not cause obvious issues.
> For reliability, it is better to add pci_clear_mwi in the driver.
> Many other drivers do so, such as r8169, 8139cp and e1000.

Thanks for the clarification, fine with me then.

Acked-by: Helmut Schaa 
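
The pairing discussed in this thread can be sketched in userspace as follows. This is a minimal sketch, not the rt2x00pci code: struct fake_dev and every fake_* function are hypothetical stand-ins for the PCI calls, and the only point is that whatever probe enables is undone both on the error path and on removal.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct fake_dev {
	bool regions_held;
	bool mwi_enabled;
};

static int fake_request_regions(struct fake_dev *d)
{
	d->regions_held = true;
	return 0;
}

static void fake_release_regions(struct fake_dev *d)
{
	d->regions_held = false;
}

static void fake_set_mwi(struct fake_dev *d)   { d->mwi_enabled = true; }
static void fake_clear_mwi(struct fake_dev *d) { d->mwi_enabled = false; }

/* fail_late simulates a failure after MWI has already been enabled. */
static int fake_probe(struct fake_dev *d, bool fail_late)
{
	int err = fake_request_regions(d);

	if (err)
		return err;

	fake_set_mwi(d);

	if (fail_late) {
		err = -1;
		goto err_clear_mwi;
	}
	return 0;

err_clear_mwi:
	/* Unwind in reverse order of setup. */
	fake_clear_mwi(d);
	fake_release_regions(d);
	return err;
}

static void fake_remove(struct fake_dev *d)
{
	fake_clear_mwi(d);
	fake_release_regions(d);
}
```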


pull-request: wireless-drivers-next 2016-01-05

2016-01-05 Thread Kalle Valo
Hi Dave,

here's a big pull request for 4.5, which I should have sent to you
earlier but instead I was wasting my time with eating way too much
chocolate and keeping my brains in low power mode :)

Once again 'git request-pull' got diffstat wrong so I had to manually
fix it, but at least based on my investigation the pull request seems to
be ok. Lots of new features and bugfixes as well as a new kconfig option
CONFIG_ATH9K_HWRNG to enable the new ath9k random number generator.

I'm hoping to send one more pull request later this week. Please let me
know if there are any problems.

The following changes since commit 4585436091cd812b1165aab71bd4847ea1cb08ec:

  iwlwifi: mvm: protect RCU dereference in iwl_mvm_get_key_sta_id (2015-12-13 
13:38:26 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git 
tags/wireless-drivers-next-for-davem-2016-01-05

for you to fetch changes up to 49f2a47d913184b97ef84b2e4fd438dbee341487:

  prism54: fix checks for dma mapping errors (2015-12-31 10:23:32 +0200)


brcmfmac

* fix IBSS which got broken over time
* new USB id for bcm43242 dongle
* arp offload configuration through inet notifier

ath9k

* add random number generator support (CONFIG_ATH9K_HWRNG)

iwlwifi

* Make scan parameters low latency aware
* Fix in the NL80211_FEATURE_FULL_AP_CLIENT_STATE state case
* Fix enable injection mode (Chaya Rachel)
* Various cleanups (Dan / Julia / myself)
* Allow to stay more time on popular channels (David Spinadel)
* Bug fixes for D0i3 (Eliad / Luca)
* Fixes for GO uAPSD (myself)
* Start of TSO support (myself)
* Rate control bug fixes (Eyal / Gregory)
* Start the work on 9000 devices (Johannes / Sara / Oren)
* Start the work on a new Tx queue allocation model (Liad)
* Debug infrastructure enhancements (Golan)

mwifiex

* add a debugfs file for chip reset
* advertise SMS4 cipher suite
* increase ap and station interface limit to 3
* enable MSI support on newer pcie devices (8897 onwards)

rtlwifi

* fix lots of module parameter usage


Alexey Khoroshilov (1):
  prism54: fix checks for dma mapping errors

Amitkumar Karwar (13):
  mwifiex: parse adhoc start/join result
  mwifiex: handle start AP error paths correctly
  mwifiex: set regulatory info from EEPROM
  mwifiex: don't follow AP if country code received from EEPROM
  mwifiex: correction in region code to country mapping
  mwifiex: suppress "Rx of mgmt packet failed" message
  mwifiex: remove redundant timestamp assignment
  mwifiex: add debugfs file for testing reset of card
  mwifiex: fix AMPDU not setup on TDLS link problem
  mwifiex: update region_code_index array
  mwifiex: use world for unidentified region code
  mwifiex: fix PCIe register information for 8997 chipset
  mwifiex: add missing check for PCIe8997 chipset

Arend van Spriel (2):
  brcmfmac: no interface combination check for single interface
  brcmfmac: add 43242 device id for LG dongle

Arnd Bergmann (1):
  iwlegacy: mark il_adjust_beacon_interval as noinline

Avinash Patil (1):
  mwifiex: enable MSI interrupt support in pcie

Avraham Stern (1):
  iwlwifi: mvm: configure scheduled scan according to traffic conditions

Ayala Beker (1):
  iwlwifi: mvm: Change number of associated stations when station becomes 
associated

Ben Greear (2):
  ath6kl: fix tx/rx antenna reporting for 2x2 devices
  ath6kl: add log messages for firmware failure cases.

Bob Copeland (1):
  ath5k: fix RTS/CTS by using proper rate flags

Chaya Rachel Ivgi (1):
  iwlwifi: mvm: Add a station in monitor mode

Colin Ian King (2):
  brcmfmac: only lock and unlock fws if fws is not null
  ath9k: fix inconsistent indenting on return statement

Dan Carpenter (6):
  cw1200: remove some dead code
  iwlegacy: cleanup end of il_send_add_sta()
  mwifiex: remove an unneeded condition
  hostap: fix an error code in prism2_config()
  prism54: off by one BUG_ON() test
  iwlwifi: mvm: remove an extra tab

David Spinadel (1):
  iwlwifi: mvm: add extended dwell time

Eliad Peller (6):
  iwlwifi: mvm: cleanup roc te on restart cleanup
  iwlwifi: mvm: check iwl_mvm_wowlan_config_key_params() return value
  iwlwifi: avoid d0i3 commands when no/init ucode is loaded
  iwlwifi: mvm: remove the vif parameter of iwl_mvm_configure_bcast_filter()
  iwlwifi: update key params on d0i3 entrance/exit
  iwlwifi: bail out in case of bad trans state

Emmanuel Grumbach (18):
  iwlwifi: pcie: allow the op_mode to block the tx queues
  iwlwifi: trans: support a callback for ASYNC commands
  iwlwifi: block the queues when we send ADD_STA for uAPSD
  iwlwifi: uninline iwl_trans_send_cmd
  Merge tag 'mac80211-next-for-davem-2015-12-07' into next
  iwlwifi

[patch net-next] mlxsw: pci: Adjust value of CPU egress traffic class

2016-01-05 Thread Jiri Pirko
From: Ido Schimmel 

During initialization, when creating the send descriptor queues (SDQs),
we specify the CPU egress traffic class of each SDQ. The maximum number
of classes of this type is different in the two ASICs supported by this
PCI driver.

New firmware versions check that this value is set correctly, which causes
errors on the Spectrum ASIC, as its max exposed egress traffic class is
lower than 7.

Solve this by setting this field to 3, which is an acceptable value for
both ASICs.

Note that we currently do not expose the QoS capabilities of the ASICs,
so setting this to a hardcoded value is OK for now.

Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c 
b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index d2102e5..c071077 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -384,7 +384,7 @@ static int mlxsw_pci_sdq_init(struct mlxsw_pci *mlxsw_pci, 
char *mbox,
 
/* Set CQ of same number of this SDQ. */
mlxsw_cmd_mbox_sw2hw_dq_cq_set(mbox, q->num);
-   mlxsw_cmd_mbox_sw2hw_dq_sdq_tclass_set(mbox, 7);
+   mlxsw_cmd_mbox_sw2hw_dq_sdq_tclass_set(mbox, 3);
mlxsw_cmd_mbox_sw2hw_dq_log2_dq_sz_set(mbox, 3); /* 8 pages */
for (i = 0; i < MLXSW_PCI_AQ_PAGES; i++) {
dma_addr_t mapaddr = __mlxsw_pci_queue_page_get(q, i);
-- 
1.9.3



Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Jacob Siverskog
On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
> On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  wrote:
>> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>> >  wrote:
>> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:
>> >>> How often can you trigger this bug ?
>> >>
>> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen it
>> >> just a few times when bringing up/down network interfaces. Does the trace
>> >> give any clue?
>> >>
>> >
>> > A little bit. You need to help people to narrow down the problem
>> > because there are too many places using skb->next and skb->prev.
>> >
>> > Since you mentioned it seems related to network interface flip,
>> > what network interfaces are you using? What is your TC setup?
>> >
>> > Thanks.
>>
>> The system contains only one physical network interface (TI WL1837,
>> wl18xx module).
>> The state prior to the crash was as follows:
>> - One virtual network interface active (as STA, associated with access point)
>> - Bluetooth (BLE only) active (same physical chip, co-existence,
>> btwilink/st_drv modules)
>>
>> Actions made around the time of the crash:
>> - Bluetooth disabled
>> - One additional virtual network interface brought up (also as STA)
>>
>> I believe the crash occurred between these two actions. I just saw
>> that there are some interesting events in the log prior to the crash:
>> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>> kernel: (stc):  proto stack 4's ->recv failed
>> kernel: (stc): remove_channel_from_table: id 3
>> kernel: (stc): remove_channel_from_table: id 2
>> kernel: (stc): remove_channel_from_table: id 4
>> kernel: (stc):  all chnl_ids unregistered
>> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>>
>> The first print is from btwilink.c. However, I can't see the
>> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>> 6LoWPAN or anything similar).
>>
>> Thanks, Jacob
>
> Definitely these details are useful ;)
>
> Could you try :
>
> diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> index 6e3af8b42cdd..0c99a74fb895 100644
> --- a/drivers/misc/ti-st/st_core.c
> +++ b/drivers/misc/ti-st/st_core.c
> @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
> skb_queue_purge(&st_gdata->txq);
> skb_queue_purge(&st_gdata->tx_waitq);
> kfree_skb(st_gdata->rx_skb);
> +   st_gdata->rx_skb = NULL;
> kfree_skb(st_gdata->tx_skb);
> +   st_gdata->tx_skb = NULL;
> /* TTY ldisc cleanup */
> err = tty_unregister_ldisc(N_TI_WL);
> if (err)
>
>

Sure. Since I don't have a good way to trigger the initial issue, I
can't really know if there is a difference with your patch. However,
normal usage seems to work as expected with your patch. I've tried to
reproduce the initial issue with and without your patch repeatedly for
hours and have not seen any crash in any of the runs so far.
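
For reference, the idea behind the suggested st_core.c change can be sketched in userspace. This is a minimal sketch under the assumption that the suspected problem is a stale pointer being freed or touched again after teardown; struct fake_core and struct fake_buf are hypothetical, and free() stands in for kfree_skb(), which likewise ignores a NULL argument.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver state and its buffers. */
struct fake_buf {
	int len;
};

struct fake_core {
	struct fake_buf *rx_buf;
	struct fake_buf *tx_buf;
};

/*
 * Free both buffers and clear the pointers. Because free(NULL) is a
 * no-op, a second call (or a later cleanup path that frees the same
 * fields) becomes harmless instead of a double free.
 */
static void fake_core_exit(struct fake_core *c)
{
	free(c->rx_buf);
	c->rx_buf = NULL;
	free(c->tx_buf);
	c->tx_buf = NULL;
}
```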


RE: [PATCH] netfilter: nf_conntrack: use safer way to lock all buckets

2016-01-05 Thread David Laight
From: Sasha Levin
> Sent: 05 January 2016 02:26
> When we need to lock all buckets in the connection hashtable we'd attempt to
> lock 1024 spinlocks, which is way more preemption levels than supported by
> the kernel. Furthermore, this behavior was hidden by checking if lockdep is
> enabled, and if it was - use only 8 buckets(!).
> 
> Fix this by using a global lock and synchronize all buckets on it when we
> need to lock them all. This is pretty heavyweight, but is only done when we
> need to resize the hashtable, and that doesn't happen often enough (or at 
> all).
...
> +static void nf_conntrack_lock_nested(spinlock_t *lock)
> +{
> + spin_lock_nested(lock, SINGLE_DEPTH_NESTING);
> + while (unlikely(nf_conntrack_locks_all)) {
> + spin_unlock(lock);
> + spin_lock(&nf_conntrack_locks_all_lock);
> + spin_unlock(&nf_conntrack_locks_all_lock);
> + spin_lock_nested(lock, SINGLE_DEPTH_NESTING);
> + }
> +}
...
> @@ -102,16 +126,19 @@ static void nf_conntrack_all_lock(void)
>  {
>   int i;
> 
> - for (i = 0; i < CONNTRACK_LOCKS; i++)
> - spin_lock_nested(&nf_conntrack_locks[i], i);
> + spin_lock(&nf_conntrack_locks_all_lock);
> + nf_conntrack_locks_all = true;
> +
> + for (i = 0; i < CONNTRACK_LOCKS; i++) {
> + spin_lock(&nf_conntrack_locks[i]);
> + spin_unlock(&nf_conntrack_locks[i]);
> + }
>  }

If spin_lock_nested() does anything like what I think its
name suggests, then I suspect that this deadlocks.

David
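
A userspace sketch of the locking scheme in the quoted patch may help when reasoning about it. Assumptions, stated loudly: pthread mutexes stand in for spinlocks, a C11 atomic flag stands in for the plain nf_conntrack_locks_all variable, and all names are hypothetical. As far as I know, spin_lock_nested() differs from spin_lock() only in the lockdep subclass annotation, so the sketch uses ordinary lock calls.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NBUCKETS 4

static pthread_mutex_t bucket_locks[NBUCKETS] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool locks_all;

/*
 * Take one bucket lock. If a "lock everything" holder is active,
 * back off, wait for it by cycling the global lock, then retry.
 */
static void bucket_lock(int i)
{
	pthread_mutex_lock(&bucket_locks[i]);
	while (atomic_load(&locks_all)) {
		pthread_mutex_unlock(&bucket_locks[i]);
		pthread_mutex_lock(&global_lock);
		pthread_mutex_unlock(&global_lock);
		pthread_mutex_lock(&bucket_locks[i]);
	}
}

static void bucket_unlock(int i)
{
	pthread_mutex_unlock(&bucket_locks[i]);
}

/*
 * Exclude all bucket users: hold the global lock, raise the flag,
 * then lock/unlock each bucket once so current holders drain.
 */
static void all_buckets_lock(void)
{
	int i;

	pthread_mutex_lock(&global_lock);
	atomic_store(&locks_all, true);

	for (i = 0; i < NBUCKETS; i++) {
		pthread_mutex_lock(&bucket_locks[i]);
		pthread_mutex_unlock(&bucket_locks[i]);
	}
}

static void all_buckets_unlock(void)
{
	atomic_store(&locks_all, false);
	pthread_mutex_unlock(&global_lock);
}
```

Note that the "lock all" side never holds more than one bucket lock at a time, which is what avoids nesting 1024 locks.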




[PATCH] 6pack: fix free memory scribbles

2016-01-05 Thread Alan
commit acf673a3187edf72068ee2f92f4dc47d66baed47 fixed a user triggerable free
memory scribble but in doing so replaced it with a different one that allows
the user to control the data and scribble even more.

sixpack_close is called by the tty layer in tty context. The tty context is
protected by sp_get() and sp_put(). However network layer activity via
sp_xmit() is not protected this way. We must therefore stop the queue
otherwise the user gets to dump a buffer mostly of their choice into freed
kernel pages.

Signed-off-by: Alan Cox 
---
 drivers/net/hamradio/6pack.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/hamradio/6pack.c b/drivers/net/hamradio/6pack.c
index 9f0b1c3..b3ba6a8 100644
--- a/drivers/net/hamradio/6pack.c
+++ b/drivers/net/hamradio/6pack.c
@@ -683,6 +683,12 @@ static void sixpack_close(struct tty_struct *tty)
if (!atomic_dec_and_test(&sp->refcnt))
down(&sp->dead_sem);
 
+   /* We must stop the queue to avoid potentially scribbling
+  on the free buffers. The sp->dead_sem is not sufficient
+  to protect us from sp->xbuff access */
+
+   netif_stop_queue(sp->dev);
+
del_timer_sync(&sp->tx_t);
del_timer_sync(&sp->resync_t);
 



[PATCHv2 net-next 4/4] nfp: fix RX buffer length validation

2016-01-05 Thread Jakub Kicinski
The meaning of the data_len and meta_len RX WB descriptor fields depends
slightly on whether rx_offset is dynamic or not.  For dynamic
offsets data_len includes meta_len.  This makes the code harder
to follow; in fact our RX buffer length check is incorrect:
we are comparing allocation length to data_len while we should
also account for meta_len.

Let's adjust the values of data_len and meta_len to their natural
meaning and simplify the logic.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Rolf Neugebauer 
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c 
b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 553ae64e2f7f..070645f9bc21 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1259,22 +1259,19 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, 
int budget)
 
meta_len = rxd->rxd.meta_len_dd & PCIE_DESC_RX_META_LEN_MASK;
data_len = le16_to_cpu(rxd->rxd.data_len);
+   /* For dynamic offset data_len includes meta_len, adjust */
+   if (nn->rx_offset == NFP_NET_CFG_RX_OFFSET_DYNAMIC)
+   data_len -= meta_len;
+   else
+   meta_len = nn->rx_offset;
 
-   if (WARN_ON_ONCE(data_len > nn->fl_bufsz)) {
+   if (WARN_ON_ONCE(meta_len + data_len > nn->fl_bufsz)) {
dev_kfree_skb_any(skb);
continue;
}
 
-   if (nn->rx_offset == NFP_NET_CFG_RX_OFFSET_DYNAMIC) {
-   /* The packet data starts after the metadata */
-   skb_reserve(skb, meta_len);
-   } else {
-   /* The packet data starts at a fixed offset */
-   skb_reserve(skb, nn->rx_offset);
-   }
-
-   /* Adjust the SKB for the dynamic meta data pre-pended */
-   skb_put(skb, data_len - meta_len);
+   skb_reserve(skb, meta_len);
+   skb_put(skb, data_len);
 
nfp_net_set_hash(nn->netdev, skb, rxd);
 
-- 
1.9.1
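
The length handling described in this patch can be sketched in isolation. This is a minimal userspace sketch, not the driver code: struct rx_lens and rx_lens_normalize() are hypothetical, RX_OFFSET_DYNAMIC is assumed to be 0 like NFP_NET_CFG_RX_OFFSET_DYNAMIC, and the underflow guard is an addition that the patch itself does not contain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for NFP_NET_CFG_RX_OFFSET_DYNAMIC. */
#define RX_OFFSET_DYNAMIC 0

/* Hypothetical container for the two descriptor lengths. */
struct rx_lens {
	uint32_t meta_len;
	uint32_t data_len;
};

/*
 * Bring both lengths to their natural meaning (meta_len = bytes
 * before the packet, data_len = packet bytes only) and check that
 * their sum fits in the buffer. Returns false if the frame must be
 * dropped.
 */
static bool rx_lens_normalize(struct rx_lens *l, uint32_t rx_offset,
			      uint32_t bufsz)
{
	if (rx_offset == RX_OFFSET_DYNAMIC) {
		/* Extra guard against underflow; not part of the patch. */
		if (l->meta_len > l->data_len)
			return false;
		/* For dynamic offsets data_len includes meta_len. */
		l->data_len -= l->meta_len;
	} else {
		/* Packet data starts at a fixed offset. */
		l->meta_len = rx_offset;
	}

	return l->meta_len + l->data_len <= bufsz;
}
```

With a fixed offset of 32, a data_len of 100 and a 120-byte buffer, the old check (data_len alone against the buffer size) would pass, while the combined check rejects the frame.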



[PATCHv2 net-next 3/4] nfp: correct RX buffer length calculation

2016-01-05 Thread Jakub Kicinski
When calculating the RX buffer length we need to account for
up to 2 VLAN tags and up to 8 MPLS labels.  Rounding up to 1k
is a relic of a distant past and can be removed.  While at
it, also remove a trivial print statement.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Rolf Neugebauer 
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c 
b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index b381682de3d6..553ae64e2f7f 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -61,6 +61,7 @@
 
 #include 
 
+#include 
 #include 
 
 #include "nfp_net_ctrl.h"
@@ -1912,9 +1913,6 @@ static int nfp_net_change_mtu(struct net_device *netdev, 
int new_mtu)
 {
struct nfp_net *nn = netdev_priv(netdev);
int ret = 0;
-   u32 tmp;
-
-   nn_dbg(nn, "New MTU = %d\n", new_mtu);
 
if (new_mtu < 68 || new_mtu > nn->max_mtu) {
nn_err(nn, "New MTU (%d) is not valid\n", new_mtu);
@@ -1925,10 +1923,8 @@ static int nfp_net_change_mtu(struct net_device *netdev, 
int new_mtu)
nfp_net_netdev_close(netdev);
 
netdev->mtu = new_mtu;
-
-   /* Freelist buffer size rounded up to the nearest 1K */
-   tmp = new_mtu + ETH_HLEN + VLAN_HLEN + NFP_NET_MAX_PREPEND;
-   nn->fl_bufsz = roundup(tmp, 1024);
+   nn->fl_bufsz = NFP_NET_MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
+   MPLS_HLEN * 8 + new_mtu;
 
if (netif_running(netdev))
ret = nfp_net_netdev_open(netdev);
-- 
1.9.1
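
A small worked comparison of the old and new buffer length formulas from this patch. It is a sketch under assumptions: ETH_HLEN, VLAN_HLEN and MPLS_HLEN carry their usual kernel values, while MAX_PREPEND = 64 is an assumed value for NFP_NET_MAX_PREPEND, and the function names are hypothetical.

```c
#define ETH_HLEN	14	/* Ethernet header */
#define VLAN_HLEN	4	/* one VLAN tag */
#define MPLS_HLEN	4	/* one MPLS label */
#define MAX_PREPEND	64	/* assumed value of NFP_NET_MAX_PREPEND */

/* Old formula: one VLAN tag, no MPLS, rounded up to the nearest 1K. */
static unsigned int fl_bufsz_old(unsigned int mtu)
{
	unsigned int tmp = mtu + ETH_HLEN + VLAN_HLEN + MAX_PREPEND;

	return ((tmp + 1023) / 1024) * 1024;
}

/* New formula: room for 2 VLAN tags and 8 MPLS labels, no rounding. */
static unsigned int fl_bufsz_new(unsigned int mtu)
{
	return MAX_PREPEND + ETH_HLEN + VLAN_HLEN * 2 +
	       MPLS_HLEN * 8 + mtu;
}
```

Under these assumptions an MTU of 1500 gives 2048 bytes with the old formula and 1618 with the new one. The rounding did not guarantee headroom either: for an MTU of 942 the old formula yields exactly 1024, while the new worst case needs 1060.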



[PATCHv2 net-next 1/4] nfp: return error if MTU change fails

2016-01-05 Thread Jakub Kicinski
When reopening the device fails after an MTU change, let userspace
know.  The MTU remains changed even though an error is returned; this
is what all ethernet devices are doing.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Rolf Neugebauer 
---
Dave,

I know this is not what you asked for but, since we are using FW
commands to disable/enable RX, even if we allocate all required
resources before freeing old ones we still cannot guarantee that
the reenabling operation will not fail.  Should we refuse to do
MTU changes while the interface is running altogether?

---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c 
b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 43c618bafdb6..006d9600240f 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1911,6 +1911,7 @@ static void nfp_net_set_rx_mode(struct net_device *netdev)
 static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 {
struct nfp_net *nn = netdev_priv(netdev);
+   int ret = 0;
u32 tmp;
 
nn_dbg(nn, "New MTU = %d\n", new_mtu);
@@ -1929,10 +1930,10 @@ static int nfp_net_change_mtu(struct net_device 
*netdev, int new_mtu)
/* restart if running */
if (netif_running(netdev)) {
nfp_net_netdev_close(netdev);
-   nfp_net_netdev_open(netdev);
+   ret = nfp_net_netdev_open(netdev);
}
 
-   return 0;
+   return ret;
 }
 
 static struct rtnl_link_stats64 *nfp_net_stat64(struct net_device *netdev,
-- 
1.9.1



[PATCHv2 net-next 0/4] MTU changes and other fixes

2016-01-05 Thread Jakub Kicinski
Hi!

Four small fixes around RX buffer sizing.  First one makes sure
we return an error if .ndo_change_mtu() fails.  Second one corrects
the length used for unmapping DMA buffers when MTU is changed,
third makes sure buffers are big enough to meet FW's expectations.
Last change corrects packet length validation on RX.

v2:
 - add first patch (return error on fail).


Jakub Kicinski (4):
  nfp: return error if MTU change fails
  nfp: free buffers before changing MTU
  nfp: correct RX buffer length calculation
  nfp: fix RX buffer length validation

 .../net/ethernet/netronome/nfp/nfp_net_common.c| 42 ++
 1 file changed, 18 insertions(+), 24 deletions(-)

-- 
1.9.1



[PATCHv2 net-next 2/4] nfp: free buffers before changing MTU

2016-01-05 Thread Jakub Kicinski
For freeing DMA buffers we depend on nfp_net.fl_bufsz having the same
value as during allocation; therefore in .ndo_change_mtu we must first
free the buffers and then change the setting.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Rolf Neugebauer 
---
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c 
b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 006d9600240f..b381682de3d6 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1921,17 +1921,17 @@ static int nfp_net_change_mtu(struct net_device 
*netdev, int new_mtu)
return -EINVAL;
}
 
+   if (netif_running(netdev))
+   nfp_net_netdev_close(netdev);
+
netdev->mtu = new_mtu;
 
/* Freelist buffer size rounded up to the nearest 1K */
tmp = new_mtu + ETH_HLEN + VLAN_HLEN + NFP_NET_MAX_PREPEND;
nn->fl_bufsz = roundup(tmp, 1024);
 
-   /* restart if running */
-   if (netif_running(netdev)) {
-   nfp_net_netdev_close(netdev);
+   if (netif_running(netdev))
ret = nfp_net_netdev_open(netdev);
-   }
 
return ret;
 }
-- 
1.9.1



Re: [PATCH v4 net-next 2/3] net: macb: Add NPx macb config using USRIO_DISABLED cap

2016-01-05 Thread Neil Armstrong
On 01/04/2016 11:38 AM, Nicolas Ferre wrote:
> Le 04/01/2016 10:42, Neil Armstrong a écrit :
>>  static const struct macb_config zynqmp_config = {
>>  .caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_JUMBO,
>> @@ -2801,6 +2806,7 @@ static const struct of_device_id macb_dt_ids[] = {
>>  { .compatible = "cdns,at32ap7000-macb" },
>>  { .compatible = "cdns,at91sam9260-macb", .data = &at91sam9260_config },
>>  { .compatible = "cdns,macb" },
>> +{ .compatible = "cdns,npx-macb", .data = &npx_config },
> 
> I can accept that, but I think that you'd better make your device tree
> compatibility string *not* generic. Name it by the first NPx SoC or
> perfectly compatible SoC family that has this configuration and you'll
> be able to make the NP(x+1) compatible with it.
Well, the first SoC having this configuration is Np4, would cdns,np4-macb be ok?
> 
> It has proven to be much more future proof and even if in the early days
> of DT on ARM we accepted some binding with generic strings like this one
> below, it has proven to be a mistake.
> 
>>  { .compatible = "cdns,gem", .data = &pc302gem_config },
>>  { .compatible = "atmel,sama5d2-gem", .data = &sama5d2_config },
>>
> 
> 

Neil
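
To illustrate why a SoC-specific string is considered more future proof, here is a simplified userspace sketch of compatible-string matching. It is not the kernel's of_match_node(): the table, the config ids and the "cdns,np5-macb" string for a later SoC are all hypothetical, and the lookup simply returns the first node string that appears in the driver table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified counterpart of a driver match table. */
struct match_entry {
	const char *compatible;
	int config_id;
};

static const struct match_entry match_table[] = {
	{ "cdns,macb",     0 },
	{ "cdns,np4-macb", 1 },	/* named after the first SoC */
	{ NULL,            0 },	/* sentinel */
};

/*
 * Walk the node's compatible list from most to least specific and
 * return the first entry the driver knows about, or NULL.
 */
static const struct match_entry *
match_node(const char *const *node_compat, size_t n)
{
	size_t i;
	const struct match_entry *e;

	for (i = 0; i < n; i++)
		for (e = match_table; e->compatible; e++)
			if (strcmp(node_compat[i], e->compatible) == 0)
				return e;
	return NULL;
}
```

A later SoC would list its own string first and the NP4 string as a fallback, so the existing driver keeps working while a future driver version can still tell the two apart.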


Re: [PATCH V5 4/9] Drivers: hv: ring_buffer: enhance hv_ringbuffer_read() to support hvsock

2016-01-05 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

> To get the payload of hvsock, we need raw=0 to skip the level-1 header
> (i.e., struct vmpacket_descriptor desc) and we also need to skip the
> level-2 header (i.e., struct vmpipe_proto_header pipe_hdr).
>
> NB: if the length of the hvsock payload is not aligned with the 8-byte
> boundary, at most 7 padding bytes are appended, so the real hvsock
> payload's length must be retrieved by the pipe_hdr.data_size field.
>
> I 'upgrade' the 'raw' parameter of hv_ringbuffer_read() to a
> 'read_flags', trying to share the logic of the function.

When I was touching this code last time I was actually thinking about
eliminating 'raw' flag by making all ring reads raw and moving this
header filtering job to the upper layer (as we already have
vmbus_recvpacket()/vmbus_recvpacket_raw()) but for some reason I didn't
do it. I believe you have more or less the same reasoning for introducing
a new read type instead of parsing this at a higher level. Some comments
below ...

>
> This patch is required by the next patch, which will introduce the hvsock
> send/recv APIs.
>
> Signed-off-by: Dexuan Cui 
> Cc: Vitaly Kuznetsov 
> ---
>  drivers/hv/channel.c  | 10 +
>  drivers/hv/hyperv_vmbus.h | 13 +++-
>  drivers/hv/ring_buffer.c  | 54 
> ---
>  include/linux/hyperv.h| 12 +++
>  4 files changed, 72 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index eaaa066..cc49966 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -940,13 +940,14 @@ EXPORT_SYMBOL_GPL(vmbus_sendpacket_multipagebuffer);
>  static inline int
>  __vmbus_recvpacket(struct vmbus_channel *channel, void *buffer,
>  u32 bufferlen, u32 *buffer_actual_len, u64 *requestid,
> -bool raw)
> +u32 read_flags)
>  {
>   int ret;
>   bool signal = false;
>
>   ret = hv_ringbuffer_read(&channel->inbound, buffer, bufferlen,
> -  buffer_actual_len, requestid, &signal, raw);
> +  buffer_actual_len, requestid, &signal,
> +  read_flags);
>
>   if (signal)
>   vmbus_setevent(channel);
> @@ -959,7 +960,7 @@ int vmbus_recvpacket(struct vmbus_channel *channel, void 
> *buffer,
>u64 *requestid)
>  {
>   return __vmbus_recvpacket(channel, buffer, bufferlen,
> -   buffer_actual_len, requestid, false);
> +   buffer_actual_len, requestid, 0);
>  }
>  EXPORT_SYMBOL(vmbus_recvpacket);
>
> @@ -971,6 +972,7 @@ int vmbus_recvpacket_raw(struct vmbus_channel *channel, 
> void *buffer,
> u64 *requestid)
>  {
>   return __vmbus_recvpacket(channel, buffer, bufferlen,
> -   buffer_actual_len, requestid, true);
> +   buffer_actual_len, requestid,
> +   HV_RINGBUFFER_READ_FLAG_RAW);
>  }
>  EXPORT_SYMBOL_GPL(vmbus_recvpacket_raw);
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index 0411b7b..46206b6 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -619,9 +619,20 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info 
> *ring_info,
>   struct kvec *kv_list,
>   u32 kv_count, bool *signal);
>
> +/*
> + * By default, a read_flags of 0 means: the payload offset is
> + * sizeof(struct vmpacket_descriptor).
> + *
> + * If HV_RINGBUFFER_READ_FLAG_RAW is used, the payload offset is 0.
> + *
> + * If HV_RINGBUFFER_READ_FLAG_HVSOCK is used, the payload offset is
> + * sizeof(struct vmpacket_descriptor) + sizeof(struct vmpipe_proto_header).

So these are mutually exclusive, right? Should we introduce 'int
payload_offset' parameter instead of flags? 

> + */
> +#define HV_RINGBUFFER_READ_FLAG_RAW  (1 << 0)
> +#define HV_RINGBUFFER_READ_FLAG_HVSOCK   (1 << 1)
>  int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
>  void *buffer, u32 buflen, u32 *buffer_actual_len,
> -u64 *requestid, bool *signal, bool raw);
> +u64 *requestid, bool *signal, u32 read_flags);
>
>  void hv_ringbuffer_get_debuginfo(struct hv_ring_buffer_info *ring_info,
>   struct hv_ring_buffer_debug_info *debug_info);
> diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
> index b53702c..03a509c 100644
> --- a/drivers/hv/ring_buffer.c
> +++ b/drivers/hv/ring_buffer.c
> @@ -382,32 +382,43 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info 
> *outring_info,
>
>  int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
>  void *buffer, u32 buflen, u32 *buffer_actual_len,
> -u64 *requestid, bool *signal, bool raw)
> +u64 *requestid, bool *signal, u32 read_flags)
>  {

Re: [PATCH V5 5/9] Drivers: hv: vmbus: add APIs to send/recv hvsock packets

2016-01-05 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

> This will be used by the coming net/hvsock driver.
>
> Signed-off-by: Dexuan Cui 
> ---
>  drivers/hv/channel.c   | 59 
> ++
>  include/linux/hyperv.h |  9 
>  2 files changed, 68 insertions(+)
>
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index cc49966..ce1b885 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -924,6 +924,52 @@ int vmbus_sendpacket_multipagebuffer(struct 
> vmbus_channel *channel,
>  }
>  EXPORT_SYMBOL_GPL(vmbus_sendpacket_multipagebuffer);
>
> +/*
> + * vmbus_sendpacket_hvsock - Send the hvsock payload 'buf' of a length 'len'
> + */
> +int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, void *buf, u32 
> len)
> +{
> + struct vmpipe_proto_header pipe_hdr;
> + struct vmpacket_descriptor desc;
> + struct kvec bufferlist[4];
> + u32 packetlen_aligned;
> + u32 packetlen;
> + u64 aligned_data = 0;
> + bool signal = false;
> + int ret;
> +
> + packetlen = HVSOCK_HEADER_LEN + len;
> + packetlen_aligned = ALIGN(packetlen, sizeof(u64));
> +
> + /* Setup the descriptor */
> + desc.type = VM_PKT_DATA_INBAND;
> + /* in 8-bytes granularity */
> + desc.offset8 = sizeof(struct vmpacket_descriptor) >> 3;
> + desc.len8 = (u16)(packetlen_aligned >> 3);
> + desc.flags = 0;
> + desc.trans_id = 0;
> +
> + pipe_hdr.pkt_type = 1;
> + pipe_hdr.data_size = len;
> +
> + bufferlist[0].iov_base = &desc;
> + bufferlist[0].iov_len  = sizeof(struct vmpacket_descriptor);
> + bufferlist[1].iov_base = &pipe_hdr;
> + bufferlist[1].iov_len  = sizeof(struct vmpipe_proto_header);
> + bufferlist[2].iov_base = buf;
> + bufferlist[2].iov_len  = len;
> + bufferlist[3].iov_base = &aligned_data;
> + bufferlist[3].iov_len  = packetlen_aligned - packetlen;
> +
> + ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 4,
> &signal);

Using ARRAY_SIZE(bufferlist) instead of 4 would allow us to keep this
line untouched when we decide to add something (and the compiler will
optimize it to 4 anyway).
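
A minimal userspace sketch of the idea follows. The ARRAY_SIZE() macro
here is a simplified stand-in for the kernel one (which also rejects
pointers at compile time), struct kvec is a stand-in type, and
bufferlist_count() is a hypothetical helper used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Stand-in for the kernel's struct kvec. */
struct kvec {
	void *iov_base;
	size_t iov_len;
};

/* Hypothetical helper: the element count is derived from the array
 * declaration, so changing the '4' below updates every user. */
static size_t bufferlist_count(void)
{
	struct kvec bufferlist[4];

	return ARRAY_SIZE(bufferlist);
}
```

Since sizeof is evaluated at compile time, the generated code should be
the same as with the literal constant.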

> +
> + if (ret == 0 && signal)
> + vmbus_setevent(channel);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(vmbus_sendpacket_hvsock);
> +
>  /**
>   * vmbus_recvpacket() - Retrieve the user packet on the specified channel
>   * @channel: Pointer to vmbus_channel structure.
> @@ -976,3 +1022,16 @@ int vmbus_recvpacket_raw(struct vmbus_channel *channel, 
> void *buffer,
> HV_RINGBUFFER_READ_FLAG_RAW);
>  }
>  EXPORT_SYMBOL_GPL(vmbus_recvpacket_raw);
> +
> +/*
> + * vmbus_recvpacket_hvsock - Receive the hvsock payload from the vmbus
> + * ringbuffer into the 'buffer'.
> + */
> +int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void *buffer,
> + u32 bufferlen, u32 *buffer_actual_len)
> +{
> + return __vmbus_recvpacket(channel, buffer, bufferlen,
> +   buffer_actual_len, NULL,
> +   HV_RINGBUFFER_READ_FLAG_HVSOCK);
> +}
> +EXPORT_SYMBOL_GPL(vmbus_recvpacket_hvsock);
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index e005223..646c20d 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -908,6 +908,9 @@ extern int vmbus_sendpacket_ctl(struct vmbus_channel 
> *channel,
> u32 flags,
> bool kick_q);
>
> +extern int vmbus_sendpacket_hvsock(struct vmbus_channel *channel,
> +void *buf, u32 len);
> +
>  extern int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel,
>   struct hv_page_buffer pagebuffers[],
>   u32 pagecount,
> @@ -958,6 +961,9 @@ extern int vmbus_recvpacket_raw(struct vmbus_channel 
> *channel,
>u64 *requestid);
>
> +extern int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void 
> *buffer,
> +u32 bufferlen, u32 *buffer_actual_len);
> +
>  extern void vmbus_ontimer(unsigned long data);
>
>  /* Base driver object */
> @@ -1280,4 +1286,7 @@ struct vmpipe_proto_header {
>   };
>  } __packed;
>
> +#define HVSOCK_HEADER_LEN (sizeof(struct vmpacket_descriptor) + \
> +  sizeof(struct vmpipe_proto_header))
> +
>  #endif /* _HYPERV_H */

-- 
  Vitaly


Re: [PATCH net-next V3 3/4] net/mlx5e: Add HW timestamping (TS) support

2016-01-05 Thread Richard Cochran
On Tue, Dec 29, 2015 at 02:58:31PM +0200, Saeed Mahameed wrote:
> From: Eran Ben Elisha 
> 
> Add support for enable/disable HW timestamping for incoming and/or
> outgoing packets. To enable/disable HW timestamping appropriate
> ioctl should be used. Currently HWTSTAMP_FILTER_ALL/NONE and
> HWTSAMP_TX_ON/OFF only are supported. Make all relevant changes in
> RX/TX flows to consider TS request and plant HW timestamps into
> relevant structures.
> 
> Add internal clock for converting hardware timestamp to nanoseconds. In
> addition, add a service task to catch internal clock overflow, to make
> sure timestamping is accurate.
> 
> Signed-off-by: Eran Ben Elisha 
> Signed-off-by: Saeed Mahameed 

Acked-by: Richard Cochran 


Re: [PATCH V5 7/9] Drivers: hv: vmbus: add a mechanism to pass hvsock events to the hvsock driver

2016-01-05 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

> For now only 1 event is defined: HVSOCK_RESCIND_CHANNEL.
> We'll have more events in the future.
>
> Signed-off-by: Dexuan Cui 
> ---
>  drivers/hv/channel_mgmt.c | 18 ++
>  include/linux/hyperv.h| 17 +
>  2 files changed, 35 insertions(+)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 4611b50..87fc7d2 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -608,6 +608,16 @@ static void vmbus_onoffer_rescind(struct 
> vmbus_channel_message_header *hdr)
>   spin_unlock_irqrestore(&channel->lock, flags);
>
>   if (channel->device_obj) {
> + if (is_hvsock_channel(channel) &&
> + channel->hvsock_event_callback) {
> + channel->hvsock_event_callback(channel,
> +HVSOCK_RESCIND_CHANNEL);
> + /*
> +  * We can't invoke vmbus_device_unregister()
> +  * until the socket fd is closed.
> +  */
> + return;
> + }
>   /*
>* We will have to unregister this device from the
>* driver core.
> @@ -977,3 +987,11 @@ bool vmbus_are_subchannels_present(struct vmbus_channel 
> *primary)
>   return ret;
>  }
>  EXPORT_SYMBOL_GPL(vmbus_are_subchannels_present);
> +
> +void vmbus_set_hvsock_event_callback(struct vmbus_channel *channel,
> + void (*hvsock_event_callback)(struct vmbus_channel *,
> +   enum hvsock_event))
> +{
> + channel->hvsock_event_callback = hvsock_event_callback;
> +}
> +EXPORT_SYMBOL_GPL(vmbus_set_hvsock_event_callback);
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index b4cc44c..7e507bb 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -645,6 +645,12 @@ enum hv_signal_policy {
>   HV_SIGNAL_POLICY_EXPLICIT,
>  };
>
> +/* hvsock related definitions */
> +enum hvsock_event {
> + /* The host application is close()-ing the connection */
> + HVSOCK_RESCIND_CHANNEL,
> +};
> +
>  struct vmbus_channel {
>   /* Unique channel id */
>   int id;
> @@ -740,6 +746,13 @@ struct vmbus_channel {
>   void (*sc_creation_callback)(struct vmbus_channel *new_sc);
>
>   /*
> +  * hvsock event callback.
> +  * For now only 1 event is defined: HVSOCK_RESCIND_CHANNEL.
> +  */
> + void (*hvsock_event_callback)(struct vmbus_channel *channel,
> +   enum hvsock_event event);

Would it make sense to rename it to something more general,
e.g. sc_rescind_callback, and call it for all drivers (even if we don't
need it now) instead of introducing enum hvsock_event? When new events
arrive we'll just add new callbacks (or, alternatively, we could unify
it into 'channel_event_callback' by merging it with sc_creation_callback(),
but I'd say that is uglier).
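
A rough userspace sketch of the per-event callback variant is shown
below. The struct is a cut-down stand-in for struct vmbus_channel, the
'rescinded' field and the mark_rescinded() and notify_rescind() helpers
are hypothetical, and sc_rescind_callback is only the name proposed
above, not an existing member:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for struct vmbus_channel. */
struct vmbus_channel {
	/* Existing member in the real struct. */
	void (*sc_creation_callback)(struct vmbus_channel *new_sc);
	/* Proposed member: one callback per event, no event enum. */
	void (*sc_rescind_callback)(struct vmbus_channel *channel);
	/* Hypothetical field so the example has an observable effect. */
	int rescinded;
};

/* Example driver-side handler. */
static void mark_rescinded(struct vmbus_channel *channel)
{
	channel->rescinded = 1;
}

/* Returns 1 if a driver callback handled the rescind, 0 if the caller
 * should fall back to the default unregister path. */
static int notify_rescind(struct vmbus_channel *channel)
{
	if (!channel->sc_rescind_callback)
		return 0;

	channel->sc_rescind_callback(channel);
	return 1;
}
```

Drivers that do not care about the event leave the pointer NULL and see
no behaviour change.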

> +
> + /*
>* The spinlock to protect the structure. It is being used to protect
>* test-and-set access to various attributes of the structure as well
>* as all sc_list operations.
> @@ -825,6 +838,10 @@ int vmbus_request_offers(void);
>  void vmbus_set_sc_create_callback(struct vmbus_channel *primary_channel,
>   void (*sc_cr_cb)(struct vmbus_channel *new_sc));
>
> +void vmbus_set_hvsock_event_callback(struct vmbus_channel *channel,
> + void (*hvsock_event_callback)(struct vmbus_channel *,
> +   enum hvsock_event));
> +
>  /*
>   * Retrieve the (sub) channel on which to send an outgoing request.
>   * When a primary channel has multiple sub-channels, we choose a

-- 
  Vitaly


Re: [PATCH net-next V3 4/4] net/mlx5e: Add PTP Hardware Clock (PHC) support

2016-01-05 Thread Richard Cochran
On Tue, Dec 29, 2015 at 02:58:32PM +0200, Saeed Mahameed wrote:
> From: Eran Ben Elisha 
> 
> Add a PHC support to the mlx5_en driver. Use reader/writer spinlocks to
> protect the timecounter since every packet received needs to call
> timecounter_cycle2time() when timestamping is enabled.  This can become
> a performance bottleneck with RSS and multiple receive queues if normal
> spinlocks are used.
> 
> The driver has been tested with both Documentation/ptp/testptp and the
> linuxptp project (http://linuxptp.sourceforge.net/) on a Mellanox
> ConnectX-4 card.
> 
> Signed-off-by: Eran Ben Elisha 
> Cc: Richard Cochran 
> Signed-off-by: Saeed Mahameed 

Acked-by: Richard Cochran 


Re: [PATCH net-next V3 0/4] Introduce mlx5 ethernet timestamping

2016-01-05 Thread Richard Cochran
On Mon, Jan 04, 2016 at 04:47:03PM -0500, David Miller wrote:
> Richard, please review this series.

It looks fine to me now, and I acked the timestamping/phc bits.

Thanks,
Richard


Re: [PATCH v4 net-next 2/3] net: macb: Add NPx macb config using USRIO_DISABLED cap

2016-01-05 Thread Nicolas Ferre
On 05/01/2016 13:20, Neil Armstrong wrote:
> On 01/04/2016 11:38 AM, Nicolas Ferre wrote:
>> On 04/01/2016 10:42, Neil Armstrong wrote:
>>>  static const struct macb_config zynqmp_config = {
>>> .caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_JUMBO,
>>> @@ -2801,6 +2806,7 @@ static const struct of_device_id macb_dt_ids[] = {
>>> { .compatible = "cdns,at32ap7000-macb" },
>>> { .compatible = "cdns,at91sam9260-macb", .data = &at91sam9260_config },
>>> { .compatible = "cdns,macb" },
>>> +   { .compatible = "cdns,npx-macb", .data = &npx_config },
>>
>> I can accept that, but I think that you'd better make your device tree
>> compatibility string *not* generic. Name it by the first NPx SoC or
>> perfectly compatible SoC family that has this configuration and you'll
>> be able to make the NP(x+1) compatible with it.
> Well, the first Soc having this configuration is Np4, would cdns,np4-macb be 
> ok ?

Yes, absolutely.

Thanks

>> It has proven to be much more future proof and even if in the early days
>> of DT on ARM we accepted some binding with generic strings like this one
>> below, It has proven to be a mistake.
>>
>>> { .compatible = "cdns,gem", .data = &pc302gem_config },
>>> { .compatible = "atmel,sama5d2-gem", .data = &sama5d2_config },
>>>
>>
>>
> 
> Neil
> 


-- 
Nicolas Ferre


[PATCH v5 net-next 1/3] net: ethernet: cadence-macb: Add disabled usrio caps

2016-01-05 Thread Neil Armstrong
On some platforms, the macb integration does not use the USRIO
register to configure the (R)MII port and clocks.
When the register is not implemented and the MACB error signal
is connected to the bus error, reading or writing to the USRIO
register can trigger some Imprecise External Aborts on ARM platforms.

Signed-off-by: Neil Armstrong 
---
 drivers/net/ethernet/cadence/macb.c | 27 +++
 drivers/net/ethernet/cadence/macb.h |  1 +
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c 
b/drivers/net/ethernet/cadence/macb.c
index 8b45bc9..fa53bc3 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2124,7 +2124,8 @@ static void macb_get_regs(struct net_device *dev, struct 
ethtool_regs *regs,
regs_buff[10] = macb_tx_dma(&bp->queues[0], tail);
regs_buff[11] = macb_tx_dma(&bp->queues[0], head);
 
-   regs_buff[12] = macb_or_gem_readl(bp, USRIO);
+   if (!(bp->caps & MACB_CAPS_USRIO_DISABLED))
+   regs_buff[12] = macb_or_gem_readl(bp, USRIO);
if (macb_is_gem(bp)) {
regs_buff[13] = gem_readl(bp, DMACFG);
}
@@ -2403,19 +2404,21 @@ static int macb_init(struct platform_device *pdev)
dev->hw_features &= ~NETIF_F_SG;
dev->features = dev->hw_features;
 
-   val = 0;
-   if (bp->phy_interface == PHY_INTERFACE_MODE_RGMII)
-   val = GEM_BIT(RGMII);
-   else if (bp->phy_interface == PHY_INTERFACE_MODE_RMII &&
-(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
-   val = MACB_BIT(RMII);
-   else if (!(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
-   val = MACB_BIT(MII);
+   if (!(bp->caps & MACB_CAPS_USRIO_DISABLED)) {
+   val = 0;
+   if (bp->phy_interface == PHY_INTERFACE_MODE_RGMII)
+   val = GEM_BIT(RGMII);
+   else if (bp->phy_interface == PHY_INTERFACE_MODE_RMII &&
+(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
+   val = MACB_BIT(RMII);
+   else if (!(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
+   val = MACB_BIT(MII);
 
-   if (bp->caps & MACB_CAPS_USRIO_HAS_CLKEN)
-   val |= MACB_BIT(CLKEN);
+   if (bp->caps & MACB_CAPS_USRIO_HAS_CLKEN)
+   val |= MACB_BIT(CLKEN);
 
-   macb_or_gem_writel(bp, USRIO, val);
+   macb_or_gem_writel(bp, USRIO, val);
+   }
 
/* Set MII management clock divider */
val = macb_mdc_clk_div(bp);
diff --git a/drivers/net/ethernet/cadence/macb.h 
b/drivers/net/ethernet/cadence/macb.h
index 5c03e81..0d4ecfc 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -400,6 +400,7 @@
 #define MACB_CAPS_USRIO_HAS_CLKEN  0x0002
 #define MACB_CAPS_USRIO_DEFAULT_IS_MII 0x0004
 #define MACB_CAPS_NO_GIGABIT_HALF  0x0008
+#define MACB_CAPS_USRIO_DISABLED   0x0010
 #define MACB_CAPS_FIFO_MODE                0x1000
 #define MACB_CAPS_GIGABIT_MODE_AVAILABLE   0x2000
 #define MACB_CAPS_SG_DISABLED              0x4000
-- 
1.9.1



[PATCH v5 net-next 0/3] Add new capability and macb DT variant

2016-01-05 Thread Neil Armstrong
The first patch introduces a new capability bit to disable usage of the
USRIO register on platforms that do not implement it, thus avoiding
imprecise external aborts on ARM-based platforms.
The last two patches add a new macb variant compatible name using this
capability; the NP4 SoC uses this particular hardware configuration.

v1: 
http://lkml.kernel.org/r/1449485914-12883-1-git-send-email-narmstr...@baylibre.com
v2: 
http://lkml.kernel.org/r/1449582726-6148-1-git-send-email-narmstr...@baylibre.com
v3: 
http://lkml.kernel.org/r/1451898103-21868-1-git-send-email-narmstr...@baylibre.com
v4: 
http://lkml.kernel.org/r/1451900573-22657-1-git-send-email-narmstr...@baylibre.com
v5: switch SoC name to non-generic NP4 name

Neil Armstrong (3):
  net: ethernet: cadence-macb: Add disabled usrio caps
  net: macb: Add NP4 macb config using USRIO_DISABLED
  dt-bindings: net: macb: Add NP4 macb variant

 Documentation/devicetree/bindings/net/macb.txt |  1 +
 drivers/net/ethernet/cadence/macb.c| 33 --
 drivers/net/ethernet/cadence/macb.h|  1 +
 3 files changed, 23 insertions(+), 12 deletions(-)

-- 
1.9.1



[PATCH v5 net-next 2/3] net: macb: Add NP4 macb config using USRIO_DISABLED

2016-01-05 Thread Neil Armstrong
Declare a new NP4 SoC variant having USRIO_DISABLED as capability bit.

Signed-off-by: Neil Armstrong 
---
 drivers/net/ethernet/cadence/macb.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.c 
b/drivers/net/ethernet/cadence/macb.c
index fa53bc3..d12ee07 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -2781,6 +2781,11 @@ static const struct macb_config emac_config = {
.init = at91ether_init,
 };
 
+static const struct macb_config np4_config = {
+   .caps = MACB_CAPS_USRIO_DISABLED,
+   .clk_init = macb_clk_init,
+   .init = macb_init,
+};
 
 static const struct macb_config zynqmp_config = {
.caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_JUMBO,
@@ -2801,6 +2806,7 @@ static const struct of_device_id macb_dt_ids[] = {
{ .compatible = "cdns,at32ap7000-macb" },
{ .compatible = "cdns,at91sam9260-macb", .data = &at91sam9260_config },
{ .compatible = "cdns,macb" },
+   { .compatible = "cdns,np4-macb", .data = &np4_config },
{ .compatible = "cdns,pc302-gem", .data = &pc302gem_config },
{ .compatible = "cdns,gem", .data = &pc302gem_config },
{ .compatible = "atmel,sama5d2-gem", .data = &sama5d2_config },
-- 
1.9.1



[PATCH v5 net-next 3/3] dt-bindings: net: macb: Add NP4 macb variant

2016-01-05 Thread Neil Armstrong
Add NP4 macb SoC variant.

Signed-off-by: Neil Armstrong 
---
 Documentation/devicetree/bindings/net/macb.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/net/macb.txt 
b/Documentation/devicetree/bindings/net/macb.txt
index 38c8e84..5c397ca 100644
--- a/Documentation/devicetree/bindings/net/macb.txt
+++ b/Documentation/devicetree/bindings/net/macb.txt
@@ -4,6 +4,7 @@ Required properties:
 - compatible: Should be "cdns,[-]{macb|gem}"
   Use "cdns,at91sam9260-macb" for Atmel at91sam9 SoCs or the 10/100Mbit IP
   available on sama5d3 SoCs.
+  Use "cdns,np4-macb" for NP4 SoC devices.
   Use "cdns,at32ap7000-macb" for other 10/100 usage or use the generic form: 
"cdns,macb".
   Use "cdns,pc302-gem" for Picochip picoXcell pc302 and later devices based on
   the Cadence GEM, or the generic form: "cdns,gem".
-- 
1.9.1



Re: [RFC PATCH net-next 03/24] phy: Use phy_read() instead of mdiobus_read()

2016-01-05 Thread Andrew Lunn
On Mon, Jan 04, 2016 at 12:07:43PM -0800, Florian Fainelli wrote:
> On 04/01/16 09:36, Andrew Lunn wrote:
> > Since we have a phydev, make use of it and the phy_read() function.
> > This will help with later refactoring.
> > 
> > Signed-off-by: Andrew Lunn 
> > ---
> 
> [snip]
> 
> 
> > diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> > index 47cd306dbb3c..67a77956ae6f 100644
> > --- a/drivers/net/phy/phy.c
> > +++ b/drivers/net/phy/phy.c
> > @@ -407,8 +407,7 @@ int phy_mii_ioctl(struct phy_device *phydev, struct 
> > ifreq *ifr, int cmd)
> > /* fall through */
> >  
> > case SIOCGMIIREG:
> > -   mii_data->val_out = mdiobus_read(phydev->bus, mii_data->phy_id,
> > -mii_data->reg_num);
> > +   mii_data->val_out = phy_read(phydev, mii_data->reg_num);
> > return 0;
> 
> Do we have any guarantee that users of this interface do a prior
> SIOCGMIIPHY ioctl() to select the PHY address? If not, then this change
> forces it. Arguably, you are current allowed to issue MII reads/writes
> using a PHY device that can be different from the intent, that does not
> sound like a robust interface...

This and the following hunk are wrong, and I will drop them. You are
supposed to be able to read/write any address on the bus, and I
removed this ability.
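
To spell out the difference, here is a small userspace mock. The types
and the register array are stand-ins invented for illustration; only the
call shapes mirror the kernel's mdiobus_read(bus, addr, regnum) and
phy_read(phydev, regnum):

```c
#include <assert.h>
#include <stdint.h>

#define PHY_MAX_ADDR	32
#define NUM_REGS	32

/* Mock MDIO bus: one register file per bus address (stand-in only). */
struct mii_bus {
	uint16_t regs[PHY_MAX_ADDR][NUM_REGS];
};

/* Mock PHY device bound to a single address on the bus. */
struct phy_device {
	struct mii_bus *bus;
	int addr;
};

/* Bus-level read: the caller chooses any address on the bus. */
static int mdiobus_read(struct mii_bus *bus, int addr, uint32_t regnum)
{
	return bus->regs[addr][regnum];
}

/* Device-level read: always uses the address the phydev is bound to,
 * so a caller-supplied phy_id would be silently ignored. */
static int phy_read(struct phy_device *phydev, uint32_t regnum)
{
	return mdiobus_read(phydev->bus, phydev->addr, regnum);
}
```

With SIOCGMIIREG the user passes mii_data->phy_id, so only the bus-level
call can honour a request for an address other than phydev's own.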

Thanks for pointing this out.

   Andrew


Re: [PATCH v5 net-next 1/3] net: ethernet: cadence-macb: Add disabled usrio caps

2016-01-05 Thread Nicolas Ferre
On 05/01/2016 14:39, Neil Armstrong wrote:
> On some platforms, the macb integration does not use the USRIO
> register to configure the (R)MII port and clocks.
> When the register is not implemented and the MACB error signal
> is connected to the bus error, reading or writing to the USRIO
> register can trigger some Imprecise External Aborts on ARM platforms.
> 
> Signed-off-by: Neil Armstrong 

Acked-by: Nicolas Ferre 

Thanks!

> ---
>  drivers/net/ethernet/cadence/macb.c | 27 +++
>  drivers/net/ethernet/cadence/macb.h |  1 +
>  2 files changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c 
> b/drivers/net/ethernet/cadence/macb.c
> index 8b45bc9..fa53bc3 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -2124,7 +2124,8 @@ static void macb_get_regs(struct net_device *dev, 
> struct ethtool_regs *regs,
>   regs_buff[10] = macb_tx_dma(&bp->queues[0], tail);
>   regs_buff[11] = macb_tx_dma(&bp->queues[0], head);
>  
> - regs_buff[12] = macb_or_gem_readl(bp, USRIO);
> + if (!(bp->caps & MACB_CAPS_USRIO_DISABLED))
> + regs_buff[12] = macb_or_gem_readl(bp, USRIO);
>   if (macb_is_gem(bp)) {
>   regs_buff[13] = gem_readl(bp, DMACFG);
>   }
> @@ -2403,19 +2404,21 @@ static int macb_init(struct platform_device *pdev)
>   dev->hw_features &= ~NETIF_F_SG;
>   dev->features = dev->hw_features;
>  
> - val = 0;
> - if (bp->phy_interface == PHY_INTERFACE_MODE_RGMII)
> - val = GEM_BIT(RGMII);
> - else if (bp->phy_interface == PHY_INTERFACE_MODE_RMII &&
> -  (bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
> - val = MACB_BIT(RMII);
> - else if (!(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
> - val = MACB_BIT(MII);
> + if (!(bp->caps & MACB_CAPS_USRIO_DISABLED)) {
> + val = 0;
> + if (bp->phy_interface == PHY_INTERFACE_MODE_RGMII)
> + val = GEM_BIT(RGMII);
> + else if (bp->phy_interface == PHY_INTERFACE_MODE_RMII &&
> +  (bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
> + val = MACB_BIT(RMII);
> + else if (!(bp->caps & MACB_CAPS_USRIO_DEFAULT_IS_MII))
> + val = MACB_BIT(MII);
>  
> - if (bp->caps & MACB_CAPS_USRIO_HAS_CLKEN)
> - val |= MACB_BIT(CLKEN);
> + if (bp->caps & MACB_CAPS_USRIO_HAS_CLKEN)
> + val |= MACB_BIT(CLKEN);
>  
> - macb_or_gem_writel(bp, USRIO, val);
> + macb_or_gem_writel(bp, USRIO, val);
> + }
>  
>   /* Set MII management clock divider */
>   val = macb_mdc_clk_div(bp);
> diff --git a/drivers/net/ethernet/cadence/macb.h 
> b/drivers/net/ethernet/cadence/macb.h
> index 5c03e81..0d4ecfc 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -400,6 +400,7 @@
>  #define MACB_CAPS_USRIO_HAS_CLKEN        0x0002
>  #define MACB_CAPS_USRIO_DEFAULT_IS_MII   0x0004
>  #define MACB_CAPS_NO_GIGABIT_HALF        0x0008
> +#define MACB_CAPS_USRIO_DISABLED         0x0010
>  #define MACB_CAPS_FIFO_MODE              0x1000
>  #define MACB_CAPS_GIGABIT_MODE_AVAILABLE 0x2000
>  #define MACB_CAPS_SG_DISABLED            0x4000
> 


-- 
Nicolas Ferre


Re: [PATCH v5 net-next 2/3] net: macb: Add NP4 macb config using USRIO_DISABLED

2016-01-05 Thread Nicolas Ferre
On 05/01/2016 14:39, Neil Armstrong wrote:
> Declare a new NP4 SoC variant having USRIO_DISABLED as capability bit.
> 
> Signed-off-by: Neil Armstrong 

Acked-by: Nicolas Ferre 


> ---
>  drivers/net/ethernet/cadence/macb.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c 
> b/drivers/net/ethernet/cadence/macb.c
> index fa53bc3..d12ee07 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -2781,6 +2781,11 @@ static const struct macb_config emac_config = {
>   .init = at91ether_init,
>  };
>  
> +static const struct macb_config np4_config = {
> + .caps = MACB_CAPS_USRIO_DISABLED,
> + .clk_init = macb_clk_init,
> + .init = macb_init,
> +};
>  
>  static const struct macb_config zynqmp_config = {
>   .caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_JUMBO,
> @@ -2801,6 +2806,7 @@ static const struct of_device_id macb_dt_ids[] = {
>   { .compatible = "cdns,at32ap7000-macb" },
>   { .compatible = "cdns,at91sam9260-macb", .data = &at91sam9260_config },
>   { .compatible = "cdns,macb" },
> + { .compatible = "cdns,np4-macb", .data = &np4_config },
>   { .compatible = "cdns,pc302-gem", .data = &pc302gem_config },
>   { .compatible = "cdns,gem", .data = &pc302gem_config },
>   { .compatible = "atmel,sama5d2-gem", .data = &sama5d2_config },
> 


-- 
Nicolas Ferre


Re: [PATCH v5 net-next 3/3] dt-bindings: net: macb: Add NP4 macb variant

2016-01-05 Thread Nicolas Ferre
On 05/01/2016 14:39, Neil Armstrong wrote:
> Add NP4 macb SoC variant.
> 
> Signed-off-by: Neil Armstrong 

Acked-by: Nicolas Ferre 

Neil, thanks for your understanding and responsiveness concerning this
patch series.

Bye,

> ---
>  Documentation/devicetree/bindings/net/macb.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/macb.txt 
> b/Documentation/devicetree/bindings/net/macb.txt
> index 38c8e84..5c397ca 100644
> --- a/Documentation/devicetree/bindings/net/macb.txt
> +++ b/Documentation/devicetree/bindings/net/macb.txt
> @@ -4,6 +4,7 @@ Required properties:
>  - compatible: Should be "cdns,[-]{macb|gem}"
>Use "cdns,at91sam9260-macb" for Atmel at91sam9 SoCs or the 10/100Mbit IP
>available on sama5d3 SoCs.
> +  Use "cdns,np4-macb" for NP4 SoC devices.
>Use "cdns,at32ap7000-macb" for other 10/100 usage or use the generic form: 
> "cdns,macb".
>Use "cdns,pc302-gem" for Picochip picoXcell pc302 and later devices based 
> on
>the Cadence GEM, or the generic form: "cdns,gem".
> 


-- 
Nicolas Ferre


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
> > On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
> >> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  
> >> wrote:
> >> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
> >> >  wrote:
> >> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  
> >> >> wrote:
> >> >>> How often can you trigger this bug ?
> >> >>
> >> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen it 
> >> >> just a
> >> >> few times when bringing up/down network interfaces. Does the trace
> >> >> give any clue?
> >> >>
> >> >
> >> > A little bit. You need to help people to narrow down the problem
> >> > because there are too many places using skb->next and skb->prev.
> >> >
> >> > Since you mentioned it seems related to network interface flip,
> >> > what network interfaces are you using? What's is your TC setup?
> >> >
> >> > Thanks.
> >>
> >> The system contains only one physical network interface (TI WL1837,
> >> wl18xx module).
> >> The state prior to the crash was as follows:
> >> - One virtual network interface active (as STA, associated with access 
> >> point)
> >> - Bluetooth (BLE only) active (same physical chip, co-existence,
> >> btwilink/st_drv modules)
> >>
> >> Actions made around the time of the crash:
> >> - Bluetooth disabled
> >> - One additional virtual network interface brought up (also as STA)
> >>
> >> I believe the crash occurred between these two actions. I just saw
> >> that there are some interesting events in the log prior to the crash:
> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
> >> kernel: (stc):  proto stack 4's ->recv failed
> >> kernel: (stc): remove_channel_from_table: id 3
> >> kernel: (stc): remove_channel_from_table: id 2
> >> kernel: (stc): remove_channel_from_table: id 4
> >> kernel: (stc):  all chnl_ids unregistered
> >> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
> >>
> >> The first print is from btwilink.c. However, I can't see the
> >> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
> >> 6LoWPAN or anything similar).
> >>
> >> Thanks, Jacob
> >
> > Definitely these details are useful ;)
> >
> > Could you try :
> >
> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> > index 6e3af8b42cdd..0c99a74fb895 100644
> > --- a/drivers/misc/ti-st/st_core.c
> > +++ b/drivers/misc/ti-st/st_core.c
> > @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
> > skb_queue_purge(&st_gdata->txq);
> > skb_queue_purge(&st_gdata->tx_waitq);
> > kfree_skb(st_gdata->rx_skb);
> > +   st_gdata->rx_skb = NULL;
> > kfree_skb(st_gdata->tx_skb);
> > +   st_gdata->tx_skb = NULL;
> > /* TTY ldisc cleanup */
> > err = tty_unregister_ldisc(N_TI_WL);
> > if (err)
> >
> >
> 
> Sure. Since I don't have a good way to trigger the initial issue, I
> can't really know if there is a difference with your patch. However,
> normal usage seems to work as expected with your patch. I've tried to
> reproduce the initial issue with and without your patch repeatedly for
> hours and have not seen any crash in any of the runs so far.
> --

You might build a kernel with KASAN support to have a better chance of
triggering the bug.

( https://www.kernel.org/doc/Documentation/kasan.txt )
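
For reference, the idea behind the quoted st_core_exit() change can be
sketched in userspace as below. Everything here is a mock: struct
sk_buff and struct st_data_s are cut-down stand-ins, kfree_skb() is a
local imitation that only shares the "NULL is a no-op" property with the
kernel function, and st_release_skbs() and free_count are hypothetical
names:

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down stand-ins for the kernel types. */
struct sk_buff {
	int len;
};

struct st_data_s {
	struct sk_buff *rx_skb;
	struct sk_buff *tx_skb;
};

/* Counts real frees so the effect of the idiom is observable. */
static int free_count;

/* Mock of kfree_skb(): as in the kernel, passing NULL does nothing. */
static void kfree_skb(struct sk_buff *skb)
{
	if (!skb)
		return;
	free_count++;
	free(skb);
}

/* Hypothetical helper: free both buffers and clear the pointers, so a
 * second call (or a later user of st) cannot touch freed memory. */
static void st_release_skbs(struct st_data_s *st)
{
	kfree_skb(st->rx_skb);
	st->rx_skb = NULL;
	kfree_skb(st->tx_skb);
	st->tx_skb = NULL;
}
```

Without the two NULL assignments, a second pass through the cleanup path
would free the same buffers again, which is the kind of use-after-free
KASAN is designed to report.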





Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Jacob Siverskog
On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  wrote:
> On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
>> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
>> > On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>> >> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  
>> >> wrote:
>> >> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>> >> >  wrote:
>> >> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  
>> >> >> wrote:
>> >> >>> How often can you trigger this bug ?
>> >> >>
>> >> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen 
>> >> >> it just a
>> >> >> few times when bringing up/down network interfaces. Does the trace
>> >> >> give any clue?
>> >> >>
>> >> >
>> >> > A little bit. You need to help people to narrow down the problem
>> >> > because there are too many places using skb->next and skb->prev.
>> >> >
>> >> > Since you mentioned it seems related to network interface flip,
>> >> > what network interfaces are you using? What is your TC setup?
>> >> >
>> >> > Thanks.
>> >>
>> >> The system contains only one physical network interface (TI WL1837,
>> >> wl18xx module).
>> >> The state prior to the crash was as follows:
>> >> - One virtual network interface active (as STA, associated with access point)
>> >> - Bluetooth (BLE only) active (same physical chip, co-existence,
>> >> btwilink/st_drv modules)
>> >>
>> >> Actions made around the time of the crash:
>> >> - Bluetooth disabled
>> >> - One additional virtual network interface brought up (also as STA)
>> >>
>> >> I believe the crash occurred between these two actions. I just saw
>> >> that there are some interesting events in the log prior to the crash:
>> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>> >> kernel: (stc):  proto stack 4's ->recv failed
>> >> kernel: (stc): remove_channel_from_table: id 3
>> >> kernel: (stc): remove_channel_from_table: id 2
>> >> kernel: (stc): remove_channel_from_table: id 4
>> >> kernel: (stc):  all chnl_ids unregistered
>> >> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>> >>
>> >> The first print is from btwilink.c. However, I can't see the
>> >> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>> >> 6LoWPAN or anything similar).
>> >>
>> >> Thanks, Jacob
>> >
>> > Definitely these details are useful ;)
>> >
>> > Could you try :
>> >
>> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
>> > index 6e3af8b42cdd..0c99a74fb895 100644
>> > --- a/drivers/misc/ti-st/st_core.c
>> > +++ b/drivers/misc/ti-st/st_core.c
>> > @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
>> > skb_queue_purge(&st_gdata->txq);
>> > skb_queue_purge(&st_gdata->tx_waitq);
>> > kfree_skb(st_gdata->rx_skb);
>> > +   st_gdata->rx_skb = NULL;
>> > kfree_skb(st_gdata->tx_skb);
>> > +   st_gdata->tx_skb = NULL;
>> > /* TTY ldisc cleanup */
>> > err = tty_unregister_ldisc(N_TI_WL);
>> > if (err)
>> >
>> >
>>
>> Sure. Since I don't have a good way to trigger the initial issue, I
>> can't really know if there is a difference with your patch. However,
>> normal usage seems to work as expected with your patch. I've tried to
>> reproduce the initial issue with and without your patch repeatedly for
>> hours and have not seen any crash in any of the runs so far.
>> --
>
> You might build a kernel with KASAN support to get maybe more chances to
> trigger the bug.
>
> ( https://www.kernel.org/doc/Documentation/kasan.txt )
>

Ah. Doesn't seem to be supported on arm(32) unfortunately.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 15:34 +0100, Jacob Siverskog wrote:
> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  wrote:

> >
> > You might build a kernel with KASAN support to get maybe more chances to
> > trigger the bug.
> >
> > ( https://www.kernel.org/doc/Documentation/kasan.txt )
> >
> 
> Ah. Doesn't seem to be supported on arm(32) unfortunately.

Then you could at least use standard debugging features :

CONFIG_SLAB=y
CONFIG_SLABINFO=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y

(Or equivalent SLUB options)

and

CONFIG_DEBUG_PAGEALLOC=y

(If arm(32) has CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y)





[PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Rabin Vincent
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value.  All the BPF JITs fail to clear A if this is used as
the first instruction in a filter.  This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs.  Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent 
---
 arch/arm/net/bpf_jit_32.c   | 16 +---
 arch/mips/net/bpf_jit.c | 16 +---
 arch/powerpc/net/bpf_jit_comp.c | 13 ++---
 arch/sparc/net/bpf_jit_comp.c   | 17 ++---
 include/linux/filter.h  | 19 +++
 5 files changed, 25 insertions(+), 56 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 591f9db3bf40..e153eb065fe4 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -187,19 +187,6 @@ static inline int mem_words_used(struct jit_ctx *ctx)
return fls(ctx->seen & SEEN_MEM);
 }
 
-static inline bool is_load_to_a(u16 inst)
-{
-   switch (inst) {
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   case BPF_LD | BPF_H | BPF_ABS:
-   case BPF_LD | BPF_B | BPF_ABS:
-   return true;
-   default:
-   return false;
-   }
-}
-
 static void jit_fill_hole(void *area, unsigned int size)
 {
u32 *ptr;
@@ -211,7 +198,6 @@ static void jit_fill_hole(void *area, unsigned int size)
 static void build_prologue(struct jit_ctx *ctx)
 {
u16 reg_set = saved_regs(ctx);
-   u16 first_inst = ctx->skf->insns[0].code;
u16 off;
 
 #ifdef CONFIG_FRAME_POINTER
@@ -241,7 +227,7 @@ static void build_prologue(struct jit_ctx *ctx)
emit(ARM_MOV_I(r_X, 0), ctx);
 
/* do not leak kernel data to userspace */
-   if ((first_inst != (BPF_RET | BPF_K)) && !(is_load_to_a(first_inst)))
+   if (bpf_needs_clear_a(&ctx->skf->insns[0]))
emit(ARM_MOV_I(r_A, 0), ctx);
 
/* stack space for the BPF_MEM words */
diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c
index 77cb27309db2..1a8c96035716 100644
--- a/arch/mips/net/bpf_jit.c
+++ b/arch/mips/net/bpf_jit.c
@@ -521,19 +521,6 @@ static inline u16 align_sp(unsigned int num)
return num;
 }
 
-static bool is_load_to_a(u16 inst)
-{
-   switch (inst) {
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   case BPF_LD | BPF_H | BPF_ABS:
-   case BPF_LD | BPF_B | BPF_ABS:
-   return true;
-   default:
-   return false;
-   }
-}
-
 static void save_bpf_jit_regs(struct jit_ctx *ctx, unsigned offset)
 {
int i = 0, real_off = 0;
@@ -614,7 +601,6 @@ static unsigned int get_stack_depth(struct jit_ctx *ctx)
 
 static void build_prologue(struct jit_ctx *ctx)
 {
-   u16 first_inst = ctx->skf->insns[0].code;
int sp_off;
 
/* Calculate the total offset for the stack pointer */
@@ -641,7 +627,7 @@ static void build_prologue(struct jit_ctx *ctx)
emit_jit_reg_move(r_X, r_zero, ctx);
 
/* Do not leak kernel data to userspace */
-   if ((first_inst != (BPF_RET | BPF_K)) && !(is_load_to_a(first_inst)))
+   if (bpf_needs_clear_a(&ctx->skf->insns[0]))
emit_jit_reg_move(r_A, r_zero, ctx);
 }
 
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 04782164ee67..2d66a8446198 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -78,18 +78,9 @@ static void bpf_jit_build_prologue(struct bpf_prog *fp, u32 *image,
PPC_LI(r_X, 0);
}
 
-   switch (filter[0].code) {
-   case BPF_RET | BPF_K:
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   case BPF_LD | BPF_H | BPF_ABS:
-   case BPF_LD | BPF_B | BPF_ABS:
-   /* first instruction sets A register (or is RET 'constant') */
-   break;
-   default:
-   /* make sure we dont leak kernel information to user */
+   /* make sure we dont leak kernel information to user */
+   if (bpf_needs_clear_a(&filter[0]))
PPC_LI(r_A, 0);
-   }
 }
 
 static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
index 22564f5f2364..3e6e05a7c4c2 100644
--- a/arch/sparc/net/bpf_jit_comp.c
+++ b/arch/sparc/net/bpf_jit_comp.c
@@ -420,22 +420,9 @@ void bpf_jit_compile(struct bpf_prog *fp)
}
emit_reg_move(O7, r_saved_O7);
 
-   switch (filter[0].code) {
-   case BPF_RET | BPF_K:
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   

Re: [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping

2016-01-05 Thread Richard Cochran
On Mon, Jan 04, 2016 at 04:45:22AM -0800, Christopher S. Hall wrote:
> @@ -138,6 +142,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>   caps.n_per_out = ptp->info->n_per_out;
>   caps.pps = ptp->info->pps;
>   caps.n_pins = ptp->info->n_pins;
> + caps.cross_timestamping = ptp->info->getsynctime != NULL;
>   if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
>   err = -EFAULT;
>   break;
> @@ -180,6 +185,32 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>   err = ops->enable(ops, &req, enable);
>   break;
>  
> + case PTP_SYS_OFFSET_PRECISE:
> + if (!ptp->info->getsynctime) {
> + err = -EINVAL;

-EOPNOTSUPP would be better here.

> + break;
> + }
> + err = ptp->info->getsynctime(ptp->info, &xtstamp);
> + if (err)
> + break;
> +
> + precise_offset.sys_real.sec =
> + div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
> + NSEC_PER_SEC, &rem);
> + precise_offset.sys_real.nsec = rem;

How about this instead:

ts = ktime_to_timespec64(xtstamp.sys_realtime);
precise_offset.sys_real.sec = ts.tv_sec;
precise_offset.sys_real.nsec = ts.tv_nsec;

> + precise_offset.sys_raw.sec =
> + div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
> + NSEC_PER_SEC, &rem);
> + precise_offset.sys_raw.nsec = rem;
> + precise_offset.dev.sec =
> + div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
> + &rem);
> + precise_offset.dev.nsec = rem;

And for these as well.
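
To illustrate the point of the suggestion, here is a minimal userspace
sketch of splitting a nanosecond count into seconds and nanoseconds with
one conversion helper instead of repeating the divide-with-remainder at
each call site. The type and function names (ts64_sketch,
ns_to_ts64_sketch) are hypothetical stand-ins and not the kernel's
ktime_to_timespec64() implementation; the sketch assumes the nanosecond
field should always end up in [0, 1e9), including for negative inputs.

```c
#include <stdint.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Hypothetical stand-in for a seconds/nanoseconds pair. */
struct ts64_sketch {
	int64_t sec;
	int64_t nsec;
};

/* Split a signed nanosecond count; nsec is normalised to [0, 1e9). */
static struct ts64_sketch ns_to_ts64_sketch(int64_t ns)
{
	struct ts64_sketch ts;

	ts.sec = ns / NSEC_PER_SEC_SKETCH;
	ts.nsec = ns % NSEC_PER_SEC_SKETCH;
	if (ts.nsec < 0) {
		/* C division truncates toward zero; borrow one second. */
		ts.sec--;
		ts.nsec += NSEC_PER_SEC_SKETCH;
	}
	return ts;
}
```

Each of the three assignments in the quoted hunk would then reduce to one
call plus two field copies.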

Thanks,
Richard


RE: [PATCH V5 4/9] Drivers: hv: ring_buffer: enhance hv_ringbuffer_read() to support hvsock

2016-01-05 Thread Dexuan Cui
> From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com]
> Sent: Tuesday, January 5, 2016 20:31
> ...
> > To get the payload of hvsock, we need raw=0 to skip the level-1 header
> > (i.e., struct vmpacket_descriptor desc) and we also need to skip the
> > level-2 header (i.e., struct vmpipe_proto_header pipe_hdr).
> >
> > NB: if the length of the hvsock payload is not aligned with the 8-byte
> > boundary, at most 7 padding bytes are appended, so the real hvsock
> > payload's length must be retrieved by the pipe_hdr.data_size field.
> >
> > I 'upgrade' the 'raw' parameter of hv_ringbuffer_read() to a
> > 'read_flags', trying to share the logic of the function.
> 
> When I was touching this code last time I was actually thinking about
> eliminating 'raw' flag by making all ring reads raw and moving this
> header filtering job to the upper layer (as we already have
> vmbus_recvpacket()/vmbus_recvpacket_raw()) but for some reason I didn't
> do it. I believe you have more or less the same reasoning for introducing
> new read type instead of parsing this at a higher level. Some comments
> below ...

I feel it's more convenient to do the parsing in the vmbus driver than in
every driver that uses the vmbus driver.

However, yes, I admit hv_ringbuffer_read() becomes less readable with
my introduction of 'read_flags'.

It may be a better idea to do the parsing at a higher level, i.e., in the
hvsock driver in my case.
It looks like I can avoid introducing vmbus_recvpacket_hvsock() and use
vmbus_recvpacket() directly in my hvsock driver.

Let me try to make a new patch this way.

> > This patch is required by the next patch, which will introduce the hvsock
> > send/recv APIs.
> >
> > ...
> > @@ -619,9 +619,20 @@ int hv_ringbuffer_write(struct hv_ring_buffer_info *ring_info,
> > struct kvec *kv_list,
> > u32 kv_count, bool *signal);
> >
> > +/*
> > + * By default, a read_flags of 0 means: the payload offset is
> > + * sizeof(struct vmpacket_descriptor).
> > + *
> > + * If HV_RINGBUFFER_READ_FLAG_RAW is used, the payload offset is 0.
> > + *
> > + * If HV_RINGBUFFER_READ_FLAG_HVSOCK is used, the payload offset is
> > + * sizeof(struct vmpacket_descriptor) + sizeof(struct vmpipe_proto_header).
> 
> So these are mutually exclusive, right? Should we introduce 'int
> payload_offset' parameter instead of flags?
Sorry for making the code less readable. :-)
As I mentioned above, let me try to do things in a better way.

> > @@ -415,17 +426,26 @@ int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
> > goto out_unlock;
> > }
> >
> > +   if (tot_hdrlen > buflen) {
> > +   ret = -ENOBUFS;
> > +   goto out_unlock;
> > +   }
> > +
> > +   desc = (struct vmpacket_descriptor *)buffer;
> > +
> > next_read_location = hv_get_next_read_location(inring_info);
> > -   next_read_location = hv_copyfrom_ringbuffer(inring_info, &desc,
> > -   sizeof(desc),
> > +   next_read_location = hv_copyfrom_ringbuffer(inring_info, desc,
> > +   tot_hdrlen,
> > next_read_location);
> > +   offset = 0;
> > +   if (!raw)
> > +   offset += (desc->offset8 << 3);
> > +   if (hvsock)
> > +   offset += sizeof(*pipe_hdr);
> 
> So in case of !raw and hvsock we add both offsets?
Yes...
 
Thanks for your review, Vitaly.

Thanks,
-- Dexuan


Re: [PATCH v4 net-next 3/4] soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF

2016-01-05 Thread Craig Gallek
On Tue, Jan 5, 2016 at 4:38 AM, Daniel Borkmann  wrote:
> On 01/04/2016 11:41 PM, Craig Gallek wrote:
>>
>> From: Craig Gallek 
>>
>> Expose socket options for setting a classic or extended BPF program
>> for use when selecting sockets in an SO_REUSEPORT group.  These options
>> can be used on the first socket to belong to a group before bind or
>> on any socket in the group after bind.
>>
>> This change includes refactoring of the existing sk_filter code to
>> allow reuse of the existing BPF filter validation checks.
>>
>> Signed-off-by: Craig Gallek 
>
> [...]
>>
>> +static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
>> +   struct bpf_prog *prog, struct sk_buff *skb,
>> +   int hdr_len)
>> +{
>> +   struct sk_buff *nskb = NULL;
>> +   u32 index;
>> +
>> +   if (skb_shared(skb)) {
>> +   nskb = skb_clone(skb, GFP_ATOMIC);
>> +   if (!nskb)
>> +   return NULL;
>> +   skb = nskb;
>> +   }
>> +
>> +   /* temporarily advance data past protocol header */
>> +   if (!pskb_pull(skb, hdr_len)) {
>> +   consume_skb(nskb);
>
>
> Btw, this one could still be made kfree_skb() to indicate error condition
> here.
Good point.  I'll send a follow-up.


[PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Craig Gallek
From: Craig Gallek 

Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Suggested-by: Daniel Borkmann 
Signed-off-by: Craig Gallek 
---
 net/core/sock_reuseport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
index ae0969c0fc2e..1df98c557440 100644
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -173,7 +173,7 @@ static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
 
/* temporarily advance data past protocol header */
if (!pskb_pull(skb, hdr_len)) {
-   consume_skb(nskb);
+   kfree_skb(nskb);
return NULL;
}
index = bpf_prog_run_save_cb(prog, skb);
-- 
2.6.0.rc2.230.g3dd15c0



Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> instructions since it XORs A with X while all the others replace A with
> some loaded value.  All the BPF JITs fail to clear A if this is used as
> the first instruction in a filter.

Is x86_64 part of this 'All' subset ? ;)

>   This was found using american fuzzy
> lop.
> 
> Add a helper to determine if A needs to be cleared given the first
> instruction in a filter, and use this in the JITs.  Except for ARM, the
> rest have only been compile-tested.
> 
> Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> Signed-off-by: Rabin Vincent 
> ---




Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Rabin Vincent
On Tue, Jan 05, 2016 at 08:00:45AM -0800, Eric Dumazet wrote:
> On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
> > The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> > instructions since it XORs A with X while all the others replace A with
> > some loaded value.  All the BPF JITs fail to clear A if this is used as
> > the first instruction in a filter.
> 
> Is x86_64 part of this 'All' subset ? ;)

No, because it's an eBPF JIT.


RE: [PATCH V5 7/9] Drivers: hv: vmbus: add a mechanism to pass hvsock events to the hvsock driver

2016-01-05 Thread Dexuan Cui
> From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com]
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> >
> > +/* hvsock related definitions */
> > +enum hvsock_event {
> > +   /* The host application is close()-ing the connection */
> > +   HVSOCK_RESCIND_CHANNEL,
> > +};
> > +
> >  struct vmbus_channel {
> > /* Unique channel id */
> > int id;
> > @@ -740,6 +746,13 @@ struct vmbus_channel {
> > void (*sc_creation_callback)(struct vmbus_channel *new_sc);
> >
> > /*
> > +* hvsock event callback.
> > +* For now only 1 event is defined: HVSOCK_RESCIND_CHANNEL.
> > +*/
> > +   void (*hvsock_event_callback)(struct vmbus_channel *channel,
> > + enum hvsock_event event);
> 
> Would it make sense to rename it to something more general,
> e.g. sc_rescind_callback and call it for all drivers (even if we don't
> need it now) instead of introducing enum hvsock_event? When new events
Your suggestion is good: channel->hvsock_event_callback != NULL implies
is_hvsock_channel(channel) is true.

> arrive we'll just add new callbacks (or, alternatively, we could unify
> it to 'channel_event_callback' and merging with sc_creation_callback()
> but I'd say it is uglier).

I'm OK to use the idea "when new events arrive we'll just add new callbacks".

Let me make a new patch.

Thanks,
-- Dexuan


Re: rsi: Delete unnecessary variable initialisations in rsi_send_mgmt_pkt()

2016-01-05 Thread SF Markus Elfring
> Every time you send a set of patches,

I suggested some updates for Linux source files since October 2014.


> there are legitimate issues which people raise,

There was usual feedback.


> and every time they are discussed,

The discussion results were mixed between acceptance
and usual disagreement.


> you assert that your patches improve things

I guess that should be the default intention of every patch, shouldn't it?


> and seem to ignore the concerns people raise.

I hope not. - But I can imagine that you might understand some responses
from contributors in this way.
Are you waiting for another clarification on a specific issue?


> I've seen this same pattern of discussion here with these patches,
> with your patches to move labels into if statements, with the patches
> you sent late June last year, your patches to remove conditions before
> kfree() and friends, etc.

It seems that communication difficulties come partly from the fact
that I chose search patterns from static source code analysis so far
which belong to an error category that gets a lower priority.


> You need to change your attitude: just because you can see some benefit
> from your patches doesn't mean others do and it doesn't mean that
> they're willing to accept them.

I understand your advice.

Further update suggestions with higher importance might follow for various
software areas in the future.

Regards,
Markus


RE: [PATCH V5 5/9] Drivers: hv: vmbus: add APIs to send/recv hvsock packets

2016-01-05 Thread Dexuan Cui
> From: Vitaly Kuznetsov [mailto:vkuzn...@redhat.com]
> Sent: Tuesday, January 5, 2016 20:39
> ...
> > +/*
> > + * vmbus_sendpacket_hvsock - Send the hvsock payload 'buf' of a length 'len'
> > + */
> > +int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, void *buf, u32 len)
> > ...
> > +
> > +   bufferlist[0].iov_base = &desc;
> > +   bufferlist[0].iov_len  = sizeof(struct vmpacket_descriptor);
> > +   bufferlist[1].iov_base = &pipe_hdr;
> > +   bufferlist[1].iov_len  = sizeof(struct vmpipe_proto_header);
> > +   bufferlist[2].iov_base = buf;
> > +   bufferlist[2].iov_len  = len;
> > +   bufferlist[3].iov_base = &aligned_data;
> > +   bufferlist[3].iov_len  = packetlen_aligned - packetlen;
> > +
> > +   ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 4, &signal);
> 
> Using ARRAY_SIZE(bufferlist) instead of 4 would allow us to keep this
> line untouched when we decide to add something (and compiler will
> optimize it to 4 anyway).

Thanks for the suggestion! I'll fix it.
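
For reference, a small userspace sketch of the two idioms discussed in
the hunk above: deriving the element count from the array itself, and
computing the 0..7 padding bytes needed to reach an 8-byte boundary. All
names here (ARRAY_SIZE_SKETCH, ALIGN8_SKETCH, seg_sketch, pad_to_8,
total_len) are made up for illustration; the kernel has its own
ARRAY_SIZE() and ALIGN() macros, and this is not the vmbus code.

```c
#include <stddef.h>
#include <stdint.h>

/* Element count of a true array (not a pointer). */
#define ARRAY_SIZE_SKETCH(a) (sizeof(a) / sizeof((a)[0]))

/* Round x up to the next multiple of 8. */
#define ALIGN8_SKETCH(x) (((x) + 7u) & ~(uint32_t)7u)

/* Hypothetical stand-in for one scatter/gather segment. */
struct seg_sketch {
	const void *base;
	uint32_t len;
};

/* Padding bytes needed after packetlen bytes: always 0..7. */
static uint32_t pad_to_8(uint32_t packetlen)
{
	return ALIGN8_SKETCH(packetlen) - packetlen;
}

/* Sum of all segment lengths; n would come from ARRAY_SIZE_SKETCH(). */
static uint32_t total_len(const struct seg_sketch *segs, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += segs[i].len;
	return sum;
}
```

With the count taken from the array, adding a fifth segment later would
not require touching the call that passes the count.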

-- Dexuan


Re: [PATCH stable-3.2 stable-3.12] net: fix checksum check in skb_copy_and_csum_datagram_iovec()

2016-01-05 Thread Jiri Slaby
On 12/28/2015, 03:01 PM, Michal Kubecek wrote:
> Recent fix "net: add length argument to
> skb_copy_and_csum_datagram_iovec" added to some pre-3.19 stable
> branches, namely
> 
>   stable-3.2.y: commit 127500d724f8
>   stable-3.12.y: commit 3e1ac3aafbd0

Applied this fix to 3.12. Thanks!

> doesn't handle truncated reads correctly. If read length is shorter than
> incoming datagram (but non-zero) and first segment of target iovec is
> sufficient for read length, skb_copy_and_csum_datagram() is used to
> checksum the data while copying it. For truncated reads this means only
> the copied part is checksummed (rather than the whole datagram) so that
> the check almost always fails.
> 
> Add checksum of the remaining part so that the proper checksum of the
> whole datagram is computed and checked. Special care must be taken if
> the copied length is odd.
> 
> For zero read length, we don't have to copy anything but we still should
> check the checksum so that a peek doesn't return with a datagram which
> is invalid and wouldn't be returned by an actual read.
> 
> Signed-off-by: Michal Kubecek 
> ---
>  net/core/datagram.c | 26 +-
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/datagram.c b/net/core/datagram.c
> index f22f120771ef..af4bf368257c 100644
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -809,13 +809,14 @@ int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb,
>int hlen, struct iovec *iov, int len)
>  {
>   __wsum csum;
> - int chunk = skb->len - hlen;
> + int full_chunk = skb->len - hlen;
> + int chunk = min_t(int, full_chunk, len);
>  
> - if (chunk > len)
> - chunk = len;
> -
> - if (!chunk)
> + if (!chunk) {
> + if (__skb_checksum_complete(skb))
> + goto csum_error;
>   return 0;
> + }
>  
>   /* Skip filled elements.
>* Pretty silly, look at memcpy_toiovec, though 8)
> @@ -833,6 +834,21 @@ int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb,
>   if (skb_copy_and_csum_datagram(skb, hlen, iov->iov_base,
>  chunk, &csum))
>   goto fault;
> + if (full_chunk > chunk) {
> + if (chunk % 2) {
> + __be16 odd = 0;
> +
> + if (skb_copy_bits(skb, hlen + chunk,
> +   (char *)&odd + 1, 1))
> + goto fault;
> + csum = add32_with_carry(odd, csum);
> + csum = skb_checksum(skb, hlen + chunk + 1,
> + full_chunk - chunk - 1,
> + csum);
> + } else
> + csum = skb_checksum(skb, hlen + chunk,
> + full_chunk - chunk, csum);
> + }
>   if (csum_fold(csum))
>   goto csum_error;
>   if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
> 
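
The "special care if the copied length is odd" part of the quoted patch
can be illustrated in plain userspace C. The sketch below is not the
kernel code: csum_partial_be, csum_fold16 and csum_split are hypothetical
helpers, the sum is a simple RFC 1071 style 16-bit ones' complement sum
over big-endian words, and the 32-bit accumulator is only assumed to be
large enough for datagram-sized buffers. It shows that when the first
chunk ends on an odd byte, the next byte has to be added as the low half
of the straddling word before summing the rest.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Add len bytes at p to sum as big-endian 16-bit words. A trailing odd
 * byte counts as the high byte of a zero-padded word.
 */
static uint32_t csum_partial_be(const uint8_t *p, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;
	return sum;
}

/* Fold the carries back in to get a 16-bit ones' complement sum. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Sum the first chunk bytes, then the remainder, as two separate passes. */
static uint16_t csum_split(const uint8_t *p, size_t len, size_t chunk)
{
	uint32_t sum = csum_partial_be(p, chunk, 0);

	if (chunk < len) {
		size_t rest = chunk;

		if (chunk & 1) {
			/* Low byte of the word split across the boundary. */
			sum += p[chunk];
			rest = chunk + 1;
		}
		sum = csum_partial_be(p + rest, len - rest, sum);
	}
	return csum_fold16(sum);
}
```

For any split point, csum_split() is meant to give the same folded sum as
a single pass over the whole buffer, which is the property the patch
needs in order to validate a truncated read.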


-- 
js
suse labs


Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Daniel Borkmann

On 01/05/2016 04:23 PM, Rabin Vincent wrote:
> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> instructions since it XORs A with X while all the others replace A with
> some loaded value.  All the BPF JITs fail to clear A if this is used as
> the first instruction in a filter.  This was found using american fuzzy
> lop.
>
> Add a helper to determine if A needs to be cleared given the first
> instruction in a filter, and use this in the JITs.  Except for ARM, the
> rest have only been compile-tested.
>
> Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> Signed-off-by: Rabin Vincent 


Excellent catch, thanks a lot! The fix looks good to me and should
go to -net tree.

Acked-by: Daniel Borkmann 

If you're interested, feel free to add a small test case for the
SKF_AD_ALU_XOR_X issue to lib/test_bpf.c for -net-next tree. Thanks!
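
As a rough illustration of the check being discussed, here is a
userspace sketch of a "does A need clearing" predicate. It is an
assumption about what such a helper could look like, not a copy of the
patch's include/linux/filter.h hunk; the function name
needs_clear_a_sketch is hypothetical. It only relies on the UAPI
definitions from <linux/filter.h> (struct sock_filter, the BPF_* opcode
macros, SKF_AD_OFF and SKF_AD_ALU_XOR_X).

```c
#include <stdbool.h>
#include <linux/filter.h>	/* struct sock_filter, BPF_*, SKF_AD_* */

/*
 * Sketch only: return true when a JIT prologue would have to zero A
 * because the first instruction may read A before writing it.
 */
static bool needs_clear_a_sketch(const struct sock_filter *first)
{
	switch (first->code) {
	case BPF_RET | BPF_K:
	case BPF_LD | BPF_W | BPF_LEN:
		/* Returns a constant or overwrites A; A is never read. */
		return false;
	case BPF_LD | BPF_W | BPF_ABS:
	case BPF_LD | BPF_H | BPF_ABS:
	case BPF_LD | BPF_B | BPF_ABS:
		/* Absolute loads overwrite A, except the XOR ancillary. */
		return first->k == (__u32)(SKF_AD_OFF + SKF_AD_ALU_XOR_X);
	default:
		return true;
	}
}
```

A test case along these lines would use a filter whose first instruction
is a BPF_LD | BPF_W | BPF_ABS load at SKF_AD_OFF + SKF_AD_ALU_XOR_X and
check that the result does not depend on stale register contents.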


Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Daniel Borkmann

On 01/05/2016 05:03 PM, Rabin Vincent wrote:
> On Tue, Jan 05, 2016 at 08:00:45AM -0800, Eric Dumazet wrote:
>> On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
>>> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
>>> instructions since it XORs A with X while all the others replace A with
>>> some loaded value.  All the BPF JITs fail to clear A if this is used as
>>> the first instruction in a filter.
>>
>> Is x86_64 part of this 'All' subset ? ;)
>
> No, because it's an eBPF JIT.


Correct, filter conversion to eBPF clears it already.


Re: [PATCH stable-3.2 stable-3.12] net: fix checksum check in skb_copy_and_csum_datagram_iovec()

2016-01-05 Thread Ben Hutchings
On Tue, 2016-01-05 at 17:36 +0100, Jiri Slaby wrote:
> On 12/28/2015, 03:01 PM, Michal Kubecek wrote:
> > Recent fix "net: add length argument to
> > skb_copy_and_csum_datagram_iovec" added to some pre-3.19 stable
> > branches, namely
> > 
> >   stable-3.2.y: commit 127500d724f8
> >   stable-3.12.y: commit 3e1ac3aafbd0
> 
> Applied this fix to 3.12. Thanks!
[...]

You don't want this, you want Eric's fix (commit 197c949e7, "udp:
properly support MSG_PEEK with truncated buffers") although that's not
upstream yet.

Ben.

-- 
Ben Hutchings
Tomorrow will be cancelled due to lack of interest.



Re: [PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Daniel Borkmann

On 01/05/2016 04:57 PM, Craig Gallek wrote:
> From: Craig Gallek 
>
> Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
> Suggested-by: Daniel Borkmann 
> Signed-off-by: Craig Gallek 


Thanks, Craig.

Acked-by: Daniel Borkmann 


Re: [PATCH stable-3.2 stable-3.12] net: fix checksum check in skb_copy_and_csum_datagram_iovec()

2016-01-05 Thread Jiri Slaby
On 01/05/2016, 05:40 PM, Ben Hutchings wrote:
> On Tue, 2016-01-05 at 17:36 +0100, Jiri Slaby wrote:
>> On 12/28/2015, 03:01 PM, Michal Kubecek wrote:
>>> Recent fix "net: add length argument to 
>>> skb_copy_and_csum_datagram_iovec" added to some pre-3.19
>>> stable branches, namely
>>> 
>>> stable-3.2.y: commit 127500d724f8 stable-3.12.y: commit
>>> 3e1ac3aafbd0
>> 
>> Applied this fix to 3.12. Thanks!
> [...]
> 
> You don't want this, you want Eric's fix (commit 197c949e7, "udp: 
> properly support MSG_PEEK with truncated buffers") although that's
> not upstream yet.

Dropped then. Thanks!


-- 
js
suse labs


Re: [PATCH net-next 1/2] Support outside netns for tunnels.

2016-01-05 Thread Nicolas Dichtel

On 04/01/2016 19:45, Saurabh Mohan wrote:
> This patch enhances a tunnel interface, like gre, to have the tunnel
> encap/decap be in the context of a network namespace that is different from
> the namespace of the tunnel interface.
>
> From userspace this feature may be configured using the new 'onetns' keyword:
> ip netns exec custa ip link add dev tun1 type gre local 10.0.0.1 \
>   remote 10.0.0.2 onetns outside
>
> In the above example the tunnel would be in the 'custa' namespace and the
> tunnel endpoints would be in the 'outside' namespace.

What is the difference with the following commands?

ip netns exec outside ip link add dev tun1 type gre local 10.0.0.1 \
   remote 10.0.0.2
ip netns exec outside ip link set tun1 netns custa

or

ip netns exec custa ip netns set outside 1234
ip netns exec custa ip link add tun1 link-netnsid 1234 type gre local 10.0.0.1 \
   remote 10.0.0.2


Regards,
Nicolas


Re: linux-next: manual merge of the rdma tree with the net-next tree

2016-01-05 Thread Or Gerlitz

On 1/5/2016 3:51 AM, Stephen Rothwell wrote:

Hi Doug,

Today's linux-next merge of the rdma tree got conflicts in:

   drivers/net/ethernet/mellanox/mlx5/core/vport.c
   include/linux/mlx5/mlx5_ifc.h
   include/linux/mlx5/vport.h

between commits:

   e1d7d349c69d ("net/mlx5: Update access functions to Query/Modify vport MAC address")
   e75465148b7d ("net/mlx5: Introduce access functions to modify/query vport state")

from the net-next tree and commit:

   e5f6175c5b66 ("net/mlx5_core: Break down the vport mac address query function")

from the rdma tree and maybe some others.

I have no hope of fixing this stuff up, so I have dropped the rdma tree
again for today.  There is similar functionality being introduced in
both trees ... please sort this mess out ...


Hi Stephen,

Saeed/Matan and Co are working here on a fix which would solve the conflict.

We have the basic patch and people will be testing it later/tomorrow.

Could you please advise how we could provide you with the
git rerere output -- the pre-image and post-image files from .git/rr-cache?

Or.




[PATCH] net: sched: fix missing free per cpu on qstats

2016-01-05 Thread John Fastabend
When a qdisc is using per cpu stats (currently just the ingress
qdisc) only the bstats are being freed. This also frees the qstats.

Signed-off-by: John Fastabend 
---
 net/sched/sch_generic.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index e82a1ad..16bc83b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -658,8 +658,10 @@ static void qdisc_rcu_free(struct rcu_head *head)
 {
        struct Qdisc *qdisc = container_of(head, struct Qdisc, rcu_head);
 
-       if (qdisc_is_percpu_stats(qdisc))
+       if (qdisc_is_percpu_stats(qdisc)) {
                free_percpu(qdisc->cpu_bstats);
+               free_percpu(qdisc->cpu_qstats);
+       }
 
        kfree((char *) qdisc - qdisc->padded);
 }
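
The leak this patch closes can be modelled in user space. The following is only a sketch under stated assumptions: `model_qdisc`, `counted_alloc()` and `counted_free()` are hypothetical stand-ins (they are not kernel APIs), and plain heap allocations replace the per-cpu allocations. It shows that an object owning two separately allocated stats blocks has to release both before releasing itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Number of allocations currently outstanding (hypothetical leak counter). */
static int live_allocs;

static void *counted_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void counted_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Hypothetical stand-in for a qdisc that owns two stats blocks. */
struct model_qdisc {
	bool percpu_stats;
	void *cpu_bstats;
	void *cpu_qstats;
};

static void model_qdisc_free(struct model_qdisc *q)
{
	if (!q)
		return;
	if (q->percpu_stats) {
		counted_free(q->cpu_bstats);
		counted_free(q->cpu_qstats);	/* the release that was missing */
	}
	counted_free(q);
}

static struct model_qdisc *model_qdisc_alloc(bool percpu)
{
	struct model_qdisc *q = counted_alloc(sizeof(*q));

	if (!q)
		return NULL;
	q->percpu_stats = percpu;
	if (percpu) {
		q->cpu_bstats = counted_alloc(64);
		q->cpu_qstats = counted_alloc(64);
		if (!q->cpu_bstats || !q->cpu_qstats) {
			model_qdisc_free(q);
			return NULL;
		}
	}
	return q;
}
```

With the `cpu_qstats` release removed from `model_qdisc_free()`, the counter would stay at one after every free, which is the user-space analogue of the leak described in the commit message.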



[PATCH] ARM: net: bpf: fix zero right shift

2016-01-05 Thread Rabin Vincent
The LSR instruction cannot be used to perform a zero right shift since a
0 as the immediate value (imm5) in the LSR instruction encoding means
that a shift of 32 is performed.  See DecodeIMMShift() in the ARM ARM.

Make the JIT skip generation of the LSR if a zero-shift is requested.

This was found using american fuzzy lop.

Signed-off-by: Rabin Vincent 
---
 arch/arm/net/bpf_jit_32.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index e153eb065fe4..93d0b6d0b63e 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -756,7 +756,8 @@ load_ind:
                case BPF_ALU | BPF_RSH | BPF_K:
                        if (unlikely(k > 31))
                                return -1;
-                       emit(ARM_LSR_I(r_A, r_A, k), ctx);
+                       if (k)
+                               emit(ARM_LSR_I(r_A, r_A, k), ctx);
                        break;
                case BPF_ALU | BPF_RSH | BPF_X:
                        update_on_xread(ctx);
-- 
2.6.4
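
The encoding quirk behind this fix can be illustrated with a small user-space model. This is a sketch only: the function names are hypothetical and nothing here is JIT code. It assumes, as the commit message states, that an imm5 field of 0 in an immediate LSR means a shift by 32, and it compares that with the `A >> k` result that BPF_RSH | BPF_K expects.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the imm5 decoding for an immediate LSR: a field value of 0
 * stands for a shift by 32, every other value is the shift amount itself. */
static unsigned int model_lsr_decode_shift(unsigned int imm5)
{
	imm5 &= 0x1f;
	return imm5 == 0 ? 32 : imm5;
}

/* Result the instruction would produce for a given encoded imm5. */
static uint32_t model_lsr_execute(uint32_t value, unsigned int imm5)
{
	unsigned int amount = model_lsr_decode_shift(imm5);

	/* A logical shift right by 32 clears a 32-bit register. */
	return amount >= 32 ? 0 : value >> amount;
}

/* JIT-style handling of "A >>= k" for 0 <= k <= 31: emit nothing when
 * k == 0, so A is left unchanged instead of being shifted by 32. */
static uint32_t model_jit_rsh_k(uint32_t a, unsigned int k)
{
	if (k == 0)
		return a;
	return model_lsr_execute(a, k);
}
```

Encoding k == 0 directly would give `model_lsr_execute(a, 0) == 0` for any input, whereas the filter expects `a` back unchanged; skipping the instruction restores the expected result.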



[PATCH] net: lan78xx: Fix to write to OTP(One Time Programmable) per magic number.

2016-01-05 Thread Woojung.Huh
This patch fixes a bug writing to EEPROM in lan78xx_ethtool_set_eeprom()
when asked to write to OTP.

Signed-off-by: Woojung Huh 
---
 drivers/net/usb/lan78xx.c | 55 ++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 226668e..d54f536 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -603,6 +603,59 @@ static int lan78xx_read_raw_otp(struct lan78xx_net *dev, u32 offset,
return 0;
 }
 
+static int lan78xx_write_raw_otp(struct lan78xx_net *dev, u32 offset,
+u32 length, u8 *data)
+{
+   int i;
+   int ret;
+   u32 buf;
+   unsigned long timeout;
+
+   ret = lan78xx_read_reg(dev, OTP_PWR_DN, &buf);
+
+   if (buf & OTP_PWR_DN_PWRDN_N_) {
+   /* clear it and wait to be cleared */
+   ret = lan78xx_write_reg(dev, OTP_PWR_DN, 0);
+
+   timeout = jiffies + HZ;
+   do {
+   udelay(1);
+   ret = lan78xx_read_reg(dev, OTP_PWR_DN, &buf);
+   if (time_after(jiffies, timeout)) {
+   netdev_warn(dev->net,
+   "timeout on OTP_PWR_DN completion");
+   return -EIO;
+   }
+   } while (buf & OTP_PWR_DN_PWRDN_N_);
+   }
+
+   /* set to BYTE program mode */
+   ret = lan78xx_write_reg(dev, OTP_PRGM_MODE, OTP_PRGM_MODE_BYTE_);
+
+   for (i = 0; i < length; i++) {
+   ret = lan78xx_write_reg(dev, OTP_ADDR1,
+   ((offset + i) >> 8) & OTP_ADDR1_15_11);
+   ret = lan78xx_write_reg(dev, OTP_ADDR2,
+   ((offset + i) & OTP_ADDR2_10_3));
+   ret = lan78xx_write_reg(dev, OTP_PRGM_DATA, data[i]);
+   ret = lan78xx_write_reg(dev, OTP_TST_CMD, OTP_TST_CMD_PRGVRFY_);
+   ret = lan78xx_write_reg(dev, OTP_CMD_GO, OTP_CMD_GO_GO_);
+
+   timeout = jiffies + HZ;
+   do {
+   udelay(1);
+   ret = lan78xx_read_reg(dev, OTP_STATUS, &buf);
+   if (time_after(jiffies, timeout)) {
+   netdev_warn(dev->net,
+   "Timeout on OTP_STATUS completion");
+   return -EIO;
+   }
+   } while (buf & OTP_STATUS_BUSY_);
+   }
+
+   return 0;
+}
+
 static int lan78xx_read_otp(struct lan78xx_net *dev, u32 offset,
u32 length, u8 *data)
 {
@@ -969,7 +1022,7 @@ static int lan78xx_ethtool_set_eeprom(struct net_device *netdev,
 (ee->offset == 0) &&
 (ee->len == 512) &&
 (data[0] == OTP_INDICATOR_1))
-   return lan78xx_write_raw_eeprom(dev, ee->offset, ee->len, data);
+   return lan78xx_write_raw_otp(dev, ee->offset, ee->len, data);
 
return -EINVAL;
 }
-- 
2.1.4



[PATCH] arm64: net: bpf: don't BUG() on large shifts

2016-01-05 Thread Rabin Vincent
Attempting to generate UBFM/SBFM instructions with shifts that can't be
encoded in the immediate fields of the opcodes leads to a trigger of a
BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
shift amounts must be in the range 0 to one less than the register width
of the instruction, inclusive."  Make the JIT reject unencodable shifts
instead of crashing.

 [ cut here ]
 kernel BUG at arch/arm64/kernel/insn.c:766!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
 CPU: 0 PID: 669 Comm: insmod Not tainted 4.4.0-rc8+ #4
 PC is at aarch64_insn_gen_bitfield+0xcc/0xd4
 LR is at build_body+0x1000/0x2914
 ..
 Call trace:
 [] aarch64_insn_gen_bitfield+0xcc/0xd4
 [] build_body+0x1000/0x2914
 [] bpf_int_jit_compile+0x7c/0x1b4
 [] bpf_prog_select_runtime+0x20/0xcc
 [] bpf_prepare_filter+0x3d8/0x3e8
 [] bpf_prog_create+0x74/0xa4
 [] test_bpf_init+0x1d4/0x748 [test_bpf]
 [] do_one_initcall+0x90/0x1a8
 [] do_init_module+0x60/0x1c8
 [] load_module+0x1554/0x1c98
 [] SyS_init_module+0x11c/0x140
 [] el0_svc_naked+0x24/0x28

Signed-off-by: Rabin Vincent 
---
 arch/arm64/net/bpf_jit_comp.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index b162ad70effc..3f4f089a85c0 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -255,6 +255,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
        const s32 imm = insn->imm;
        const int i = insn - ctx->prog->insnsi;
        const bool is64 = BPF_CLASS(code) == BPF_ALU64;
+       const int bits = is64 ? 64 : 32;
        u8 jmp_cond;
        s32 jmp_offset;
 
@@ -444,14 +445,20 @@ emit_bswap_uxt:
                break;
        case BPF_ALU | BPF_LSH | BPF_K:
        case BPF_ALU64 | BPF_LSH | BPF_K:
+               if (imm < 0 || imm >= bits)
+                       return -EINVAL;
                emit(A64_LSL(is64, dst, dst, imm), ctx);
                break;
        case BPF_ALU | BPF_RSH | BPF_K:
        case BPF_ALU64 | BPF_RSH | BPF_K:
+               if (imm < 0 || imm >= bits)
+                       return -EINVAL;
                emit(A64_LSR(is64, dst, dst, imm), ctx);
                break;
        case BPF_ALU | BPF_ARSH | BPF_K:
        case BPF_ALU64 | BPF_ARSH | BPF_K:
+               if (imm < 0 || imm >= bits)
+                       return -EINVAL;
                emit(A64_ASR(is64, dst, dst, imm), ctx);
                break;
 
-- 
2.6.4
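
The range check that the patch repeats three times can be written once as a helper. The sketch below is a user-space illustration, and `model_shift_imm_encodable()` and `model_check_shift()` are hypothetical names rather than functions from the JIT. It encodes the rule quoted from the ARMv8 ARM: a shift amount must lie between 0 and one less than the register width, inclusive.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* True if imm is a valid shift amount for a 32-bit or 64-bit operation. */
static bool model_shift_imm_encodable(int32_t imm, bool is64)
{
	const int bits = is64 ? 64 : 32;

	return imm >= 0 && imm < bits;
}

/* Mirrors the patch's behaviour: reject an unencodable shift with -EINVAL
 * instead of passing it on to the instruction generator. */
static int model_check_shift(int32_t imm, bool is64)
{
	return model_shift_imm_encodable(imm, is64) ? 0 : -EINVAL;
}
```

The same predicate would apply wherever the check ends up living, whether in each JIT or in a single earlier validation step.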



Re: [PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 10:57 -0500, Craig Gallek wrote:
> From: Craig Gallek 
> 
> Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
> Suggested-by: Daniel Borkmann 
> Signed-off-by: Craig Gallek 
> ---
>  net/core/sock_reuseport.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
> index ae0969c0fc2e..1df98c557440 100644
> --- a/net/core/sock_reuseport.c
> +++ b/net/core/sock_reuseport.c
> @@ -173,7 +173,7 @@ static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
>  
>   /* temporarily advance data past protocol header */
>   if (!pskb_pull(skb, hdr_len)) {
> - consume_skb(nskb);
> + kfree_skb(nskb);
>   return NULL;
>   }
>   index = bpf_prog_run_save_cb(prog, skb);

Note that we always call reuseport_select_sock() after pulling the
headers in skb->head anyway, so the pskb_pull() can never fail.

It really could be __skb_pull()

BTW, why does UDP sometimes call reuseport_select_sock() with hdr_len == 0?

I believe the following patch is needed.

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 835378365f25..52387096dbba 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -514,7 +514,8 @@ begin:
struct sock *sk2;
hash = udp_ehashfn(net, daddr, hnum,
   saddr, sport);
-   sk2 = reuseport_select_sock(sk, hash, NULL, 0);
+   sk2 = reuseport_select_sock(sk, hash, NULL,
+   sizeof(struct udphdr));
if (sk2) {
result = sk2;
goto found;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 56fcb55fda31..da0a5fa02b0f 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -272,7 +272,8 @@ begin:
struct sock *sk2;
hash = udp6_ehashfn(net, daddr, hnum,
saddr, sport);
-   sk2 = reuseport_select_sock(sk, hash, NULL, 0);
+   sk2 = reuseport_select_sock(sk, hash, NULL,
+   sizeof(struct udphdr));
if (sk2) {
result = sk2;
goto found;




Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Alexei Starovoitov
On Tue, Jan 05, 2016 at 05:36:47PM +0100, Daniel Borkmann wrote:
> On 01/05/2016 04:23 PM, Rabin Vincent wrote:
> >The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> >instructions since it XORs A with X while all the others replace A with
> >some loaded value.  All the BPF JITs fail to clear A if this is used as
> >the first instruction in a filter.  This was found using american fuzzy
> >lop.
> >
> >Add a helper to determine if A needs to be cleared given the first
> >instruction in a filter, and use this in the JITs.  Except for ARM, the
> >rest have only been compile-tested.
> >
> >Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> >Signed-off-by: Rabin Vincent 
> 
> Excellent catch, thanks a lot! The fix looks good to me and should
> go to -net tree.
> 
> Acked-by: Daniel Borkmann 

good catch indeed.
Classic bpf jits didn't have much love. Great to see this work.

Acked-by: Alexei Starovoitov 
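
The patch itself is not quoted in this reply, so the following is only a sketch of what a "does A need clearing" helper could look like, not the code that was posted. The `MODEL_*` constants mirror the classic BPF opcode values from the UAPI headers, `struct model_sock_filter` stands in for `struct sock_filter`, and the exact set of opcodes treated as "overwrites A" is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Classic BPF opcode fields, redefined locally so the sketch is
 * self-contained (values as in the UAPI headers). */
#define MODEL_BPF_LD	0x00
#define MODEL_BPF_RET	0x06
#define MODEL_BPF_W	0x00
#define MODEL_BPF_H	0x08
#define MODEL_BPF_B	0x10
#define MODEL_BPF_ABS	0x20
#define MODEL_BPF_LEN	0x80
#define MODEL_BPF_K	0x00

#define MODEL_SKF_AD_OFF	(-0x1000)
#define MODEL_SKF_AD_ALU_XOR_X	40

struct model_sock_filter {
	uint16_t code;
	uint8_t jt;
	uint8_t jf;
	uint32_t k;
};

/* Returns true if the JIT must zero A before the first instruction runs. */
static bool model_needs_clear_a(const struct model_sock_filter *first)
{
	switch (first->code) {
	case MODEL_BPF_RET | MODEL_BPF_K:
	case MODEL_BPF_LD | MODEL_BPF_W | MODEL_BPF_LEN:
		/* A is never read, or is overwritten before being read. */
		return false;
	case MODEL_BPF_LD | MODEL_BPF_W | MODEL_BPF_ABS:
	case MODEL_BPF_LD | MODEL_BPF_H | MODEL_BPF_ABS:
	case MODEL_BPF_LD | MODEL_BPF_B | MODEL_BPF_ABS:
		/* The XOR ancillary reads A (A ^= X); ordinary loads and
		 * the other ancillaries replace it. */
		return first->k ==
		       (uint32_t)(MODEL_SKF_AD_OFF + MODEL_SKF_AD_ALU_XOR_X);
	default:
		return true;
	}
}
```

The point is the middle case: an absolute load normally overwrites A, but the same opcode with the XOR ancillary offset reads A first, so it cannot be treated like the others.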



Re: [PATCH] net: sched: fix missing free per cpu on qstats

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 09:11 -0800, John Fastabend wrote:
> When a qdisc is using per cpu stats (currently just the ingress
> qdisc) only the bstats are being freed. This also frees the qstats.
> 
> Signed-off-by: John Fastabend 
> ---

Acked-by: Eric Dumazet 

David, please add the following tag to ease backports to stable kernels:

Fixes: 22e0f8b9322cb ("net: sched: make bstats per cpu and estimator RCU safe")

Thanks !




Re: [PATCH] ARM: net: bpf: fix zero right shift

2016-01-05 Thread Alexei Starovoitov
On Tue, Jan 05, 2016 at 06:34:04PM +0100, Rabin Vincent wrote:
> The LSR instruction cannot be used to perform a zero right shift since a
> 0 as the immediate value (imm5) in the LSR instruction encoding means
> that a shift of 32 is performed.  See DecodeIMMShift() in the ARM ARM.
> 
> Make the JIT skip generation of the LSR if a zero-shift is requested.
> 
> This was found using american fuzzy lop.
> 
> Signed-off-by: Rabin Vincent 

Looks good as a fix for classic jit. For eBPF we would want to check
this in the verifier.

Acked-by: Alexei Starovoitov 



Re: [PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Craig Gallek
On Tue, Jan 5, 2016 at 12:38 PM, Eric Dumazet  wrote:
> On Tue, 2016-01-05 at 10:57 -0500, Craig Gallek wrote:
>> From: Craig Gallek 
>>
>> Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
>> Suggested-by: Daniel Borkmann 
>> Signed-off-by: Craig Gallek 
>> ---
>>  net/core/sock_reuseport.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
>> index ae0969c0fc2e..1df98c557440 100644
>> --- a/net/core/sock_reuseport.c
>> +++ b/net/core/sock_reuseport.c
>> @@ -173,7 +173,7 @@ static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
>>
>>   /* temporarily advance data past protocol header */
>>   if (!pskb_pull(skb, hdr_len)) {
>> - consume_skb(nskb);
>> + kfree_skb(nskb);
>>   return NULL;
>>   }
>>   index = bpf_prog_run_save_cb(prog, skb);
>
> Note that we always call reuseport_select_sock() after pulling the
> headers in skb->head anyway, so the pskb_pull() can never fail.
>
> It really could be __skb_pull()
>
> BTW, why does UDP sometimes call reuseport_select_sock() with hdr_len == 0?
hdr_len only matters when you have an skb to work with.  In both of
the call sites of your suggested patch, NULL is passed for the skb
parameter so hdr_len is never used.  When no skb is available, the
code falls back to the hash-based method used in the non-BPF case.
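
The selection behaviour described here can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: `model_select_sock()` and `model_prog_fn` are hypothetical, a function pointer stands in for the attached BPF program, and socket indices replace socket pointers. `model_reciprocal_scale()` uses the same arithmetic as the kernel's `reciprocal_scale()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a 32-bit value uniformly into [0, ep_ro) without a division. */
static uint32_t model_reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Stand-in for an attached BPF program: returns a socket index. */
typedef uint32_t (*model_prog_fn)(const void *pkt);

/* Returns the selected socket index, or -1 if nothing can be selected.
 * The program only runs when both a program and a packet are present;
 * otherwise the hash decides, as in the non-BPF case. */
static int model_select_sock(uint32_t hash, uint32_t num_socks,
			     model_prog_fn prog, const void *pkt)
{
	if (num_socks == 0)
		return -1;

	if (prog && pkt) {
		uint32_t idx = prog(pkt);

		/* Assumption: an out-of-range index selects nothing. */
		return idx < num_socks ? (int)idx : -1;
	}
	return (int)model_reciprocal_scale(hash, num_socks);
}

/* Example program: pick the socket named by the first payload byte. */
static uint32_t model_prog_first_byte(const void *pkt)
{
	return *(const uint8_t *)pkt;
}
```

Calling `model_select_sock()` with a program but a NULL packet takes the hash path, which is the situation the two call sites under discussion are in.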

> I believe the following patch is needed.
>
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 835378365f25..52387096dbba 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -514,7 +514,8 @@ begin:
> struct sock *sk2;
> hash = udp_ehashfn(net, daddr, hnum,
>saddr, sport);
> -   sk2 = reuseport_select_sock(sk, hash, NULL, 0);
> +   sk2 = reuseport_select_sock(sk, hash, NULL,
> +   sizeof(struct udphdr));
> if (sk2) {
> result = sk2;
> goto found;
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 56fcb55fda31..da0a5fa02b0f 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -272,7 +272,8 @@ begin:
> struct sock *sk2;
> hash = udp6_ehashfn(net, daddr, hnum,
> saddr, sport);
> -   sk2 = reuseport_select_sock(sk, hash, NULL, 0);
> +   sk2 = reuseport_select_sock(sk, hash, NULL,
> +   sizeof(struct udphdr));
> if (sk2) {
> result = sk2;
> goto found;
>
>


Re: [PATCH] arm64: net: bpf: don't BUG() on large shifts

2016-01-05 Thread Alexei Starovoitov
On Tue, Jan 05, 2016 at 06:39:03PM +0100, Rabin Vincent wrote:
> Attempting to generate UBFM/SBFM instructions with shifts that can't be
> encoded in the immediate fields of the opcodes leads to a trigger of a
> BUG() in the instruction generation code.  As the ARMv8 ARM says: "The
> shift amounts must be in the range 0 to one less than the register width
> of the instruction, inclusive."  Make the JIT reject unencodable shifts
> instead of crashing.
> 
>  [ cut here ]
>  kernel BUG at arch/arm64/kernel/insn.c:766!
>  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
>  CPU: 0 PID: 669 Comm: insmod Not tainted 4.4.0-rc8+ #4
>  PC is at aarch64_insn_gen_bitfield+0xcc/0xd4
>  LR is at build_body+0x1000/0x2914
>  ..
>  Call trace:
>  [] aarch64_insn_gen_bitfield+0xcc/0xd4
>  [] build_body+0x1000/0x2914
>  [] bpf_int_jit_compile+0x7c/0x1b4
>  [] bpf_prog_select_runtime+0x20/0xcc
>  [] bpf_prepare_filter+0x3d8/0x3e8
>  [] bpf_prog_create+0x74/0xa4
>  [] test_bpf_init+0x1d4/0x748 [test_bpf]
>  [] do_one_initcall+0x90/0x1a8
>  [] do_init_module+0x60/0x1c8
>  [] load_module+0x1554/0x1c98
>  [] SyS_init_module+0x11c/0x140
>  [] el0_svc_naked+0x24/0x28
> 
> Signed-off-by: Rabin Vincent 
> ---
>  arch/arm64/net/bpf_jit_comp.c | 7 +++
>  1 file changed, 7 insertions(+)

This one is better addressed in the verifier than in the eBPF JITs.
Please reject it in check_alu_op() instead.
Though this bug is arm64 only and doesn't affect x64, it's better
to reject such invalid programs, since shifts with large constants
can only be created manually. llvm doesn't generate such things.



Re: [PATCH] net: sched: fix missing free per cpu on qstats

2016-01-05 Thread Daniel Borkmann

On 01/05/2016 06:11 PM, John Fastabend wrote:

When a qdisc is using per cpu stats (currently just the ingress
qdisc) only the bstats are being freed. This also frees the qstats.

Signed-off-by: John Fastabend 


Acked-by: Daniel Borkmann 


Re: [PATCH 0/2] ppp: add netlink support

2016-01-05 Thread Guillaume Nault
On Wed, Dec 23, 2015 at 09:04:46PM +0100, Guillaume Nault wrote:
> This series adds netlink support for creating PPP devices.
> 
Any feedback on this series? I can see that it has been marked
"Deferred" in patchwork, so I'm unsure about what to do with this patch
set now.
Should I repost later (e.g. after the end of the merge window)? Are
there any concerns/issues with the series itself? Or should the idea of
implementing netlink handlers for ppp devices be dropped entirely?


Re: [PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 12:47 -0500, Craig Gallek wrote:
> On Tue, Jan 5, 2016 at 12:38 PM, Eric Dumazet  wrote:

> >
> > BTW, why does UDP sometimes call reuseport_select_sock() with hdr_len == 0?
> hdr_len only matters when you have an skb to work with.  In both of
> the call sites of your suggested patch, NULL is passed for the skb
> parameter so hdr_len is never used.  When no skb is available, the
> code falls back to the hash-based method used in the non-BPF case.

But skb _is_ available !

We expect the BPF to always be run to get consistent selection.

udp4_lib_lookup2() can be used if the hash bucket has more than 10
sockets. SO_REUSEPORT should work the same way, i.e. the BPF filter should
work regardless of the number of sockets in the hash table.

diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
index ae0969c0fc2e..2b0bbecbc4b5 100644
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -220,7 +220,7 @@ struct sock *reuseport_select_sock(struct sock *sk,
/* paired with smp_wmb() in reuseport_add_sock() */
smp_rmb();
 
-   if (prog && skb)
+   if (prog)
sk2 = run_bpf(reuse, socks, prog, skb, hdr_len);
else
sk2 = reuse->socks[reciprocal_scale(hash, socks)];
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 835378365f25..3a66731e3af6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -493,7 +493,8 @@ static u32 udp_ehashfn(const struct net *net, const __be32 laddr,
 static struct sock *udp4_lib_lookup2(struct net *net,
__be32 saddr, __be16 sport,
__be32 daddr, unsigned int hnum, int dif,
-   struct udp_hslot *hslot2, unsigned int slot2)
+   struct udp_hslot *hslot2, unsigned int slot2,
+   struct sk_buff *skb)
 {
struct sock *sk, *result;
struct hlist_nulls_node *node;
@@ -514,7 +515,8 @@ begin:
struct sock *sk2;
hash = udp_ehashfn(net, daddr, hnum,
   saddr, sport);
-   sk2 = reuseport_select_sock(sk, hash, NULL, 0);
+   sk2 = reuseport_select_sock(sk, hash, skb,
+   sizeof(struct udphdr));
if (sk2) {
result = sk2;
goto found;
@@ -573,7 +575,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 
result = udp4_lib_lookup2(net, saddr, sport,
  daddr, hnum, dif,
- hslot2, slot2);
+ hslot2, slot2, skb);
if (!result) {
hash2 = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum);
slot2 = hash2 & udptable->mask;
@@ -583,7 +585,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 
result = udp4_lib_lookup2(net, saddr, sport,
  htonl(INADDR_ANY), hnum, dif,
- hslot2, slot2);
+ hslot2, slot2, skb);
}
rcu_read_unlock();
return result;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 56fcb55fda31..5d2c2afffe7b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -251,7 +251,8 @@ static inline int compute_score2(struct sock *sk, struct net *net,
 static struct sock *udp6_lib_lookup2(struct net *net,
const struct in6_addr *saddr, __be16 sport,
const struct in6_addr *daddr, unsigned int hnum, int dif,
-   struct udp_hslot *hslot2, unsigned int slot2)
+   struct udp_hslot *hslot2, unsigned int slot2,
+   struct sk_buff *skb)
 {
struct sock *sk, *result;
struct hlist_nulls_node *node;
@@ -272,7 +273,8 @@ begin:
struct sock *sk2;
hash = udp6_ehashfn(net, daddr, hnum,
saddr, sport);
-   sk2 = reuseport_select_sock(sk, hash, NULL, 0);
+   sk2 = reuseport_select_sock(sk, hash, skb,
+   sizeof(struct udphdr));
if (sk2) {
result = sk2;
goto found;
@@ -331,7 +333,7 @@ struct sock *__udp6_lib_lookup(struct net *net,
 
result = udp6_lib_lookup2(net, saddr, sport,
  daddr, hnum, dif,
- hslot2, slot2);
+ 

Re: [PATCH next] net/core/dev: Warn on an impossibly short offload frame

2016-01-05 Thread Aaron Conole
Joe Perches  writes:
> On Sat, 2016-01-02 at 19:25 -0500, Aaron Conole wrote:
>> When signaling that a GRO frame is ready to be processed, the network stack
>> correctly checks length and aborts processing when a frame is less than 14
>> bytes. However, such a condition is really indicative of a broken driver,
>> and should be loudly signaled, rather than silently dropped as the case is
>> today.
>> 
>> Convert the condition to use WARN_ON() to ensure that the stack loudly
>> complains about such broken drivers.
> []
>> diff --git a/net/core/dev.c b/net/core/dev.c
> []
>> @@ -4579,7 +4579,7 @@ static struct sk_buff *napi_frags_skb(struct napi_struct *napi)
>>  eth = skb_gro_header_fast(skb, 0);
>>  if (unlikely(skb_gro_header_hard(skb, hlen))) {
>>  eth = skb_gro_header_slow(skb, hlen, 0);
>> -if (unlikely(!eth)) {
>> +if (WARN_ON(!eth)) {
>>  napi_reuse_skb(napi, skb);
>>  return NULL;
>>  }
>
> It's generally a good idea to use
> WARN_ON_RATELIMIT or WARN_ON_ONCE.

Okay, I'll respin switching to WARN_ON_RATELIMIT, if that's a better
approach.

Thanks for the review, Joe!
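
For readers unfamiliar with the "warn once" idea suggested above, here is a user-space sketch of it. The names are hypothetical and this is not the kernel's WARN_ON_ONCE() macro: it only models the behaviour that the condition is reported to the caller every time while the diagnostic is printed on the first hit only, so a misbehaving driver cannot flood the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many diagnostics were actually printed (for demonstration). */
static int model_warnings_printed;

/* Returns cond unchanged; prints a warning only the first time cond is
 * true for the given call site, tracked through *already_warned. */
static bool model_warn_on_once(bool cond, bool *already_warned,
			       const char *what)
{
	if (cond && !*already_warned) {
		*already_warned = true;
		model_warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Example call site: a frame shorter than a 14-byte Ethernet header. */
static bool model_frame_too_short(unsigned int len)
{
	static bool warned;	/* one flag per call site */

	return model_warn_on_once(len < 14, &warned,
				  "impossibly short offload frame");
}
```

A rate-limited variant would replace the boolean flag with a timestamp or token counter; the call site would look the same.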



Re: [PATCH net-next 1/5] sctp: add the rhashtable apis for sctp global transport hashtable

2016-01-05 Thread Vlad Yasevich
On 12/30/2015 10:50 AM, Xin Long wrote:
> The transport hashtable will replace the association hashtable for
> transport lookup; the association is then obtained via t->asoc. The
> rhashtable APIs will be used because they are resizable, scalable and use RCU.
> 
> lport + rport + paddr will be the base hashkey to locate the chain,
> with net to protect one netns from another, then plus the laddr to
> compare to get the target.
> 
> This patch provides the lookup functions:
> - sctp_epaddr_lookup_transport
> - sctp_addrs_lookup_transport
> 
> hash/unhash functions:
> - sctp_hash_transport
> - sctp_unhash_transport
> 
> init/destroy functions:
> - sctp_transport_hashtable_init
> - sctp_transport_hashtable_destroy
> 
> Signed-off-by: Xin Long 
> Signed-off-by: Marcelo Ricardo Leitner 
> ---
>  include/net/sctp/sctp.h|  11 
>  include/net/sctp/structs.h |   5 ++
>  net/sctp/input.c   | 131 +
> 
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index ce13cf2..7bbdfba 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -143,6 +143,17 @@ void sctp_icmp_proto_unreachable(struct sock *sk,
>struct sctp_transport *t);
>  void sctp_backlog_migrate(struct sctp_association *assoc,
> struct sock *oldsk, struct sock *newsk);
> +int sctp_transport_hashtable_init(void);
> +void sctp_transport_hashtable_destroy(void);
> +void sctp_hash_transport(struct sctp_transport *t);
> +void sctp_unhash_transport(struct sctp_transport *t);
> +struct sctp_transport *sctp_addrs_lookup_transport(
> + struct net *net,
> + const union sctp_addr *laddr,
> + const union sctp_addr *paddr);
> +struct sctp_transport *sctp_epaddr_lookup_transport(
> + const struct sctp_endpoint *ep,
> + const union sctp_addr *paddr);
>  
>  /*
>   * sctp/proc.c
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index eea9bde..4ab87d0 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -48,6 +48,7 @@
>  #define __sctp_structs_h__
>  
>  #include 
> +#include 
>  #include /* linux/in.h needs this!!*/
>  #include /* We get struct sockaddr_in. */
>  #include/* We get struct in6_addr */
> @@ -123,6 +124,8 @@ extern struct sctp_globals {
>   struct sctp_hashbucket *assoc_hashtable;
>   /* This is the sctp port control hash.  */
>   struct sctp_bind_hashbucket *port_hashtable;
> + /* This is the hash of all transports. */
> + struct rhashtable transport_hashtable;
>  
>   /* Sizes of above hashtables. */
>   int ep_hashsize;
> @@ -147,6 +150,7 @@ extern struct sctp_globals {
>  #define sctp_assoc_hashtable (sctp_globals.assoc_hashtable)
>  #define sctp_port_hashsize   (sctp_globals.port_hashsize)
>  #define sctp_port_hashtable  (sctp_globals.port_hashtable)
> +#define sctp_transport_hashtable (sctp_globals.transport_hashtable)
>  #define sctp_checksum_disable (sctp_globals.checksum_disable)
>  
>  /* SCTP Socket type: UDP or TCP style. */
> @@ -753,6 +757,7 @@ static inline int sctp_packet_empty(struct sctp_packet *packet)
>  struct sctp_transport {
>   /* A list of transports. */
>   struct list_head transports;
> + struct rhash_head node;
>  
>   /* Reference counting. */
>   atomic_t refcnt;
> diff --git a/net/sctp/input.c b/net/sctp/input.c
> index b6493b3..bac8278 100644
> --- a/net/sctp/input.c
> +++ b/net/sctp/input.c
> @@ -782,6 +782,137 @@ hit:
>   return ep;
>  }
>  
> +/* rhashtable for transport */
> +struct sctp_hash_cmp_arg {
> + const union sctp_addr   *laddr;
> + const union sctp_addr   *paddr;
> + const struct net*net;
> +};
> +
> +static inline int sctp_hash_cmp(struct rhashtable_compare_arg *arg,
> + const void *ptr)
> +{
> + const struct sctp_hash_cmp_arg *x = arg->key;
> + const struct sctp_transport *t = ptr;
> + struct sctp_association *asoc = t->asoc;
> + const struct net *net = x->net;
> +
> + if (x->laddr->v4.sin_port != htons(asoc->base.bind_addr.port))
> + return 1;
> + if (!sctp_cmp_addr_exact(&t->ipaddr, x->paddr))
> + return 1;
> + if (!net_eq(sock_net(asoc->base.sk), net))
> + return 1;
> + if (!sctp_bind_addr_match(&asoc->base.bind_addr,
> +   x->laddr, sctp_sk(asoc->base.sk)))
> + return 1;
> +
> + return 0;
> +}
> +
> +static inline u32 sctp_hash_obj(const void *data, u32 len, u32 seed)
> +{
> + const struct sctp_transport *t = data;
> + const union sctp_addr *paddr = &t->ipaddr;
> + const struct net *net = sock_net(t->asoc

[PATCH v2 net-next] net: Implement fast csum_partial for x86_64

2016-01-05 Thread Tom Herbert
Implement assembly routine for csum_partial for 64 bit x86. This
primarily speeds up checksum calculation for smaller lengths such as
those that are present when doing skb_postpull_rcsum when getting
CHECKSUM_COMPLETE from device or after CHECKSUM_UNNECESSARY
conversion.

This implementation is similar to csum_partial implemented in
checksum_32.S, however since we are dealing with 8 bytes at a time
there are more cases for small lengths-- for that we employ a jump
table. Also, we don't do anything special for alignment, unaligned
accesses on x86 do not appear to be a performance issue.
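
As background for the assembly that follows, a portable C reference for the ones' complement checksum may help. This is a deliberately simplified sketch and not the kernel's csum_partial(): the function names are hypothetical, it sums big-endian 16-bit words so the result matches the worked example in RFC 1071, and it folds to 16 bits at the end, whereas csum_partial() returns an unfolded 32-bit partial sum in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide ones' complement accumulator down to 16 bits by adding the
 * carries back in until none remain. */
static uint16_t model_csum_fold(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum of buf (big-endian 16-bit words), continued from
 * an earlier partial result passed in as initial. */
static uint16_t model_csum(const uint8_t *buf, size_t len, uint16_t initial)
{
	uint64_t sum = initial;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;	/* odd trailing byte, zero padded */

	return model_csum_fold(sum);
}
```

For the RFC 1071 sample bytes 00 01 f2 03 f4 f5 f6 f7 the result is 0xddf2, and summing the buffer in two halves with the first result passed as `initial` gives the same value. A byte-at-a-time reference like this is also a convenient oracle when testing an optimized version against random buffers, as the Testing section describes.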

Testing:

Verified correctness by testing arbitrary length buffer filled with
random data. For each buffer I compared the computed checksum
using the original algorithm for each possible alignment (0-7 bytes).

Checksum performance:

Isolating old and new implementation for some common cases:

                        Old      New
Case                    nsecs    nsecs    Improvement
----------------------+--------+--------+------------------------------
1400 bytes (0 align)    194.5    174.3    10%    (Big packet)
40 bytes (0 align)      13.8     5.8      57%    (Ipv6 hdr common case)
8 bytes (4 align)       8.4      2.9      65%    (UDP, VXLAN in IPv4)
14 bytes (0 align)      10.6     5.8      45%    (Eth hdr)
14 bytes (4 align)      10.8     5.8      46%    (Eth hdr in IPv4)

Signed-off-by: Tom Herbert 
---
 arch/x86/include/asm/checksum_64.h |   5 ++
 arch/x86/lib/csum-partial_64.S | 147 
 arch/x86/lib/csum-partial_64.c | 148 -
 3 files changed, 152 insertions(+), 148 deletions(-)
 create mode 100644 arch/x86/lib/csum-partial_64.S
 delete mode 100644 arch/x86/lib/csum-partial_64.c

diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index cd00e17..a888f65 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -128,6 +128,11 @@ static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
  */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
+static inline __sum16 ip_compute_csum(const void *buff, int len)
+{
+   return csum_fold(csum_partial(buff, len, 0));
+}
+
 #define  _HAVE_ARCH_COPY_AND_CSUM_FROM_USER 1
 #define HAVE_CSUM_COPY_USER 1
 
diff --git a/arch/x86/lib/csum-partial_64.S b/arch/x86/lib/csum-partial_64.S
new file mode 100644
index 000..8e387bb
--- /dev/null
+++ b/arch/x86/lib/csum-partial_64.S
@@ -0,0 +1,147 @@
+/* Copyright 2016 Tom Herbert 
+ *
+ * Checksum partial calculation
+ *
+ * __wsum csum_partial(const void *buff, int len, __wsum sum)
+ *
+ * Computes the checksum of a memory block at buff, length len,
+ * and adds in "sum" (32-bit)
+ *
+ * Returns a 32-bit number suitable for feeding into itself
+ * or csum_tcpudp_magic
+ *
+ * Register usage:
+ *   %rdi: argument 1, buff
+ *   %rsi: argument 2, length
+ *   %rdx: argument 3, add in value
+ *   %rax,%eax: accumulator and return value
+ *   %rcx,%ecx: counter and tmp
+ *   %r11: tmp
+ *
+ * Basic algorithm:
+ *   1) Sum 8 bytes at a time using adcq (unroll main loop
+ *  to do 64 bytes at a time)
+ *   2) Sum remaining length (less than 8 bytes)
+ *
+ * Note that buffer alignment is not considered, unaligned accesses on x86 don't
+ * seem to be a performance hit (CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is set).
+ */
+
+#include 
+#include 
+#include 
+
+#define branch_tbl_len .L_branch_tbl_len
+
+ENTRY(csum_partial)
+       movl    %edx, %eax      /* Initialize with initial sum argument */
+
+       /* Check length */
+       cmpl    $8, %esi
+       jg      10f
+       jl      20f
+
+       /* Exactly 8 bytes length */
+       addl    (%rdi), %eax
+       adcl    4(%rdi), %eax
+       adcl    $0, %eax
+       ret
+
+       /* Less than 8 bytes length */
+20:    clc
+       jmpq    *branch_tbl_len(, %rsi, 8)
+
+       /* Greater than 8 bytes length. Determine number of quads (n). Sum
+        * over first n % 8 quads
+        */
+10:    movl    %esi, %ecx
+       shrl    $3, %ecx
+       andl    $0x7, %ecx
+       negq    %rcx
+       lea     20f(, %rcx, 4), %r11
+       clc
+       jmp     *%r11
+
+.align 8
+       adcq    6*8(%rdi),%rax
+       adcq    5*8(%rdi),%rax
+       adcq    4*8(%rdi),%rax
+       adcq    3*8(%rdi),%rax
+       adcq    2*8(%rdi),%rax
+       adcq    1*8(%rdi),%rax
+       adcq    0*8(%rdi),%rax
+       nop
+20:    /* #quads % 8 jump table base */
+
+       adcq    $0, %rax
+       shlq    $3, %rcx
+       subq    %rcx, %rdi /* %rcx is already negative length */
+
+       /* Now determine number of blocks of 8 quads. Sum 64 bytes at a time
+        * using unrolled loop.
+        */
+       movl    %esi, %ecx
+       shrl    $6, %ecx
+       jz      30f
+       clc
+
+       /* Main loop */
+40:    adcq    0*8(%rdi),%rax
+       adcq    1*8(%rdi),%rax
+       adcq    2*8(%rdi),%rax
+       adcq    3*8(%rdi),%rax
+   adcq
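
The arithmetic described in the header comment of this patch can be sketched in portable C. This is an illustrative stand-in and not the assembly from the patch: the names csum_partial_sketch, csum_fold_sketch and ip_compute_csum_sketch are hypothetical, the 64-bit end-around-carry add plays the role of adcq, and no unrolling or jump table is attempted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit add with end-around carry; stands in for the adcq chain. */
static uint64_t add64_carry(uint64_t a, uint64_t b)
{
	uint64_t s = a + b;

	return s + (s < b);	/* wrap the carry-out back into bit 0 */
}

/* Hypothetical portable counterpart of csum_partial(): sums the buffer
 * 8 bytes at a time, then the remaining (< 8) bytes, and returns a
 * 32-bit partial sum that still has to be folded.
 */
uint32_t csum_partial_sketch(const void *buff, size_t len, uint32_t sum)
{
	const unsigned char *p = buff;
	uint64_t acc = sum;

	while (len >= 8) {
		uint64_t quad;

		memcpy(&quad, p, sizeof(quad));	/* alignment-safe load */
		acc = add64_carry(acc, quad);
		p += 8;
		len -= 8;
	}
	if (len) {
		uint64_t tail = 0;

		memcpy(&tail, p, len);		/* data followed by zero padding */
		acc = add64_carry(acc, tail);
	}
	/* Fold 64 bits down to 32; two rounds absorb the carry. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Fold a 32-bit partial sum to 16 bits and complement it. */
uint16_t csum_fold_sketch(uint32_t sum)
{
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Mirrors the shape of the ip_compute_csum() helper added by the patch. */
uint16_t ip_compute_csum_sketch(const void *buff, size_t len)
{
	return csum_fold_sketch(csum_partial_sketch(buff, len, 0));
}
```

Storing the returned 16-bit value directly into the checksum field of a header and re-running the sum over the whole header should yield zero, which is a convenient sanity check for any replacement implementation.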

Re: [PATCH (net-next.git) 01/18] stmmac: share reset function between dwmac100 and dwmac1000

2016-01-05 Thread David Miller
From: Giuseppe CAVALLARO 
Date: Tue, 5 Jan 2016 10:03:28 +0100

> On 1/5/2016 4:25 AM, David Miller wrote:
>> From: Giuseppe Cavallaro 
>> Date: Mon, 4 Jan 2016 14:06:46 +0100
>>
>>> @@ -376,7 +376,8 @@ extern const struct stmmac_desc_ops ndesc_ops;
>>>   /* Specific DMA helpers */
>>>   struct stmmac_dma_ops {
>>> /* DMA core initialization */
>>> -   int (*init) (void __iomem *ioaddr, int pbl, int fb, int mb,
>>> +   int (*reset)(void __iomem *ioaddr);
>>> +   void (*init)(void __iomem *ioaddr, int pbl, int fb, int mb,
>>>  int burst_len, u32 dma_tx, u32 dma_rx, int atds);
>>
>> Since you change the return type of the 'init' method, and this
>> changes the column of the opening parenthesis, you have to fix the
>> indentation of the argument list on the next line.
>>
> 
> hmm, lines are well aligned.
> 
> I will check again, in case I introduced some indentation problem.

Either it was wrong to begin with (I checked before I replied to this posting
and didn't see a misalignment) or it is wrong after the change, since void is
one column wider than int.


Re: [PATCH net-next V3 0/4] Introduce mlx5 ethernet timestamping

2016-01-05 Thread David Miller
From: Richard Cochran 
Date: Tue, 5 Jan 2016 13:51:18 +0100

> On Mon, Jan 04, 2016 at 04:47:03PM -0500, David Miller wrote:
>> Richard, please review this series.
> 
> It looks fine to me now, and I acked the timestamping/phc bits.

Thank you.


Re: [PATCH net-next 2/5] sctp: apply rhashtable api to send/recv path

2016-01-05 Thread Vlad Yasevich
On 12/30/2015 10:50 AM, Xin Long wrote:
> Apply the lookup APIs to two functions. __sctp_endpoint_lookup_assoc
> and __sctp_lookup_association are invoked under the protection of the
> sock lock, so they are safe, but sctp_lookup_association needs to call
> rcu_read_lock() and check t->dead to protect itself.
> 
> Signed-off-by: Xin Long 
> Signed-off-by: Marcelo Ricardo Leitner 
> ---
>  net/sctp/associola.c   |  5 +
>  net/sctp/endpointola.c | 35 ---
>  net/sctp/input.c   | 39 ++-
>  net/sctp/protocol.c|  6 ++
>  4 files changed, 29 insertions(+), 56 deletions(-)
> 
> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> index 559afd0..2bf8ec9 100644
> --- a/net/sctp/associola.c
> +++ b/net/sctp/associola.c
> @@ -383,6 +383,7 @@ void sctp_association_free(struct sctp_association *asoc)
>   list_for_each_safe(pos, temp, &asoc->peer.transport_addr_list) {
>   transport = list_entry(pos, struct sctp_transport, transports);
>   list_del_rcu(pos);
> + sctp_unhash_transport(transport);
>   sctp_transport_free(transport);
>   }
>  
> @@ -500,6 +501,8 @@ void sctp_assoc_rm_peer(struct sctp_association *asoc,
>  
>   /* Remove this peer from the list. */
>   list_del_rcu(&peer->transports);
> + /* Remove this peer from the transport hashtable */
> + sctp_unhash_transport(peer);
>  
>   /* Get the first transport of asoc. */
>   pos = asoc->peer.transport_addr_list.next;
> @@ -699,6 +702,8 @@ struct sctp_transport *sctp_assoc_add_peer(struct 
> sctp_association *asoc,
>   /* Attach the remote transport to our asoc.  */
>   list_add_tail_rcu(&peer->transports, &asoc->peer.transport_addr_list);
>   asoc->peer.transport_count++;
> + /* Add this peer into the transport hashtable */
> + sctp_hash_transport(peer);

This is actually problematic.  The issue is that transports are unhashed when
removed.  However, transport removal happens after the association has been
declared dead and should have been removed from the hash and marked
unreachable.

As a result, with the code above, you can now find and return a dead
association.  Checking for 'dead' state is racy.

The best solution I've come up with is to hash the transports in
sctp_hash_established() and clean up in __sctp_unhash_established(), and then
handle the ADD-IP case separately.

The above would also remove the necessity to check for temporary associations,
since they should never be hashed.

-vlad

>  
>   /* If we do not yet have a primary path, set one.  */
>   if (!asoc->peer.primary_path) {
> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
> index 9da76ba..8838bf4 100644
> --- a/net/sctp/endpointola.c
> +++ b/net/sctp/endpointola.c
> @@ -314,8 +314,8 @@ struct sctp_endpoint *sctp_endpoint_is_match(struct 
> sctp_endpoint *ep,
>  }
>  
>  /* Find the association that goes with this chunk.
> - * We do a linear search of the associations for this endpoint.
> - * We return the matching transport address too.
> + * We lookup the transport from hashtable at first, then get association
> + * through t->assoc.
>   */
>  static struct sctp_association *__sctp_endpoint_lookup_assoc(
>   const struct sctp_endpoint *ep,
> @@ -323,12 +323,7 @@ static struct sctp_association 
> *__sctp_endpoint_lookup_assoc(
>   struct sctp_transport **transport)
>  {
>   struct sctp_association *asoc = NULL;
> - struct sctp_association *tmp;
> - struct sctp_transport *t = NULL;
> - struct sctp_hashbucket *head;
> - struct sctp_ep_common *epb;
> - int hash;
> - int rport;
> + struct sctp_transport *t;
>  
>   *transport = NULL;
>  
> @@ -337,26 +332,12 @@ static struct sctp_association 
> *__sctp_endpoint_lookup_assoc(
>*/
>   if (!ep->base.bind_addr.port)
>   goto out;
> + t = sctp_epaddr_lookup_transport(ep, paddr);
> + if (!t || t->asoc->temp)
> + goto out;
>  
> - rport = ntohs(paddr->v4.sin_port);
> -
> - hash = sctp_assoc_hashfn(sock_net(ep->base.sk), ep->base.bind_addr.port,
> -  rport);
> - head = &sctp_assoc_hashtable[hash];
> - read_lock(&head->lock);
> - sctp_for_each_hentry(epb, &head->chain) {
> - tmp = sctp_assoc(epb);
> - if (tmp->ep != ep || rport != tmp->peer.port)
> - continue;
> -
> - t = sctp_assoc_lookup_paddr(tmp, paddr);
> - if (t) {
> - asoc = tmp;
> - *transport = t;
> - break;
> - }
> - }
> - read_unlock(&head->lock);
> + *transport = t;
> + asoc = t->asoc;
>  out:
>   return asoc;
>  }
> diff --git a/net/sctp/input.c b/net/sctp/input.c
> index bac8278..6f075d8 100644
> --- a/net/sctp/input.c
> +++ b/net/sctp/input.c
> @@ -981,38 +981,19 @@ static struct sctp_associat

Re: [PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Craig Gallek
On Tue, Jan 5, 2016 at 1:31 PM, Eric Dumazet  wrote:
> On Tue, 2016-01-05 at 12:47 -0500, Craig Gallek wrote:
>> On Tue, Jan 5, 2016 at 12:38 PM, Eric Dumazet  wrote:
>
>> >
>> > BTW, why UDP calls reuseport_select_sock() with hdr_len == 0 sometimes ?
>> hdr_len only matters when you have an skb to work with.  In both of
>> the call sites of your suggested patch, NULL is passed for the skb
>> parameter so hdr_len is never used.  When no skb is available, the
>> code falls back to the hash-based method used in the non-BPF case.
>
> But skb _is_ available !
>
> We expect the BPF to always be run to get consistent selection.
>
> udp4_lib_lookup2() can be used if the hash bucket has more than 10
> sockets. SO_REUSEPORT should work the same way, ie BPF filter should
> work regardless of number of sockets in hash table.
OK, I can buy that an skb should be piped through udp4|6_lib_lookup2,
but I don't think it's safe to remove the skb NULL check in
reuseport_select_sock.  There's at least one path (udp_diag.c:
udp_dump_one) which does a lookup without an skb.  I'll start with
your patch in a separate thread and we can discuss outside the context
of the kfree_skb change.
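
The selection behaviour under discussion can be sketched in userspace C. Everything here is a hypothetical stand-in: struct pkt replaces the skb, select_prog_t replaces the attached BPF program, and select_sock_index returns an index rather than a socket. The scaling arithmetic matches the kernel's reciprocal_scale(); falling back to the hash when the program returns an out-of-range index is an assumption of this sketch, not something stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skb: just a byte buffer. */
struct pkt {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical stand-in for the attached BPF program. */
typedef uint32_t (*select_prog_t)(const struct pkt *pkt, size_t hdr_len);

/* Same arithmetic as reciprocal_scale(): maps val into [0, n). */
static uint32_t scale_hash(uint32_t val, uint32_t n)
{
	return (uint32_t)(((uint64_t)val * n) >> 32);
}

/* Example program: pick the socket named by the first payload byte. */
uint32_t first_payload_byte(const struct pkt *pkt, size_t hdr_len)
{
	return pkt->len > hdr_len ? pkt->data[hdr_len] : 0;
}

/* Run the program only when both a program and a packet are present;
 * otherwise fall back to the hash-based choice, as the NULL check in
 * reuseport_select_sock() does for callers without an skb.
 */
uint32_t select_sock_index(uint32_t hash, uint32_t num_socks,
			   select_prog_t prog, const struct pkt *pkt,
			   size_t hdr_len)
{
	if (prog && pkt) {
		uint32_t idx = prog(pkt, hdr_len);

		if (idx < num_socks)
			return idx;
		/* Assumption: an invalid result falls back to the hash. */
	}
	return scale_hash(hash, num_socks);
}
```

Keeping the packet check means a lookup path that has no packet at hand still gets a deterministic hash-based answer instead of dereferencing a null pointer inside the program.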

> diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
> index ae0969c0fc2e..2b0bbecbc4b5 100644
> --- a/net/core/sock_reuseport.c
> +++ b/net/core/sock_reuseport.c
> @@ -220,7 +220,7 @@ struct sock *reuseport_select_sock(struct sock *sk,
> /* paired with smp_wmb() in reuseport_add_sock() */
> smp_rmb();
>
> -   if (prog && skb)
> +   if (prog)
> sk2 = run_bpf(reuse, socks, prog, skb, hdr_len);
> else
> sk2 = reuse->socks[reciprocal_scale(hash, socks)];
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 835378365f25..3a66731e3af6 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -493,7 +493,8 @@ static u32 udp_ehashfn(const struct net *net, const 
> __be32 laddr,
>  static struct sock *udp4_lib_lookup2(struct net *net,
> __be32 saddr, __be16 sport,
> __be32 daddr, unsigned int hnum, int dif,
> -   struct udp_hslot *hslot2, unsigned int slot2)
> +   struct udp_hslot *hslot2, unsigned int slot2,
> +   struct sk_buff *skb)
>  {
> struct sock *sk, *result;
> struct hlist_nulls_node *node;
> @@ -514,7 +515,8 @@ begin:
> struct sock *sk2;
> hash = udp_ehashfn(net, daddr, hnum,
>saddr, sport);
> -   sk2 = reuseport_select_sock(sk, hash, NULL, 
> 0);
> +   sk2 = reuseport_select_sock(sk, hash, skb,
> +   sizeof(struct 
> udphdr));
> if (sk2) {
> result = sk2;
> goto found;
> @@ -573,7 +575,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 
> saddr,
>
> result = udp4_lib_lookup2(net, saddr, sport,
>   daddr, hnum, dif,
> - hslot2, slot2);
> + hslot2, slot2, skb);
> if (!result) {
> hash2 = udp4_portaddr_hash(net, htonl(INADDR_ANY), 
> hnum);
> slot2 = hash2 & udptable->mask;
> @@ -583,7 +585,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 
> saddr,
>
> result = udp4_lib_lookup2(net, saddr, sport,
>   htonl(INADDR_ANY), hnum, 
> dif,
> - hslot2, slot2);
> + hslot2, slot2, skb);
> }
> rcu_read_unlock();
> return result;
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 56fcb55fda31..5d2c2afffe7b 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -251,7 +251,8 @@ static inline int compute_score2(struct sock *sk, struct 
> net *net,
>  static struct sock *udp6_lib_lookup2(struct net *net,
> const struct in6_addr *saddr, __be16 sport,
> const struct in6_addr *daddr, unsigned int hnum, int dif,
> -   struct udp_hslot *hslot2, unsigned int slot2)
> +   struct udp_hslot *hslot2, unsigned int slot2,
> +   struct sk_buff *skb)
>  {
> struct sock *sk, *result;
> struct hlist_nulls_node *node;
> @@ -272,7 +273,8 @@ begin:
> struct sock *sk2;
> hash = udp6_ehashfn(net, daddr, hnum,
> saddr, sport);
> -   sk2 = reuseport_select_sock(sk, hash, NU

Re: [PATCH net-next V3 0/4] Introduce mlx5 ethernet timestamping

2016-01-05 Thread David Miller
From: Saeed Mahameed 
Date: Tue, 29 Dec 2015 14:58:28 +0200

> This patch series introduces the support for ConnectX-4 timestamping
> and the PTP kernel interface.

Series applied, thank you.


Re: [PATCH] net: sched: fix missing free per cpu on qstats

2016-01-05 Thread David Miller
From: Eric Dumazet 
Date: Tue, 05 Jan 2016 09:44:11 -0800

> David, please add the following tag to ease backports to stable kernels:
> 
> Fixes: 22e0f8b9322cb ("net: sched: make bstats per cpu and estimator RCU 
> safe")

Ok, will do, thanks!


Re: [PATCH net-next V3 0/4] Introduce mlx5 ethernet timestamping

2016-01-05 Thread Saeed Mahameed
Thank you David and Richard.

On Tue, Jan 5, 2016 at 9:12 PM, David Miller  wrote:
> From: Saeed Mahameed 
> Date: Tue, 29 Dec 2015 14:58:28 +0200
>
>> This patch series introduces the support for ConnectX-4 timestamping
>> and the PTP kernel interface.
>
> Series applied, thank you.


Re: [PATCH 0/2] ppp: add netlink support

2016-01-05 Thread David Miller
From: Guillaume Nault 
Date: Tue, 5 Jan 2016 19:10:20 +0100

> On Wed, Dec 23, 2015 at 09:04:46PM +0100, Guillaume Nault wrote:
>> This series adds netlink support for creating PPP devices.
>> 
> Any feedback on this series? I can see that it has been marked
> "Deferred" in patchwork, so I'm unsure about what to do with this patch
> set now.
> Should I repost later (e.g. after the end of the merge window)? Are
> there any concern/issues with the series itself? Or should the idea of
> implementing netlink handlers for ppp devices be entirely dropped?

Repost it again later, I want more people to see and review this
series and the holidays is not a good time for that...


Re: [PATCH 0/2] ppp: add netlink support

2016-01-05 Thread Guillaume Nault
On Tue, Jan 05, 2016 at 02:15:34PM -0500, David Miller wrote:
> From: Guillaume Nault 
> Date: Tue, 5 Jan 2016 19:10:20 +0100
> 
> > On Wed, Dec 23, 2015 at 09:04:46PM +0100, Guillaume Nault wrote:
> >> This series adds netlink support for creating PPP devices.
> >> 
> > Any feedback on this series? I can see that it has been marked
> > "Deferred" in patchwork, so I'm unsure about what to do with this patch
> > set now.
> > Should I repost later (e.g. after the end of the merge window)? Are
> > there any concern/issues with the series itself? Or should the idea of
> > implementing netlink handlers for ppp devices be entirely dropped?
> 
> Repost it again later, I want more people to see and review this
> series and the holidays is not a good time for that...

Sure. Thanks for the feedback.


Re: [patch v2] qlcnic: fix a timeout loop

2016-01-05 Thread Dan Carpenter
No.  This patch is a suspend/resume thing and your bug is something
else.  Honestly, this patch is a static checker fix and I doubt it has
much real-world impact at all.

regards,
dan carpenter



Re: [PATCH net-next] soreuseport: change consume_skb to kfree_skb in error case

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 14:12 -0500, Craig Gallek wrote:

> OK, I can buy that an skb should be piped through udp4|6_lib_lookup2,
> but I don't think it's safe to remove the skb NULL check in
> reuseport_select_sock.  There's at least one path (udp_diag.c:
> udp_dump_one) which does a lookup without an skb.  I'll start with
> your patch in a separate thread and we can discuss outside the context
> of the kfree_skb change.

Well, udp_dump_one() is broken vs SO_REUSEPORT. Fortunately iproute2
does not use it (yet).

We need to get the socket designated by req->id.idiag_cookie,
i.e. ignoring the reuseport hash or BPF completely.





[PATCH net-next 07/12] net/mlx5_core: Initialize namespaces only when supported by device

2016-01-05 Thread Saeed Mahameed
From: Maor Gottlieb 

Before we create the subtree of a steering namespace (kernel, bypass,
leftovers), we check that the device has the capabilities required
to create it.
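
The capability test can be sketched in plain C. This sketch assumes, as the GET_FLOW_TABLE_CAP macro in the patch suggests, that each capability is a single bit addressed by its offset in a blob of big-endian 32-bit words, with offset 0 being the most significant bit of the first word. The function names are hypothetical and the blob is passed in directly instead of being read from the device structure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the node_caps structure added by the patch. */
struct node_caps {
	size_t arr_sz;
	const long *caps;	/* bit offsets of the required capabilities */
};

/* Read one capability bit. Addressing the blob bytewise gives the same
 * bit as loading the big-endian 32-bit word and shifting, without any
 * dependence on host endianness.
 */
static bool cap_bit_set(const unsigned char *blob, size_t offset)
{
	return (blob[offset / 8] >> (7 - (offset % 8))) & 1;
}

/* True only if every required bit is set; an empty list always passes. */
bool has_required_caps_sketch(const unsigned char *blob,
			      const struct node_caps *caps)
{
	size_t i;

	for (i = 0; i < caps->arr_sz; i++) {
		if (!cap_bit_set(blob, (size_t)caps->caps[i]))
			return false;
	}
	return true;
}
```

A node with an empty requirement list always passes, which matches how the patch passes {} for priorities that need no special capabilities.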

Signed-off-by: Maor Gottlieb 
Signed-off-by: Moni Shoua 
Signed-off-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |   70 ++--
 1 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 2c064ba..7e39b69 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -40,20 +40,17 @@
 #define INIT_TREE_NODE_ARRAY_SIZE(...) (sizeof((struct 
init_tree_node[]){__VA_ARGS__}) /\
 sizeof(struct init_tree_node))
 
-#define INIT_PRIO(min_level_val, max_ft_val,\
- ...) {.type = FS_TYPE_PRIO,\
+#define ADD_PRIO(min_level_val, max_ft_val, caps_val,\
+...) {.type = FS_TYPE_PRIO,\
.min_ft_level = min_level_val,\
.max_ft = max_ft_val,\
+   .caps = caps_val,\
.children = (struct init_tree_node[]) {__VA_ARGS__},\
.ar_size = INIT_TREE_NODE_ARRAY_SIZE(__VA_ARGS__) \
 }
 
-#define ADD_PRIO(min_level_val, max_ft_val, ...)\
-   INIT_PRIO(min_level_val, max_ft_val,\
- __VA_ARGS__)\
-
 #define ADD_FT_PRIO(max_ft_val, ...)\
-   INIT_PRIO(0, max_ft_val,\
+   ADD_PRIO(0, max_ft_val, {},\
  __VA_ARGS__)\
 
 #define ADD_NS(...) {.type = FS_TYPE_NAMESPACE,\
@@ -61,12 +58,26 @@
.ar_size = INIT_TREE_NODE_ARRAY_SIZE(__VA_ARGS__) \
 }
 
+#define INIT_CAPS_ARRAY_SIZE(...) (sizeof((long[]){__VA_ARGS__}) /\
+  sizeof(long))
+
+#define FS_CAP(cap) (__mlx5_bit_off(flow_table_nic_cap, cap))
+
+#define FS_REQUIRED_CAPS(...) {.arr_sz = INIT_CAPS_ARRAY_SIZE(__VA_ARGS__), \
+  .caps = (long[]) {__VA_ARGS__} }
+
 #define KERNEL_MAX_FT 2
 #define KENREL_MIN_LEVEL 2
+
+struct node_caps {
+   size_t  arr_sz;
+   long*caps;
+};
 static struct init_tree_node {
enum fs_node_type   type;
struct init_tree_node *children;
int ar_size;
+   struct node_caps caps;
int min_ft_level;
int prio;
int max_ft;
@@ -74,7 +85,7 @@ static struct init_tree_node {
.type = FS_TYPE_NAMESPACE,
.ar_size = 1,
.children = (struct init_tree_node[]) {
-   ADD_PRIO(KENREL_MIN_LEVEL, 0,
+   ADD_PRIO(KENREL_MIN_LEVEL, 0, {},
 ADD_NS(ADD_FT_PRIO(KERNEL_MAX_FT))),
}
 };
@@ -1153,11 +1164,31 @@ static struct mlx5_flow_namespace 
*fs_create_namespace(struct fs_prio *prio)
return ns;
 }
 
-static int init_root_tree_recursive(int max_ft_level, struct init_tree_node 
*init_node,
+#define FLOW_TABLE_BIT_SZ 1
+#define GET_FLOW_TABLE_CAP(dev, offset) \
+   ((be32_to_cpu(*((__be32 *)(dev->hca_caps_cur[MLX5_CAP_FLOW_TABLE]) + \
+   offset / 32)) >> \
+ (32 - FLOW_TABLE_BIT_SZ - (offset & 0x1f))) & FLOW_TABLE_BIT_SZ)
+static bool has_required_caps(struct mlx5_core_dev *dev, struct node_caps 
*caps)
+{
+   int i;
+
+   for (i = 0; i < caps->arr_sz; i++) {
+   if (!GET_FLOW_TABLE_CAP(dev, caps->caps[i]))
+   return false;
+   }
+   return true;
+}
+
+static int init_root_tree_recursive(struct mlx5_core_dev *dev,
+   struct init_tree_node *init_node,
struct fs_node *fs_parent_node,
struct init_tree_node *init_parent_node,
int index)
 {
+   int max_ft_level = MLX5_CAP_FLOWTABLE(dev,
+ flow_table_properties_nic_receive.
+ max_ft_level);
struct mlx5_flow_namespace *fs_ns;
struct fs_prio *fs_prio;
struct fs_node *base;
@@ -1165,8 +1196,9 @@ static int init_root_tree_recursive(int max_ft_level, 
struct init_tree_node *ini
int err;
 
if (init_node->type == FS_TYPE_PRIO) {
-   if (init_node->min_ft_level > max_ft_level)
-   return -ENOTSUPP;
+   if ((init_node->min_ft_level > max_ft_level) ||
+   !has_required_caps(dev, &init_node->caps))
+   return 0;
 
fs_get_obj(fs_ns, fs_parent_node);
fs_prio = fs_create_prio(fs_ns, index, init_node->max_ft);
@@ -1183,9 +1215,8 @@ static int init_root_tree_recursive(int max_ft_level, 
struct init_tree_node *ini
return -EINVAL;
}
for (i = 0; i < init_node->ar_size; i++) {
-   err = init_root_tree_recursive(max_ft_level,
-  

[PATCH net-next 10/12] net/mlx5_core: Export flow steering API

2016-01-05 Thread Saeed Mahameed
From: Maor Gottlieb 

Add exports to flow steering API for mlx5_ib usage.
The following functions are exported:

1. mlx5_create_auto_grouped_flow_table - used to create flow
table with auto flow grouping management (create and destroy
flow groups). In auto-grouped flow tables, we create groups
automatically if needed (if we don't find an existing
flow group with same match criteria when we add new rule).

2. mlx5_destroy_flow_table - used to destroy a flow table.

3. mlx5_add_flow_rule - used to add flow rule into a flow table.

4. mlx5_del_flow_rule - used to delete flow rule from its flow table.

5. mlx5_get_flow_namespace - used to get a handle to the required
namespace sub-tree.

Signed-off-by: Maor Gottlieb 
Signed-off-by: Moni Shoua 
Signed-off-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 7198528..fa144e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -702,6 +702,7 @@ struct mlx5_flow_table 
*mlx5_create_auto_grouped_flow_table(struct mlx5_flow_nam
 
return ft;
 }
+EXPORT_SYMBOL(mlx5_create_auto_grouped_flow_table);
 
 /* Flow table should be locked */
 static struct mlx5_flow_group *create_flow_group_common(struct mlx5_flow_table 
*ft,
@@ -1013,11 +1014,13 @@ unlock:
unlock_ref_node(&ft->node);
return rule;
 }
+EXPORT_SYMBOL(mlx5_add_flow_rule);
 
 void mlx5_del_flow_rule(struct mlx5_flow_rule *rule)
 {
tree_remove_node(&rule->node);
 }
+EXPORT_SYMBOL(mlx5_del_flow_rule);
 
 /* Assuming prio->node.children(flow tables) is sorted by level */
 static struct mlx5_flow_table *find_next_ft(struct mlx5_flow_table *ft)
@@ -1099,6 +1102,7 @@ int mlx5_destroy_flow_table(struct mlx5_flow_table *ft)
 
return err;
 }
+EXPORT_SYMBOL(mlx5_destroy_flow_table);
 
 void mlx5_destroy_flow_group(struct mlx5_flow_group *fg)
 {
@@ -1143,6 +1147,7 @@ struct mlx5_flow_namespace 
*mlx5_get_flow_namespace(struct mlx5_core_dev *dev,
 
return ns;
 }
+EXPORT_SYMBOL(mlx5_get_flow_namespace);
 
 static struct fs_prio *fs_create_prio(struct mlx5_flow_namespace *ns,
  unsigned prio, int max_ft)
-- 
1.7.1



[PATCH net-next 06/12] net/mlx5_core: Set priority attributes

2016-01-05 Thread Saeed Mahameed
From: Maor Gottlieb 

Each priority has two attributes:
1. max_ft - maximum allowed flow tables under this priority.
2. start_level - start level range of the flow tables
in the priority.

These attributes are set by traversing the tree nodes depth-first
and assigning a start level and a max flow table count to each priority.
The start level depends on the max flow tables of the preceding priorities
in the tree.

The leaves of the tree have max_ft set in them. Each node accumulates
the max_ft of its children and sets its own max_ft accordingly.
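
The traversal can be sketched on a simplified tree. This is only an illustration under assumptions: the real code alternates between priority and namespace nodes, whereas the hypothetical struct prio_node below collapses both into one node type, and set_prio_attrs_sketch is not a function from the patch.

```c
#include <stddef.h>

/* Hypothetical simplified node: leaves carry a preset max_ft, inner
 * nodes start with max_ft == 0 and receive the sum of their children.
 */
struct prio_node {
	int max_ft;
	int start_level;
	struct prio_node *children;
	size_t num_children;
};

/* Depth-first walk. acc_level is the first level still free when this
 * node is reached; the return value is the first level free after it.
 */
int set_prio_attrs_sketch(struct prio_node *node, int acc_level)
{
	int level = acc_level;
	size_t i;

	node->start_level = acc_level;

	/* Each child starts where the previous sibling's range ended. */
	for (i = 0; i < node->num_children; i++)
		level = set_prio_attrs_sketch(&node->children[i], level);

	/* Inner nodes accumulate the max_ft of their children. */
	if (node->num_children)
		node->max_ft = level - node->start_level;

	return node->start_level + node->max_ft;
}
```

With two leaves of size 2 and 3 under one inner node followed by a sibling leaf of size 1, the walk assigns start levels 0, 2 and 5 and gives the inner node a max_ft of 5.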

Signed-off-by: Maor Gottlieb 
Signed-off-by: Moni Shoua 
Signed-off-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |   71 +++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h |3 +
 2 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 4b4f2b8..2c064ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -41,20 +41,19 @@
 sizeof(struct init_tree_node))
 
 #define INIT_PRIO(min_level_val, max_ft_val,\
- start_level_val, ...) {.type = FS_TYPE_PRIO,\
+ ...) {.type = FS_TYPE_PRIO,\
.min_ft_level = min_level_val,\
-   .start_level = start_level_val,\
.max_ft = max_ft_val,\
.children = (struct init_tree_node[]) {__VA_ARGS__},\
.ar_size = INIT_TREE_NODE_ARRAY_SIZE(__VA_ARGS__) \
 }
 
-#define ADD_PRIO(min_level_val, max_ft_val, start_level_val, ...)\
-   INIT_PRIO(min_level_val, max_ft_val, start_level_val,\
+#define ADD_PRIO(min_level_val, max_ft_val, ...)\
+   INIT_PRIO(min_level_val, max_ft_val,\
  __VA_ARGS__)\
 
-#define ADD_FT_PRIO(max_ft_val, start_level_val, ...)\
-   INIT_PRIO(0, max_ft_val, start_level_val,\
+#define ADD_FT_PRIO(max_ft_val, ...)\
+   INIT_PRIO(0, max_ft_val,\
  __VA_ARGS__)\
 
 #define ADD_NS(...) {.type = FS_TYPE_NAMESPACE,\
@@ -62,8 +61,6 @@
.ar_size = INIT_TREE_NODE_ARRAY_SIZE(__VA_ARGS__) \
 }
 
-#define KERNEL_START_LEVEL 0
-#define KERNEL_P0_START_LEVEL KERNEL_START_LEVEL
 #define KERNEL_MAX_FT 2
 #define KENREL_MIN_LEVEL 2
 static struct init_tree_node {
@@ -73,15 +70,12 @@ static struct init_tree_node {
int min_ft_level;
int prio;
int max_ft;
-   int start_level;
 } root_fs = {
.type = FS_TYPE_NAMESPACE,
.ar_size = 1,
.children = (struct init_tree_node[]) {
-   ADD_PRIO(KENREL_MIN_LEVEL, KERNEL_MAX_FT,
-KERNEL_START_LEVEL,
-ADD_NS(ADD_FT_PRIO(KERNEL_MAX_FT,
-   KERNEL_P0_START_LEVEL))),
+   ADD_PRIO(KENREL_MIN_LEVEL, 0,
+ADD_NS(ADD_FT_PRIO(KERNEL_MAX_FT))),
}
 };
 
@@ -1117,8 +,7 @@ struct mlx5_flow_namespace 
*mlx5_get_flow_namespace(struct mlx5_core_dev *dev,
 }
 
 static struct fs_prio *fs_create_prio(struct mlx5_flow_namespace *ns,
- unsigned prio, int max_ft,
- int start_level)
+ unsigned prio, int max_ft)
 {
struct fs_prio *fs_prio;
 
@@ -1131,7 +1124,6 @@ static struct fs_prio *fs_create_prio(struct 
mlx5_flow_namespace *ns,
tree_add_node(&fs_prio->node, &ns->node);
fs_prio->max_ft = max_ft;
fs_prio->prio = prio;
-   fs_prio->start_level = start_level;
list_add_tail(&fs_prio->node.list, &ns->node.children);
 
return fs_prio;
@@ -1177,8 +1169,7 @@ static int init_root_tree_recursive(int max_ft_level, 
struct init_tree_node *ini
return -ENOTSUPP;
 
fs_get_obj(fs_ns, fs_parent_node);
-   fs_prio = fs_create_prio(fs_ns, index, init_node->max_ft,
-init_node->start_level);
+   fs_prio = fs_create_prio(fs_ns, index, init_node->max_ft);
if (IS_ERR(fs_prio))
return PTR_ERR(fs_prio);
base = &fs_prio->node;
@@ -1245,6 +1236,46 @@ static struct mlx5_flow_root_namespace 
*create_root_ns(struct mlx5_core_dev *dev
return root_ns;
 }
 
+static void set_prio_attrs_in_prio(struct fs_prio *prio, int acc_level);
+
+static int set_prio_attrs_in_ns(struct mlx5_flow_namespace *ns, int acc_level)
+{
+   struct fs_prio *prio;
+
+   fs_for_each_prio(prio, ns) {
+/* This updates prio start_level and max_ft */
+   set_prio_attrs_in_prio(prio, acc_level);
+   acc_level += prio->max_ft;
+   }
+   return acc_level;
+}
+
+static void set_prio_attrs_in_prio(struct fs_prio *prio, int acc_level)
+{
+   struct mlx5_flow_namespace *ns;
+   int acc_level_ns = acc_level;
+
+   prio->start_level = acc

[PATCH net-next 08/12] net/mlx5_core: Enable flow steering support for the IB driver

2016-01-05 Thread Saeed Mahameed
From: Maor Gottlieb 

When the driver is loaded, we create a flow steering namespace
for kernel bypass with nine priorities and another namespace
for leftovers (in order to catch packets that weren't matched).
Verbs applications will use these priorities.
We found nine to be a number that balances the requirements from the
user and retains performance.

The bypass namespace is used by verbs applications that want to bypass
the kernel networking stack. The leftovers namespace is used by verbs
applications and the sniffer in order to catch packets that weren't
handled by any preceding rules.

Signed-off-by: Maor Gottlieb 
Signed-off-by: Moni Shoua 
Signed-off-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |   55 ++---
 include/linux/mlx5/device.h   |2 +
 include/linux/mlx5/fs.h   |2 +
 3 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 7e39b69..7198528 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -40,18 +40,19 @@
 #define INIT_TREE_NODE_ARRAY_SIZE(...) (sizeof((struct 
init_tree_node[]){__VA_ARGS__}) /\
 sizeof(struct init_tree_node))
 
-#define ADD_PRIO(min_level_val, max_ft_val, caps_val,\
+#define ADD_PRIO(num_prios_val, min_level_val, max_ft_val, caps_val,\
 ...) {.type = FS_TYPE_PRIO,\
.min_ft_level = min_level_val,\
.max_ft = max_ft_val,\
+   .num_leaf_prios = num_prios_val,\
.caps = caps_val,\
.children = (struct init_tree_node[]) {__VA_ARGS__},\
.ar_size = INIT_TREE_NODE_ARRAY_SIZE(__VA_ARGS__) \
 }
 
-#define ADD_FT_PRIO(max_ft_val, ...)\
-   ADD_PRIO(0, max_ft_val, {},\
- __VA_ARGS__)\
+#define ADD_MULTIPLE_PRIO(num_prios_val, max_ft_val, ...)\
+   ADD_PRIO(num_prios_val, 0, max_ft_val, {},\
+__VA_ARGS__)\
 
 #define ADD_NS(...) {.type = FS_TYPE_NAMESPACE,\
.children = (struct init_tree_node[]) {__VA_ARGS__},\
@@ -66,7 +67,14 @@
 #define FS_REQUIRED_CAPS(...) {.arr_sz = INIT_CAPS_ARRAY_SIZE(__VA_ARGS__), \
   .caps = (long[]) {__VA_ARGS__} }
 
+#define LEFTOVERS_MAX_FT 1
+#define LEFTOVERS_NUM_PRIOS 1
+#define BY_PASS_PRIO_MAX_FT 1
+#define BY_PASS_MIN_LEVEL (KENREL_MIN_LEVEL + MLX5_BY_PASS_NUM_PRIOS +\
+  LEFTOVERS_MAX_FT)
+
 #define KERNEL_MAX_FT 2
+#define KERNEL_NUM_PRIOS 1
 #define KENREL_MIN_LEVEL 2
 
 struct node_caps {
@@ -79,14 +87,27 @@ static struct init_tree_node {
int ar_size;
struct node_caps caps;
int min_ft_level;
+   int num_leaf_prios;
int prio;
int max_ft;
 } root_fs = {
.type = FS_TYPE_NAMESPACE,
-   .ar_size = 1,
+   .ar_size = 3,
.children = (struct init_tree_node[]) {
-   ADD_PRIO(KENREL_MIN_LEVEL, 0, {},
-ADD_NS(ADD_FT_PRIO(KERNEL_MAX_FT))),
+   ADD_PRIO(0, BY_PASS_MIN_LEVEL, 0,
+            FS_REQUIRED_CAPS(FS_CAP(flow_table_properties_nic_receive.flow_modify_en),
+                             FS_CAP(flow_table_properties_nic_receive.modify_root),
+                             FS_CAP(flow_table_properties_nic_receive.identified_miss_table_mode),
+                             FS_CAP(flow_table_properties_nic_receive.flow_table_modify)),
+            ADD_NS(ADD_MULTIPLE_PRIO(MLX5_BY_PASS_NUM_PRIOS, BY_PASS_PRIO_MAX_FT))),
+   ADD_PRIO(0, KENREL_MIN_LEVEL, 0, {},
+            ADD_NS(ADD_MULTIPLE_PRIO(KERNEL_NUM_PRIOS, KERNEL_MAX_FT))),
+   ADD_PRIO(0, BY_PASS_MIN_LEVEL, 0,
+            FS_REQUIRED_CAPS(FS_CAP(flow_table_properties_nic_receive.flow_modify_en),
+                             FS_CAP(flow_table_properties_nic_receive.modify_root),
+                             FS_CAP(flow_table_properties_nic_receive.identified_miss_table_mode),
+                             FS_CAP(flow_table_properties_nic_receive.flow_table_modify)),
+            ADD_NS(ADD_MULTIPLE_PRIO(LEFTOVERS_NUM_PRIOS, LEFTOVERS_MAX_FT))),
}
 };
 
@@ -1098,8 +1119,10 @@ struct mlx5_flow_namespace *mlx5_get_flow_namespace(struct mlx5_core_dev *dev,
return NULL;
 
switch (type) {
+   case MLX5_FLOW_NAMESPACE_BYPASS:
case MLX5_FLOW_NAMESPACE_KERNEL:
-   prio = 0;
+   case MLX5_FLOW_NAMESPACE_LEFTOVERS:
+   prio = type;
break;
case MLX5_FLOW_NAMESPACE_FDB:
if (dev->priv.fdb_root_ns)
@@ -1164,6 +1187,20 @@ static struct mlx5_flow_namespace *fs_create_namespace(struct fs_prio *prio)
return ns;
 }
 
+static in

[PATCH net-next 03/12] net/mlx5_core: Managing root flow table

2016-01-05 Thread Saeed Mahameed
From: Maor Gottlieb 

By default, the root flow table of each flow table type is the
flow table with level 0.

To avoid creating empty flow tables that would add extra hops, while
still reserving room for flow tables with a higher priority (lower
number) than the current one, we introduce the set root flow table
command.
This command tells the HW to start matching packets from the
assigned root flow table.
The command is issued when a new flow table is created with a level
lower than that of the current lowest flow table, or when it is the
first flow table.
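
The rule described above can be sketched in isolation. This is only a
minimal model, not the driver code: struct flow_table, struct
root_namespace, set_hw_root() and update_root_on_create() are hypothetical
stand-ins, and the firmware command is simulated by a counter.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct flow_table {
	unsigned int id;
	int level;
};

struct root_namespace {
	struct flow_table *root_ft;	/* NULL until the first table exists */
	int hw_updates;			/* counts simulated firmware commands */
};

/* Stand-in for the set root flow table command; always succeeds here. */
static int set_hw_root(struct root_namespace *root, struct flow_table *ft)
{
	(void)ft;
	root->hw_updates++;
	return 0;
}

/*
 * Move the root only when the new table has a lower level than the
 * current root, or when there is no root yet.
 */
static int update_root_on_create(struct root_namespace *root,
				 struct flow_table *ft)
{
	int min_level = root->root_ft ? root->root_ft->level : INT_MAX;
	int err;

	if (ft->level >= min_level)
		return 0;	/* current root already has the lowest level */

	err = set_hw_root(root, ft);
	if (!err)
		root->root_ft = ft;	/* record the new root only on success */
	return err;
}
```

In this sketch the software root pointer is updated only after the
simulated command succeeds, so a failed command leaves the recorded root
unchanged.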

Signed-off-by: Maor Gottlieb 
Signed-off-by: Moni Shoua 
Signed-off-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c  |   18 
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h  |2 +
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |   97 +++--
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h |6 ++
 include/linux/mlx5/mlx5_ifc.h |   31 +++-
 5 files changed, 144 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 5096f4f..d8b1195 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -38,6 +38,24 @@
 #include "fs_cmd.h"
 #include "mlx5_core.h"
 
+int mlx5_cmd_update_root_ft(struct mlx5_core_dev *dev,
+   struct mlx5_flow_table *ft)
+{
+   u32 in[MLX5_ST_SZ_DW(set_flow_table_root_in)];
+   u32 out[MLX5_ST_SZ_DW(set_flow_table_root_out)];
+
+   memset(in, 0, sizeof(in));
+
+   MLX5_SET(set_flow_table_root_in, in, opcode,
+MLX5_CMD_OP_SET_FLOW_TABLE_ROOT);
+   MLX5_SET(set_flow_table_root_in, in, table_type, ft->type);
+   MLX5_SET(set_flow_table_root_in, in, table_id, ft->id);
+
+   memset(out, 0, sizeof(out));
+   return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out,
+ sizeof(out));
+}
+
 int mlx5_cmd_create_flow_table(struct mlx5_core_dev *dev,
   enum fs_flow_table_type type, unsigned int level,
   unsigned int log_size, unsigned int *table_id)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
index f39304e..70d18ec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
@@ -62,4 +62,6 @@ int mlx5_cmd_delete_fte(struct mlx5_core_dev *dev,
struct mlx5_flow_table *ft,
unsigned int index);
 
+int mlx5_cmd_update_root_ft(struct mlx5_core_dev *dev,
+   struct mlx5_flow_table *ft);
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index e62cc59..6445489 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -510,6 +510,29 @@ static struct mlx5_flow_table *find_prev_chained_ft(struct fs_prio *prio)
return find_closest_ft(prio, true);
 }
 
+static int update_root_ft_create(struct mlx5_flow_table *ft, struct fs_prio
+*prio)
+{
+   struct mlx5_flow_root_namespace *root = find_root(&prio->node);
+   int min_level = INT_MAX;
+   int err;
+
+   if (root->root_ft)
+   min_level = root->root_ft->level;
+
+   if (ft->level >= min_level)
+   return 0;
+
+   err = mlx5_cmd_update_root_ft(root->dev, ft);
+   if (err)
+   mlx5_core_warn(root->dev, "Update root flow table of id=%u failed\n",
+  ft->id);
+   else
+   root->root_ft = ft;
+
+   return err;
+}
+
 struct mlx5_flow_table *mlx5_create_flow_table(struct mlx5_flow_namespace *ns,
   int prio,
   int max_fte)
@@ -526,14 +549,15 @@ struct mlx5_flow_table *mlx5_create_flow_table(struct mlx5_flow_namespace *ns,
return ERR_PTR(-ENODEV);
}
 
+   mutex_lock(&root->chain_lock);
fs_prio = find_prio(ns, prio);
-   if (!fs_prio)
-   return ERR_PTR(-EINVAL);
-
-   lock_ref_node(&fs_prio->node);
+   if (!fs_prio) {
+   err = -EINVAL;
+   goto unlock_root;
+   }
if (fs_prio->num_ft == fs_prio->max_ft) {
err = -ENOSPC;
-   goto unlock_prio;
+   goto unlock_root;
}
 
ft = alloc_flow_table(find_next_free_level(fs_prio),
@@ -541,7 +565,7 @@ struct mlx5_flow_table *mlx5_create_flow_table(struct mlx5_flow_namespace *ns,
  root->table_type);
if (!ft) {
err = -ENOMEM;
-   goto unlock_prio;
+   

[PATCH net-next 05/12] net/mlx5_core: Connect flow tables

2016-01-05 Thread Saeed Mahameed
From: Maor Gottlieb 

Flow tables from different priorities should be chained together.
When a packet arrives we search for a match in the
by-pass flow tables (first we search for a match in priority 0
and if we don't find a match we move to the next priority).
If we can't find a match in any of the bypass flow-tables, we continue
searching in the flow-tables of the next priority, which are the
kernel's flow tables.

The miss flow table of a new flow table is set to the next one in
the list through the create flow table API. To change an
existing flow table, for example to point an existing flow table
at the new next-in-list flow table, we use the
modify flow table API.
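
The chaining can be illustrated with a toy model. This is a sketch under
simplifying assumptions, not the driver code: each table matches a single
key, and struct flow_table, lookup() and connect_after() are hypothetical
names. Setting new_ft->miss_ft corresponds to what the create flow table
API does, and changing prev->miss_ft corresponds to the modify flow table
API.

```c
#include <stddef.h>

/* Toy flow table: matches exactly one key, otherwise falls through. */
struct flow_table {
	unsigned int id;
	unsigned int match_key;
	struct flow_table *miss_ft;	/* next table on a miss, NULL at the end */
};

/* Follow the miss chain; return the id of the matching table or -1. */
static int lookup(const struct flow_table *root, unsigned int key)
{
	const struct flow_table *ft;

	for (ft = root; ft; ft = ft->miss_ft)
		if (ft->match_key == key)
			return (int)ft->id;
	return -1;
}

/* Link new_ft into the chain directly after prev. */
static void connect_after(struct flow_table *prev, struct flow_table *new_ft)
{
	new_ft->miss_ft = prev->miss_ft;	/* set when the table is created */
	prev->miss_ft = new_ft;			/* modify the existing table */
}
```

A packet that misses in every bypass table reaches the kernel table only
because each table's miss pointer leads to the next one in the list.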

Signed-off-by: Maor Gottlieb 
Signed-off-by: Moni Shoua 
Signed-off-by: Matan Barak 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c  |7 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h  |3 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c |  104 +++--
 3 files changed, 104 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
index 2b55625..a9894d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c
@@ -58,7 +58,8 @@ int mlx5_cmd_update_root_ft(struct mlx5_core_dev *dev,
 
 int mlx5_cmd_create_flow_table(struct mlx5_core_dev *dev,
   enum fs_flow_table_type type, unsigned int level,
-  unsigned int log_size, unsigned int *table_id)
+  unsigned int log_size, struct mlx5_flow_table
+  *next_ft, unsigned int *table_id)
 {
u32 out[MLX5_ST_SZ_DW(create_flow_table_out)];
u32 in[MLX5_ST_SZ_DW(create_flow_table_in)];
@@ -69,6 +70,10 @@ int mlx5_cmd_create_flow_table(struct mlx5_core_dev *dev,
MLX5_SET(create_flow_table_in, in, opcode,
 MLX5_CMD_OP_CREATE_FLOW_TABLE);
 
+   if (next_ft) {
+   MLX5_SET(create_flow_table_in, in, table_miss_mode, 1);
+   MLX5_SET(create_flow_table_in, in, table_miss_id, next_ft->id);
+   }
MLX5_SET(create_flow_table_in, in, table_type, type);
MLX5_SET(create_flow_table_in, in, level, level);
MLX5_SET(create_flow_table_in, in, log_size, log_size);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
index 1ae9b68..9814d47 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.h
@@ -35,7 +35,8 @@
 
 int mlx5_cmd_create_flow_table(struct mlx5_core_dev *dev,
   enum fs_flow_table_type type, unsigned int level,
-  unsigned int log_size, unsigned int *table_id);
+  unsigned int log_size, struct mlx5_flow_table
+  *next_ft, unsigned int *table_id);
 
 int mlx5_cmd_destroy_flow_table(struct mlx5_core_dev *dev,
struct mlx5_flow_table *ft);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 6445489..4b4f2b8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -510,6 +510,48 @@ static struct mlx5_flow_table *find_prev_chained_ft(struct fs_prio *prio)
return find_closest_ft(prio, true);
 }
 
+static int connect_fts_in_prio(struct mlx5_core_dev *dev,
+  struct fs_prio *prio,
+  struct mlx5_flow_table *ft)
+{
+   struct mlx5_flow_table *iter;
+   int i = 0;
+   int err;
+
+   fs_for_each_ft(iter, prio) {
+   i++;
+   err = mlx5_cmd_modify_flow_table(dev,
+iter,
+ft);
+   if (err) {
+   mlx5_core_warn(dev, "Failed to modify flow table %d\n",
+  iter->id);
+   /* The driver is out of sync with the FW */
+   if (i > 1)
+   WARN_ON(true);
+   return err;
+   }
+   }
+   return 0;
+}
+
+/* Connect flow tables from previous priority of prio to ft */
+static int connect_prev_fts(struct mlx5_core_dev *dev,
+   struct mlx5_flow_table *ft,
+   struct fs_prio *prio)
+{
+   struct mlx5_flow_table *prev_ft;
+
+   prev_ft = find_prev_chained_ft(prio);
+   if (prev_ft) {
+   struct fs_prio *prev_prio;
+
+   fs_get_obj(prev_prio, prev_ft->node.parent);
+   return connect_fts_in_prio(dev, prev_prio, ft);
+   }
+   retur
