[PATCH] vmxnet3: fix lock imbalance in vmxnet3_tq_xmit()

2016-03-14 Thread Arnd Bergmann
A recent bug fix rearranged the code in vmxnet3_tq_xmit() in a
way that left the error handling for oversized headers unlocking
a lock that had not been taken yet. Gcc warns about the incorrect
use of the 'flags' variable because of that:

drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_tq_xmit.constprop':
include/linux/spinlock.h:246:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This changes the error handling path to 'goto' the end of the function
beyond the lock/unlock pair.

Signed-off-by: Arnd Bergmann 
Fixes: cec05562fb1d ("vmxnet3: avoid calling pskb_may_pull with interrupts disabled")
---
 drivers/net/vmxnet3/vmxnet3_drv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index fc895d0e85d9..b2348f67b00a 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1022,14 +1022,16 @@ vmxnet3_tq_xmit(struct sk_buff *skb, struct vmxnet3_tx_queue *tq,
if (ctx.mss) {
if (unlikely(ctx.eth_ip_hdr_size + ctx.l4_hdr_size >
 VMXNET3_MAX_TX_BUF_SIZE)) {
-   goto hdr_too_big;
+   tq->stats.drop_oversized_hdr++;
+   goto drop_pkt;
}
} else {
if (skb->ip_summed == CHECKSUM_PARTIAL) {
if (unlikely(ctx.eth_ip_hdr_size +
 skb->csum_offset >
 VMXNET3_MAX_CSUM_OFFSET)) {
-   goto hdr_too_big;
+   tq->stats.drop_oversized_hdr++;
+   goto drop_pkt;
}
}
}
@@ -1123,8 +1125,6 @@ vmxnet3_tq_xmit(struct sk_buff *skb, struct vmxnet3_tx_queue *tq,
 
return NETDEV_TX_OK;
 
-hdr_too_big:
-   tq->stats.drop_oversized_hdr++;
 unlock_drop_pkt:
spin_unlock_irqrestore(&tq->tx_lock, flags);
 drop_pkt:
-- 
2.7.0
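
The label ordering that the patch relies on can be sketched in isolation.
This is a minimal userspace illustration, not the driver code: struct
demo_queue, demo_lock(), demo_unlock() and demo_xmit() are hypothetical
stand-ins, and the "lock" is only a depth counter so that an imbalance is
easy to observe.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's TX queue; not the vmxnet3 type. */
struct demo_queue {
	int lock_depth;                   /* +1 on lock, -1 on unlock */
	unsigned long drop_oversized_hdr; /* drops caused by oversized headers */
	unsigned long drop_total;         /* all drops */
};

static void demo_lock(struct demo_queue *q)   { q->lock_depth++; }
static void demo_unlock(struct demo_queue *q) { q->lock_depth--; }

/* Returns 0 when the packet is "sent", -1 when it is dropped. */
static int demo_xmit(struct demo_queue *q, bool hdr_too_big, bool ring_full)
{
	/* This check runs before the lock is taken, so it must jump
	 * past the unlock, directly to drop_pkt. */
	if (hdr_too_big) {
		q->drop_oversized_hdr++;
		goto drop_pkt;
	}

	demo_lock(q);

	/* Failures detected while holding the lock go through the unlock. */
	if (ring_full)
		goto unlock_drop_pkt;

	demo_unlock(q);
	return 0;

unlock_drop_pkt:
	demo_unlock(q);
drop_pkt:
	q->drop_total++;
	return -1;
}
```

With this ordering, lock_depth is back at zero on every exit path,
whichever label the error handling jumps to.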



Re: Generic TSO

2016-03-14 Thread Alexander Duyck
On Mon, Mar 14, 2016 at 3:32 AM, Edward Cree  wrote:
> On 14/03/16 10:26, Edward Cree wrote:
>> On 12/03/16 05:40, Alexander Duyck wrote:
>>> Well that is the thing.  Before we can actually start tinkering with
>>> the outer header we probably need to make sure we set the DF bit and
>>> that it would be honored on the outer headers for IPv4.  I don't
>>> believe any of the tunnels are currently doing that so repeating the
>>> IP ID would be the worst possible scenario until that is resolved
>>> since VXLAN tunneled frames can be fragmented while TCP frames cannot
>>> so we really shouldn't be repeating IP IDs for the outer headers.
>> So how do we progress with that?  I'm presuming it's not as simple as
>> just patching the tunnel drivers to set DF if the inner packet has it,
>> as that could break existing setups.  (I've heard that "but they're
>> already broken anyway" is not usually an acceptable argument.)  Some
>> sort of configuration option on the tunnel (like we do with udpcsum)?
> ...and immediately I find out it already exists.  (I guess I should have
> looked there first!)
> From drivers/net/vxlan.c:2001:
>> else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
>> df = htons(IP_DF);

I'm still not a fan of trying to freeze the outer IP header.  I think
the outer header should be the one whose IP ID increments, while the
inner IP header is the one that is frozen.  Maybe that is where we can
differ per device.  I would be okay with the outer tunnel headers and
inner IP header being frozen on ixgbe, which will be needed in order to
compute the outer UDP checksum anyway.  Then we could leave it up to the
driver's discretion whether the outer header or the inner header has
the IP ID that increments.

- Alex


Re: When will net-next merge with linux-next?

2016-03-14 Thread gre...@linuxfoundation.org
On Mon, Mar 14, 2016 at 06:09:41AM +, Dexuan Cui wrote:
> Hi David,
> I have a pending patch of the hv_sock driver, which should go into the
> kernel through the net-next tree:
> https://lkml.org/lkml/2016/2/14/7
> 
> The VMBus side's supporting patches of hv_sock have been in Greg's tree
> and linux-next for more than 1 month, but they haven't been in net-next
> yet, I suppose this is because of the releasing of 4.5.
> 
> Now 4.5 is released. Will you merge with Greg's tree or linux-next?

linux-next is a merge of all of the maintainers' trees, and it is
rebased every day, so it's impossible to merge that back into a
maintainer's tree, sorry.

> I read netdev-FAQ.txt, but still don't have a clear idea about how things
> work in my case.

Try reading Documentation/development-process/ please.  Things will get
merged together into Linus's tree over the next 2 weeks as we ask him to
pull our trees.

thanks,

greg k-h


Re: [PATCH net-next 1/2] rtnetlink: add new RTM_GETSTATS message to dump link stats

2016-03-14 Thread Jiri Pirko
Sun, Mar 13, 2016 at 02:56:25AM CET, ro...@cumulusnetworks.com wrote:
>From: Roopa Prabhu 
>
>This patch adds a new RTM_GETSTATS message to query link stats via netlink
>from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
>returns a lot more than just stats and is expensive in some cases when
>frequent polling for stats from userspace is a common operation.
>
>RTM_GETSTATS is an attempt to provide a light weight netlink message
>to explicitly query only link stats from the kernel on an interface.
>The idea is to also keep it extensible so that new kinds of stats can be
>added to it in the future.
>
>This patch adds the following attribute for NETDEV stats:
>struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
>[IFLA_STATS_LINK64]  = { .len = sizeof(struct rtnl_link_stats64) },
>};
>
>This patch also allows for af family stats (an example af stats for IPV6
>is available with the second patch in the series).
>
>Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
>a single interface or all interfaces with NLM_F_DUMP.
>
>Future possible new types of stat attributes:
>- IFLA_MPLS_STATS  (nested. for mpls/mdev stats)
>- IFLA_EXTENDED_STATS (nested. extended software netdev stats like bridge,
>  vlan, vxlan etc)
>- IFLA_EXTENDED_HW_STATS (nested. extended hardware stats which are
>  available via ethtool today)
>
>This patch also declares a filter mask for all stat attributes.
>User has to provide a mask of stats attributes to query. This will be
>specified in a new hdr 'struct if_stats_msg' for stats messages.
>
>Without any attributes in the filter_mask, no stats will be returned.
>
>This patch has been tested with modified iproute2 ifstat.
>
>Suggested-by: Jamal Hadi Salim 
>Signed-off-by: Roopa Prabhu 
>---
> include/net/rtnetlink.h|   5 ++
> include/uapi/linux/if_link.h   |  19 
> include/uapi/linux/rtnetlink.h |   7 ++
> net/core/rtnetlink.c   | 200 +
> 4 files changed, 231 insertions(+)
>
>diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
>index 2f87c1b..fa68158 100644
>--- a/include/net/rtnetlink.h
>+++ b/include/net/rtnetlink.h
>@@ -131,6 +131,11 @@ struct rtnl_af_ops {
>   const struct nlattr *attr);
>   int (*set_link_af)(struct net_device *dev,
>  const struct nlattr *attr);
>+  size_t  (*get_link_af_stats_size)(const struct 
>net_device *dev,
>+u32 filter_mask);
>+  int (*fill_link_af_stats)(struct sk_buff *skb,
>+const struct net_device 
>*dev,
>+u32 filter_mask);
> };
> 
> void __rtnl_af_unregister(struct rtnl_af_ops *ops);
>diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>index 249eef9..0840f3e 100644
>--- a/include/uapi/linux/if_link.h
>+++ b/include/uapi/linux/if_link.h
>@@ -741,4 +741,23 @@ enum {
> 
> #define IFLA_HSR_MAX (__IFLA_HSR_MAX - 1)
> 
>+/* STATS section */
>+
>+struct if_stats_msg {
>+  __u8  family;
>+  __u32 ifindex;
>+  __u32 filter_mask;

This limits future extension to only 32 groups of stats. I can imagine
that more than that could easily be added. Why don't you use a nested
attribute IFLA_STATS_FILTER with flag attributes for every type? That
would be easily extensible.

Using a netlink header struct for this does not look correct to me.
In the past, this was done many times and turned out to be a problem later.



>+};
>+
>+enum {
>+  IFLA_STATS_UNSPEC,
>+  IFLA_STATS_LINK64,
>+  IFLA_STATS_INET6,
>+  __IFLA_STATS_MAX,
>+};
>+
>+#define IFLA_STATS_MAX (__IFLA_STATS_MAX - 1)
>+
>+#define IFLA_STATS_FILTER_BIT(ATTR)   (1 << (ATTR))
>+
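
The 32-group ceiling raised in the review comment above follows directly
from the width of filter_mask. A minimal sketch, assuming the attribute
numbering from the quoted enum; DEMO_STATS_LINK64, DEMO_STATS_INET6 and
demo_stats_filter_bit() are illustrative names, not part of the patch:

```c
#include <stdint.h>

/* Attribute ids numbered as in the quoted enum (UNSPEC = 0). */
enum {
	DEMO_STATS_UNSPEC,
	DEMO_STATS_LINK64,
	DEMO_STATS_INET6,
};

/*
 * Hypothetical helper: computes the filter bit for an attribute id.
 * Returns 0 and stores the bit on success, -1 when the id cannot be
 * represented in a 32-bit mask.
 */
static int demo_stats_filter_bit(unsigned int attr, uint32_t *bit)
{
	if (attr >= 32)
		return -1;
	/* Shift an unsigned value so that bit 31 stays well defined. */
	*bit = (uint32_t)1 << attr;
	return 0;
}
```

Attribute ids 0 through 31 map to a bit; id 32 and above have no
representation, which is the extensibility limit being discussed.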


[PATCH trivial] netfilter: xt_limit: Spelling s/maxmum/maximum/

2016-03-14 Thread Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven 
---
 net/netfilter/xt_limit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c
index bef8505965589298..c1a4d5bf25d5bf9f 100644
--- a/net/netfilter/xt_limit.c
+++ b/net/netfilter/xt_limit.c
@@ -47,7 +47,7 @@ static DEFINE_SPINLOCK(limit_lock);
 
See Alexey's formal explanation in net/sched/sch_tbf.c.
 
-   To get the maxmum range, we multiply by this factor (ie. you get N
+   To get the maximum range, we multiply by this factor (ie. you get N
credits per jiffy).  We want to allow a rate as low as 1 per day
(slowest userspace tool allows), which means
CREDITS_PER_JIFFY*HZ*60*60*24 < 2^32. ie. */
-- 
1.9.1
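
The constraint in the comment touched by this patch,
CREDITS_PER_JIFFY*HZ*60*60*24 < 2^32, can be worked through for concrete
HZ values. A minimal sketch; demo_max_credits_per_jiffy() is a
hypothetical helper, and it assumes hz is small enough that hz * 86400
itself fits in 32 bits:

```c
#include <stdint.h>

/*
 * Largest credits-per-jiffy value for which one day's worth of credits
 * (cpj * hz * 60 * 60 * 24) still fits in an unsigned 32-bit counter.
 */
static uint32_t demo_max_credits_per_jiffy(uint32_t hz)
{
	uint32_t jiffies_per_day = hz * 60u * 60u * 24u;

	return UINT32_MAX / jiffies_per_day;
}
```

For example, hz = 1000 gives 86,400,000 jiffies per day, so at most 49
credits per jiffy keep the daily total below 2^32.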



Re: pull-request: wireless-drivers-next 2016-03-14

2016-03-14 Thread David Miller
From: Kalle Valo 
Date: Mon, 14 Mar 2016 10:31:48 +0200

> I know I'm late now that merge window was opened yesterday but here's
> one more set of patches I would like to get to 4.6 still. There isn't
> anything controversial so I hope this should be still safe to pull. The
> patches have been in linux-next since Friday and I haven't seen any
> reports about issues. But if you think it's too late just let me know
> and I'll resubmit these for 4.7.
> 
> The most notable part here of course is rtl8xxxu with over 100 patches.
> As the driver is new and under heavy development I think they are ok to
> take still. Otherwise there are mostly fixes with an exception of adding
> a new debugfs file to wl18xx.
> 
> Please let me know if you have any problems.

Pulled, thanks.

I really like Jes's work and I wish you had integrated it several
months ago, instead of sloshing him needlessly through a non-stop
cycle of very nit-picky issues, just FYI.


RE: [ethtool PATCH v4 10/11] ethtool.c: add support for ETHTOOL_xLINKSETTINGS ioctls

2016-03-14 Thread David Laight
> > +   /* ignore optional '0x' prefix */
> > +   if ((slen > 2) && (

Unnecessary ().

> > +   (0 == memcmp(s, "0x", 2)
> > +    || (0 == memcmp(s, "0X", 2) {

A-about-F comparisons.

> memcmp() is a really poor tool for comparing strings.  You should use
> strncasecmp() here.

Even that is overkill, why not just:
if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))

David
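
The suggested character comparison can be wrapped in a small helper. A
minimal sketch, assuming s is a NUL-terminated string; skip_hex_prefix()
is a hypothetical name, not a function from the ethtool patch:

```c
/*
 * Hypothetical helper: returns s advanced past an optional "0x" or "0X"
 * prefix, or s unchanged when there is no such prefix.
 */
static const char *skip_hex_prefix(const char *s)
{
	/* s[1] is only read when s[0] == '0', so at worst it is the
	 * terminating NUL and the read stays inside the string. */
	if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
		return s + 2;
	return s;
}
```

Unlike the quoted code, this sketch does not require a digit to follow
the prefix; a caller that needs that would still check the remainder.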



[PATCH] fsl/fman: fix dtsec_set_tx_pause_frames

2016-03-14 Thread igal.liberman
From: Igal Liberman 

Fix a bug introduced in e06a03b (fsl/fman: fix the pause_time test).
When pause_time is set to '0', pause frames are disabled and
there's no need to apply the dTSEC-A003 Errata workaround.

Signed-off-by: Igal Liberman 
---
 drivers/net/ethernet/freescale/fman/fman_dtsec.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fman/fman_dtsec.c b/drivers/net/ethernet/freescale/fman/fman_dtsec.c
index 7c92eb8..c88918c 100644
--- a/drivers/net/ethernet/freescale/fman/fman_dtsec.c
+++ b/drivers/net/ethernet/freescale/fman/fman_dtsec.c
@@ -932,15 +932,14 @@ int dtsec_set_tx_pause_frames(struct fman_mac *dtsec,
if (!is_init_done(dtsec->dtsec_drv_param))
return -EINVAL;
 
-   /* FM_BAD_TX_TS_IN_B_2_B_ERRATA_DTSEC_A003 Errata workaround */
-   if (dtsec->fm_rev_info.major == 2)
-   if (pause_time <= 320) {
+   if (pause_time) {
+   /* FM_BAD_TX_TS_IN_B_2_B_ERRATA_DTSEC_A003 Errata workaround */
+   if (dtsec->fm_rev_info.major == 2 && pause_time <= 320) {
pr_warn("pause-time: %d illegal.Should be > 320\n",
pause_time);
return -EINVAL;
}
 
-   if (pause_time) {
ptv = ioread32be(&regs->ptv);
ptv &= PTV_PTE_MASK;
ptv |= pause_time & PTV_PT_MASK;
-- 
1.7.9.5
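
The control flow after this patch can be sketched as a standalone
validation step. This is an illustration only: demo_check_pause_time() is
a hypothetical helper, and the register programming that follows in the
real function is left out.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patched ordering of the checks.
 * Returns 0 when pause_time is acceptable, -EINVAL otherwise.
 */
static int demo_check_pause_time(unsigned int fm_rev_major, uint16_t pause_time)
{
	/* Zero disables pause frames, so the errata limit does not apply. */
	if (pause_time == 0)
		return 0;

	/* Errata workaround: on major revision 2 the value must be > 320. */
	if (fm_rev_major == 2 && pause_time <= 320)
		return -EINVAL;

	return 0;
}
```

Before the fix, a pause_time of 0 on a revision 2 device would have been
rejected by the "<= 320" test instead of disabling pause frames.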



[PATCH net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-14 Thread Sowmini Varadhan

For LLC based protocols like lldp, stp etc., the ethernet header
is an 802.3 header with a h_proto that is not 0x800, 0x86dd, or
even 0x806.  In this world, the skb_network_header() points at
the DSAP/SSAP/..  and is not likely to be NET_IP_ALIGNed in
ixgbe_atr().

With LLC, drivers are not likely to correctly find IPVERSION,
or "6", at hdr.ipv4->version, but will instead just needlessly
trigger an unaligned access. (IPv4/IPv6 over LLC is almost never
implemented).

The unaligned access is thus avoidable: bail out quickly after
examining skb->protocol.

Signed-off-by: Sowmini Varadhan 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 4d6223d..c3885a8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7602,6 +7602,11 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
 #endif /* CONFIG_IXGBE_VXLAN */
}
 
+   if (skb->protocol != htons(ETH_P_IP) &&
+   skb->protocol != htons(ETH_P_IPV6) &&
+   skb->protocol != htons(ETH_P_ARP))
+   return;
+
/* Currently only IPv4/IPv6 with TCP is supported */
switch (hdr.ipv4->version) {
case IPVERSION:
-- 
1.7.1
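
The early bail-out on the EtherType can be shown outside the driver. A
minimal userspace sketch; the DEMO_ETH_P_* constants repeat the values
named in the commit message, and demo_may_parse_ip_header() is a
hypothetical helper, not an ixgbe function:

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* EtherType values from the commit message, in host byte order. */
#define DEMO_ETH_P_IP   0x0800
#define DEMO_ETH_P_ARP  0x0806
#define DEMO_ETH_P_IPV6 0x86DD

/*
 * protocol is expected in network byte order, as skb->protocol is.
 * Returns true only for the EtherTypes the patch lets through; any
 * other value (for example an 802.3 length field) is rejected before
 * the network header would be dereferenced.
 */
static bool demo_may_parse_ip_header(uint16_t protocol)
{
	return protocol == htons(DEMO_ETH_P_IP) ||
	       protocol == htons(DEMO_ETH_P_IPV6) ||
	       protocol == htons(DEMO_ETH_P_ARP);
}
```

Comparing against htons() of a constant keeps the byte swap on the
constant side, so the packet field is compared as stored.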



Re: [PATCH net-next 1/2] rtnetlink: add new RTM_GETSTATS message to dump link stats

2016-03-14 Thread Nicolas Dichtel

Le 13/03/2016 02:56, Roopa Prabhu a écrit :

From: Roopa Prabhu 

This patch adds a new RTM_GETSTATS message to query link stats via netlink
from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
returns a lot more than just stats and is expensive in some cases when
frequent polling for stats from userspace is a common operation.

RTM_GETSTATS is an attempt to provide a light weight netlink message
to explicitly query only link stats from the kernel on an interface.
The idea is to also keep it extensible so that new kinds of stats can be
added to it in the future.

This patch adds the following attribute for NETDEV stats:
struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
 [IFLA_STATS_LINK64]  = { .len = sizeof(struct rtnl_link_stats64) },
};

This patch also allows for af family stats (an example af stats for IPV6
is available with the second patch in the series).

Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
a single interface or all interfaces with NLM_F_DUMP.

Future possible new types of stat attributes:
- IFLA_MPLS_STATS  (nested. for mpls/mdev stats)
- IFLA_EXTENDED_STATS (nested. extended software netdev stats like bridge,
   vlan, vxlan etc)
- IFLA_EXTENDED_HW_STATS (nested. extended hardware stats which are
   available via ethtool today)

This patch also declares a filter mask for all stat attributes.
User has to provide a mask of stats attributes to query. This will be
specified in a new hdr 'struct if_stats_msg' for stats messages.

Without any attributes in the filter_mask, no stats will be returned.

This patch has been tested with modified iproute2 ifstat.

Suggested-by: Jamal Hadi Salim 
Signed-off-by: Roopa Prabhu 

[snip]

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 249eef9..0840f3e 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h

[snip]

+enum {
+   IFLA_STATS_UNSPEC,
+   IFLA_STATS_LINK64,
+   IFLA_STATS_INET6,

IFLA_STATS_INET6 is part of patch #2, it's not used in this patch.


+   __IFLA_STATS_MAX,
+};
+
+#define IFLA_STATS_MAX (__IFLA_STATS_MAX - 1)
+
+#define IFLA_STATS_FILTER_BIT(ATTR)(1 << (ATTR))
+
  #endif /* _UAPI_LINUX_IF_LINK_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index ca764b5..2bbb300 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -139,6 +139,13 @@ enum {
RTM_GETNSID = 90,
  #define RTM_GETNSID RTM_GETNSID

+   RTM_NEWSTATS = 92,
+#define RTM_NEWSTATS RTM_NEWSTATS
+   RTM_DELSTATS = 93,
+#define RTM_DELSTATS RTM_DELSTATS

RTM_DELSTATS is never used.


+   RTM_GETSTATS = 94,
+#define RTM_GETSTATS RTM_GETSTATS
+
__RTM_MAX,
  #define RTM_MAX   (((__RTM_MAX + 3) & ~3) - 1)
  };
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d2d9e5e..d1e3d17 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c

[snip]

+static noinline size_t if_nlmsg_stats_size(const struct net_device *dev,
+  u32 filter_mask)

Why are you using the 'noinline' attribute?


+{
+   size_t size = 0;
+
+   if (filter_mask & IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK64))
+   size += nla_total_size(sizeof(struct rtnl_link_stats64));
+
+   size += rtnl_link_get_af_stats_size(dev, filter_mask);
+
+   return size;
+}




Re: [PATCH v3 0/8] arm64: rockchip: Initial GeekBox enablement

2016-03-14 Thread Giuseppe CAVALLARO

Hi Tomeu

On 3/14/2016 12:43 PM, Tomeu Vizoso wrote:

Hi Peppe,

with that patch I don't see any difference at all in my setup.

So to be clear, with these commits on top of next-20160314, I still
get the hang during boot:

209afef6f0cd ARM: dts: rockchip: Add mdio node to ethernet node
2315acc6cf7f Revert "stmmac: first frame prep at the end of xmit routine"
b5e08e810c63 stmmac: fix tx prepare for normal desc
37c15a31d850 i2c: immediately mark ourselves as registered
4342eec3c5a2 Add linux-next specific files for 20160314

[   27.521026] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:303
dev_watchdog+0x284/0x288
[   27.529460] NETDEV WATCHDOG: eth0 (rk_gmac-dwmac): transmit queue 0 timed out


I cannot reproduce the WATCHDOG, but I am continuing to look at the code
to understand whether normal descriptor management is OK or not. I will
keep you informed.

One question: did you test with 2315acc6cf7f included? I ask just to
understand whether it is introducing a problem. It works when
enhanced descriptors are used instead.



https://git.collabora.com/cgit/user/tomeu/linux.git/log/?h=broken-eth-on-rock2


thx I will take a look at this

Regards
Peppe




Re: [PATCH v2 0/2] net: thunderx: Performance enhancement changes

2016-03-14 Thread David Miller
From: sunil.kovv...@gmail.com
Date: Mon, 14 Mar 2016 16:36:13 +0530

> The patches below attempt to improve performance by reducing
> the number of atomic operations while allocating new receive buffers
> and by reducing cache misses by adjusting nicvf structure elements.
> 
> Changes from v1:
>  No changes, resubmitting afresh as per David's suggestion.

Series applied, thanks.


RE: [PATCH net-next 1/2] rtnetlink: add new RTM_GETSTATS message to dump link stats

2016-03-14 Thread Elad Raz


> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
> On Behalf Of Roopa Prabhu
> Sent: Sunday, March 13, 2016 3:56 AM
> To: netdev@vger.kernel.org
> Cc: j...@mojatatu.com; da...@davemloft.net
> Subject: [PATCH net-next 1/2] rtnetlink: add new RTM_GETSTATS message to
> dump link stats
> 
> From: Roopa Prabhu 
> 
> This patch adds a new RTM_GETSTATS message to query link stats via
> netlink from the kernel. RTM_NEWLINK also dumps stats today, but
> RTM_NEWLINK returns a lot more than just stats and is expensive in some
> cases when frequent polling for stats from userspace is a common
> operation.
> 
> RTM_GETSTATS is an attempt to provide a light weight netlink message to
> explicitly query only link stats from the kernel on an interface.
> The idea is to also keep it extensible so that new kinds of stats can be
> added to it in the future.
> 
> This patch adds the following attribute for NETDEV stats:
> struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
> [IFLA_STATS_LINK64]  = { .len = sizeof(struct rtnl_link_stats64)
> }, };
> 
> This patch also allows for af family stats (an example af stats for IPV6
> is available with the second patch in the series).
> 
> Like any other rtnetlink message, RTM_GETSTATS can be used to get stats
> of a single interface or all interfaces with NLM_F_DUMP.
> 
> Future possible new types of stat attributes:
> - IFLA_MPLS_STATS  (nested. for mpls/mdev stats)
> - IFLA_EXTENDED_STATS (nested. extended software netdev stats like
> bridge,
>   vlan, vxlan etc)
> - IFLA_EXTENDED_HW_STATS (nested. extended hardware stats which are
>   available via ethtool today)
> 
> This patch also declares a filter mask for all stat attributes.
> User has to provide a mask of stats attributes to query. This will be
> specified in a new hdr 'struct if_stats_msg' for stats messages.
> 
> Without any attributes in the filter_mask, no stats will be returned.
> 
> This patch has been tested with modified iproute2 ifstat.
> 
> Suggested-by: Jamal Hadi Salim 
> Signed-off-by: Roopa Prabhu 
> ---
>  include/net/rtnetlink.h|   5 ++
>  include/uapi/linux/if_link.h   |  19 
>  include/uapi/linux/rtnetlink.h |   7 ++
>  net/core/rtnetlink.c   | 200
> +
>  4 files changed, 231 insertions(+)
> 
> diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h index
> 2f87c1b..fa68158 100644
> --- a/include/net/rtnetlink.h
> +++ b/include/net/rtnetlink.h
> @@ -131,6 +131,11 @@ struct rtnl_af_ops {
>   const struct nlattr *attr);
>   int (*set_link_af)(struct net_device *dev,
>  const struct nlattr *attr);
> + size_t  (*get_link_af_stats_size)(const struct
> net_device *dev,
> +   u32 filter_mask);
> + int (*fill_link_af_stats)(struct sk_buff *skb,
> +   const struct net_device 
> *dev,
> +   u32 filter_mask);
>  };
> 
>  void __rtnl_af_unregister(struct rtnl_af_ops *ops); diff --git
> a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index
> 249eef9..0840f3e 100644
> --- a/include/uapi/linux/if_link.h
> +++ b/include/uapi/linux/if_link.h
> @@ -741,4 +741,23 @@ enum {
> 
>  #define IFLA_HSR_MAX (__IFLA_HSR_MAX - 1)
> 
> +/* STATS section */
> +
> +struct if_stats_msg {
> + __u8  family;
> + __u32 ifindex;
> + __u32 filter_mask;
> +};
> +
> +enum {
> + IFLA_STATS_UNSPEC,
> + IFLA_STATS_LINK64,
> + IFLA_STATS_INET6,
> + __IFLA_STATS_MAX,
> +};
> +
> +#define IFLA_STATS_MAX (__IFLA_STATS_MAX - 1)
> +
> +#define IFLA_STATS_FILTER_BIT(ATTR)  (1 << (ATTR))
> +
>  #endif /* _UAPI_LINUX_IF_LINK_H */
> diff --git a/include/uapi/linux/rtnetlink.h
> b/include/uapi/linux/rtnetlink.h index ca764b5..2bbb300 100644
> --- a/include/uapi/linux/rtnetlink.h
> +++ b/include/uapi/linux/rtnetlink.h
> @@ -139,6 +139,13 @@ enum {
>   RTM_GETNSID = 90,
>  #define RTM_GETNSID RTM_GETNSID
> 
> + RTM_NEWSTATS = 92,
> +#define RTM_NEWSTATS RTM_NEWSTATS

I think that RTM_NEWSTATS and RTM_DELSTATS aren't good names, since the user
doesn't add or delete statistics but only queries them.
Maybe just stay with RTM_GETSTATS, and the message back to the user will be
RTM_GETSTATS as well?

> + RTM_DELSTATS = 93,
> +#define RTM_DELSTATS RTM_DELSTATS

This is not used.

> + RTM_GETSTATS = 94,
> +#define RTM_GETSTATS RTM_GETSTATS
> +
>   __RTM_MAX,
>  #define RTM_MAX  (((__RTM_MAX + 3) & ~3) - 1)
>  };
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index
> d2d9e5e..d1e3d17 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -3410,6 +3410,203 @@ out:
>   return 

Re: Backport patch from 4.2 to 3.18

2016-03-14 Thread Sasha Levin
On 03/04/2016 04:26 PM, Sasha Levin wrote:
> On 03/04/2016 03:40 PM, Andrei Sharaev wrote:
>> > Hi Sasha,
>> > 
>> > Can you backport this patch for "inet-frag-fixes" to linux kernel 3.18 LTS?
>> > http://kernel.suse.com/cgit/kernel/commit/?h=v4.2-rc5=64b892ad2326348a5b8314167590d240e3bcc69e
>> > 
>> > I get 1-5 kernel panics in month for linux kernels 3.18.24-3.18.26 at my 
>> > NAT server with big IPv4 traffic (10-15 Gbps).
>> > My kernel panics have similar symptoms:
>>> >> <82>general protection fault:  [#1] SMP
>>> >> <82>Modules linked in: bonding ipt_NETFLOW(O) xt_recent configfs 
>>> >> x86_pkg_temp_thermal ixgbe(O)
>>> >> <86>CPU: 13 PID: 29908 Comm: kworker/13:2 Tainted: G  IO   
>>> >> 3.18.26 #1
>>> >> <86>Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS 
>>> >> SE5C610.86B.01.01.0005.101720141054 10/17/2014
>>> >> <82>Workqueue: events inet_frag_worker
>>> >> <86>task: 88046cdba9a0 ti: 880454928000 task.ti: 880454928000
>>> >> <82>RIP: 0010:[]  [] 
>>> >> inet_evict_bucket+0x109/0x160
>>> >> <86>RSP: 0018:88045492bd38  EFLAGS: 00010286
>>> >> <86>RAX: 880441d0e001 RBX: dead001000c0 RCX: 00018030002e
>>> >> <86>RDX: 00018030002f RSI: 880441d0e000 RDI: dead001000c0
>>> >> <86>RBP: 88045492bd88 R08:  R09: 88086cc88500
>>> >> <86>R10: 88046fdb5c50 R11: ea0011074380 R12: 0002
>>> >> <86>R13: 81e02200 R14:  R15: 88083f0942a0
>>> >> <86>FS:  () GS:88046fda() 
>>> >> knlGS:
>>> >> <86>CS:  0010 DS:  ES:  CR0: 80050033
>>> >> <86>CR2: 7fab8f466000 CR3: 00085f66d000 CR4: 001407e0
>>> >> <86>Stack:
>>> >> <82> 81e05a78 81e05a70 88046cdba9a0 88083f0942e0
>>> >> <82> 88046f808c00 0079 81e02200 81e06200
>>> >> <82> 0388 0007 88045492bdf8 815b928a
>>> >> <86>Call Trace:
>>> >> <82> [] inet_frag_worker+0x5a/0x230
>>> >> <82> [] process_one_work+0x12d/0x330
>>> >> <82> [] worker_thread+0x4b/0x450
>>> >> <82> [] ? cancel_delayed_work_sync+0x10/0x10
>>> >> <82> [] kthread+0xc4/0xe0
>>> >> <82> [] ? finish_task_switch+0x49/0xc0
>>> >> <82> [] ? kthread_create_on_node+0x170/0x170
>>> >> <82> [] ret_from_fork+0x58/0x90
>>> >> <82> [] ? kthread_create_on_node+0x170/0x170
>>> >> <82>Code: f6 0f 85 73 ff ff ff 48 8b 45 b8 80 40 08 01 48 8b 7d c8 48 85 
>>> >> ff 74 23 48 83 ef 40 75 0d eb 1b 66 90 48 83 eb 40 48 89 df 74 10 <48> 
>>> >> 8b 5f 40 41 ff 95 70 40 00 00 48 85 db 75 e7 48 83 c4 28 44
>>> >> <22>RIP  [] inet_evict_bucket+0x109/0x160
>>> >> <82> RSP 
>> > 
> Hey Andrei,
> 
> Thanks for the report.
> 
> Usually David Miller (Cc'ed) handles backporting network commits. In this 
> case, I see
> that he has elected not to backport it into 4.1 or 3.18, so I don't want to 
> do it without
> getting an ack from him first.
> 
> David, is it ok to backport these commits back to 3.18 (and probably 4.1)?

Ping?


Re: [PATCH][net-next] ipv6: replace write lock with read lock in addrconf_permanent_addr

2016-03-14 Thread David Miller
From: roy.qing...@gmail.com
Date: Mon, 14 Mar 2016 17:35:08 +0800

> From: Li RongQing 
> 
> nothing of idev is changed, so read lock is enough
> 
> Signed-off-by: Li RongQing 

We need it for the modifications made by fixup_permanent_addr().


Re: [PATCH 1/1] net: Fix use after free in the recvmmsg exit path

2016-03-14 Thread David Miller
From: Arnaldo Carvalho de Melo 
Date: Mon, 14 Mar 2016 09:56:35 -0300

> From: Arnaldo Carvalho de Melo 
> 
> The syzkaller fuzzer hit the following use-after-free:
> 
>   Call Trace:
>[] __asan_report_load8_noabort+0x3e/0x40 
> mm/kasan/report.c:295
>[] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
>[< inline >] SYSC_recvmmsg net/socket.c:2281
>[] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
>[] entry_SYSCALL_64_fastpath+0x16/0x7a
>   arch/x86/entry/entry_64.S:185
> 
> And, as Dmitry rightly assessed, that is because we can drop the
> reference and then touch it when the underlying recvmsg calls return
> some packets and then hit an error, which will make recvmmsg to set
> sock->sk->sk_err, oops, fix it.
> 
> Reported-and-Tested-by: Dmitry Vyukov 
> Cc: Alexander Potapenko 
> Cc: Eric Dumazet 
> Cc: Kostya Serebryany 
> Cc: Sasha Levin 
> Fixes: a2e2725541fa ("net: Introduce recvmmsg socket syscall")
> http://lkml.kernel.org/r/20160122211644.gc2...@redhat.com
> Signed-off-by: Arnaldo Carvalho de Melo 

Applied and queued up for -stable, thanks Arnaldo!


Re: userns, netns, and quick physical memory consumption by unprivileged user

2016-03-14 Thread Yuriy M. Kaminskiy
On 03/14/16 12:14 , Michal Hocko wrote:
> On Fri 11-03-16 18:06:59, Yuriy M. Kaminskiy wrote:
> [...]
>> And also tried with memcg:
>>   t=/sys/fs/cgroup/memory/test1;mkdir $t;echo 0 >$t/tasks;
>>   echo 48M >$t/memory.limit_in_bytes; su testuser [...]
>> and it has not helped at all (rather opposite, it ended up with killed
>> init and kernel panic; well, later is pure (un)luck; but point is, memcg
>> apparently *CANNOT* curb net/ns allocations).
>
> It seems you were using memcg v1 here. This didn't have the kernel
> memory accounting enabled by default. With the v2 you get both user and

Hrr. Indeed. And the (distro) kernel I used was compiled without MEMCG_KMEM,
so this test was useless. (However, as the distro kernel lacks MEMCG_KMEM, it
means most users won't be able to use it either[*], so unprivileged userns are
unsafe to use for all of them and should be disabled.)

That said, I am not sure it would have helped in kernels <= 4.4 (would
those allocations be called in a context that allows them to be
accounted to the [correct] memcg?), but it looks like with the upcoming
change to whitelisting (explicit __GFP_ACCOUNT), it won't (as almost nothing
in net/* uses it).

> kernel (well some subset of it) accounting enabled. Whether we account
> also netns related data structures sufficiently is a question. I haven't

Except for conntrack tables, it is not exactly tied to netns; it's
regular CAP_NET_ADMIN things (routing, addresses, links, iptables, etc.
that can be added via netlink messages). It is just that userns+netns gives
a regular user the right to tweak them.

> checked.  But it would be worth trying and fix.



Re: [PATCH -next] bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict

2016-03-14 Thread Zefir Kurtisi
On 03/12/2016 11:14 AM, Florian Westphal wrote:
> Zefir Kurtisi reported kernel panic with an openwrt specific patch.
> However, it turns out that mainline has a similar bug waiting to happen.
> 
> Once NF_HOOK() returns the skb is in undefined state and must not be
> used.   Moreover, the okfn must consume the skb to support async
> processing (NF_QUEUE).
> 
> Current okfn in this spot doesn't consume it and caller assumes that
> NF_HOOK return value tells us if skb was freed or not, but that's wrong.
> 
> It "works" because no in-tree user registers a NFPROTO_BRIDGE hook at
> LOCAL_IN that returns STOLEN or NF_QUEUE verdicts.
> 
> Once we add NF_QUEUE support for nftables bridge this will break --
> NF_QUEUE holds the skb for async processing, caller will erroneously
> return RX_HANDLER_PASS and on reinject netfilter will access free'd skb.
> 
> Fix this by pushing skb up the stack in the okfn instead.
> 
> NB: It also seems dubious to use LOCAL_IN while bypassing PRE_ROUTING
> completely in this case but this is how it's been forever so it seems
> preferable to not change this.
> 
> Cc: Felix Fietkau 
> Cc: Zefir Kurtisi 
> Signed-off-by: Florian Westphal 
> ---
>  
Looks good: applying the same fix-pattern to OpenWRT private patches solved the
oops previously observed.

Thanks for the quick resolution.

Tested-by: Zefir Kurtisi 



Re: [PATCH net-next v1] tipc: make sure IPv6 header fits in skb headroom

2016-03-14 Thread David Miller
From: Richard Alpe 
Date: Mon, 14 Mar 2016 09:43:52 +0100

> Expand headroom further in order to be able to fit the larger IPv6
> header. Prior to this patch this caused a skb under panic for certain
> tipc packets when using IPv6 UDP bearer(s).
> 
> Signed-off-by: Richard Alpe 
> Acked-by: Jon Maloy 

Applied.


Re: [PATCH v6 net-next 00/10] API set for HW Buffer management

2016-03-14 Thread David Miller
From: Gregory CLEMENT 
Date: Mon, 14 Mar 2016 09:38:55 +0100

> This is the sixth version of the API set for HW Buffer management (that was
> initially submitted here:
> http://thread.gmane.org/gmane.linux.kernel/2125152).

Series applied, thanks.


Re: [net-next PATCH 3/7] net: bulk alloc and reuse of SKBs in NAPI context

2016-03-14 Thread Jesper Dangaard Brouer
On Sun, 13 Mar 2016 16:06:17 +0200
Rana Shahout  wrote:

> On Fri, Mar 4, 2016 at 3:01 PM, Jesper Dangaard Brouer
>  wrote:
> 
> >  /* build_skb() is wrapper over __build_skb(), that specifically
> >   * takes care of skb->head and skb->pfmemalloc
> >   * This means that if @frag_size is not zero, then @data must be backed
> > @@ -490,8 +500,8 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct 
> > *napi, unsigned int len,
> >
> > len += NET_SKB_PAD + NET_IP_ALIGN;
> >
> > -   if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
> > -   (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> > +   if (unlikely((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) ||
> > +(gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA)))) {
> 
> Why unlikely? I know it is better for the common case where most
> likely linear SKBs are << SKB_WITH_OVERHEAD(PAGE_SIZE)), but what
> about the case of Hardware LRO,  where linear SKB is likely to be >>
> SKB_WITH_OVERHEAD(PAGE_SIZE)).

You said it yourself, this is better for the common case.  With
unlikely() I'm asking the compiler to layout the code for the common
case.  This helps the CPU instruction cache prefetcher.
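
As a rough user-space illustration of what the hint does (a sketch only, not the kernel code: the macros below have the same shape as the kernel's likely()/unlikely(), which are built on GCC's __builtin_expect(), while sketch_needs_slow_path() and the 4096-byte limit are made-up stand-ins for the check in __napi_alloc_skb()):

```c
/* Same shape as the kernel's branch hints; needs GCC or Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical limit standing in for SKB_WITH_OVERHEAD(PAGE_SIZE). */
#define SKETCH_MAX_LINEAR 4096u

/*
 * Returns 1 when a request of this length must take the slow path.
 * The hint can only influence how the compiler lays out the code;
 * it never changes the value that is returned.
 */
static int sketch_needs_slow_path(unsigned int len)
{
	if (unlikely(len > SKETCH_MAX_LINEAR))
		return 1;	/* expected to be rare */

	return 0;		/* expected common case */
}
```

Calling sketch_needs_slow_path() with a small length and with a large one gives the same answers with or without the hint; only the generated code layout may differ.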

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [net-next,iproute2] netconf: add support for ignore route attribute

2016-03-14 Thread Stephen Hemminger
On Mon, 14 Mar 2016 04:55:36 +
Zhang Shengju  wrote:

> Add support for ignore_routes_with_linkdown attribute.
> 
> Signed-off-by: Zhang Shengju 
> ---
>  ip/ipnetconf.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/ip/ipnetconf.c b/ip/ipnetconf.c
> index eca6eee..6fec818 100644
> --- a/ip/ipnetconf.c
> +++ b/ip/ipnetconf.c
> @@ -119,6 +119,10 @@ int print_netconf(const struct sockaddr_nl *who, struct 
> rtnl_ctrl_data *ctrl,
>   fprintf(fp, "proxy_neigh %s ",
>   *(int *)RTA_DATA(tb[NETCONFA_PROXY_NEIGH])?"on":"off");
>  
> + if (tb[NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN])
> + fprintf(fp, "ignore_routes_with_linkdown %s ",
> + *(int 
> *)RTA_DATA(tb[NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN])?"on":"off");

This is a good idea.

But the option name is too long, and the code does not follow current best
practices.
  1. Lines are too long
  2. There needs to be whitespace around ? :
  3. There are helper routines (rta_getattr_XXX) which should be used rather
     than casting RTA_DATA directly.

Also, help and man page??
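
The helper-routine point can be sketched in a self-contained way. This is an illustration under assumptions, not the iproute2 source: struct sketch_rtattr, SKETCH_RTA_DATA() and the sketch_* functions are stand-ins defined here so the example compiles on its own, and using memcpy() for the payload is a choice made for this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the netlink attribute header (two 16-bit fields). */
struct sketch_rtattr {
	unsigned short rta_len;
	unsigned short rta_type;
};

#define SKETCH_RTA_HDRLEN 4
#define SKETCH_RTA_DATA(rta) \
	((const void *)((const char *)(rta) + SKETCH_RTA_HDRLEN))

/*
 * Accessor in the style of rta_getattr_u32(): the cast of the
 * payload lives in one place instead of at every call site.
 */
static uint32_t sketch_getattr_u32(const struct sketch_rtattr *rta)
{
	uint32_t v;

	memcpy(&v, SKETCH_RTA_DATA(rta), sizeof(v));
	return v;
}

/* Call sites then stay short and keep whitespace around "? :". */
static const char *sketch_on_off(const struct sketch_rtattr *rta)
{
	return sketch_getattr_u32(rta) ? "on" : "off";
}
```

With an accessor like this, the fprintf() call in the patch would no longer need the inline `*(int *)RTA_DATA(...)` expression, which also helps with the line length.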


Re: [PATCH 2/3] net: hns: add Hisilicon RoCE support

2016-03-14 Thread Leon Romanovsky
On Mon, Mar 14, 2016 at 09:12:28AM +0800, Yankejian (Hackim Yim) wrote:
> 
> 
> On 2016/3/12 18:43, Leon Romanovsky wrote:
> > On Fri, Mar 11, 2016 at 06:37:10PM +0800, Lijun Ou wrote:
> >> It added hns_dsaf_roce_reset routine for roce driver.
> >> RoCE is a feature of hns.
> >> In hip06 SOC, in roce reset process, it's needed to configure
> >> dsaf channel reset,port and sl map info.
> >>
> >> Signed-off-by: Lijun Ou 
> >> Signed-off-by: Wei Hu(Xavier) 
> >> Signed-off-by: Lisheng 
> >> ---
> >>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 84 
> >> ++
> >>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h | 14 
> >>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 62 +---
> >>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h  | 13 
> >>  4 files changed, 163 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c 
> >> b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
> >> index 38fc5be..a0f0d4f 100644
> >> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
> >> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
> >> @@ -12,6 +12,7 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >>  #include 
> >>  #include 
> >>  #include 
> >> @@ -2593,6 +2594,89 @@ static struct platform_driver g_dsaf_driver = {
> >>  
> >>  module_platform_driver(g_dsaf_driver);
> >>  
> >> +/**
> >> + * hns_dsaf_roce_reset - reset dsaf and roce
> >> + * @dsaf_fwnode: Pointer to framework node for the dasf
> >> + * @val: 0 - request reset , 1 - drop reset
> >> + * retuen 0 - success , negative --fail
> >> + */
> >> +int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, u32 val)
> >> +{
> >> +  struct dsaf_device *dsaf_dev;
> >> +  struct platform_device *pdev;
> >> +  unsigned int mp;
> >> +  unsigned int sl;
> >> +  unsigned int credit;
> >> +  int i;
> >> +  const u32 port_map[DSAF_ROCE_CREDIT_CHN][DSAF_ROCE_CHAN_MODE] = {
> >> +  {0, 0, 0},
> >> +  {1, 0, 0},
> >> +  {2, 1, 0},
> >> +  {3, 1, 0},
> >> +  {4, 2, 1},
> >> +  {4, 2, 1},
> >> +  {5, 3, 1},
> >> +  {5, 3, 1},
> >> +  };
> >> +  const u32 sl_map[DSAF_ROCE_CREDIT_CHN][DSAF_ROCE_CHAN_MODE] = {
> >> +  {0, 0, 0},
> >> +  {0, 1, 1},
> >> +  {0, 0, 2},
> >> +  {0, 1, 3},
> >> +  {0, 0, 0},
> >> +  {1, 1, 1},
> >> +  {0, 0, 2},
> >> +  {1, 1, 3},
> >> +  };
> > Do you have a plan to send a version with enums/defines for this
> > numbers? Especially for _CHAN_MODE.
> >
> > .
> 
> Hi leon,
> 
> it seems the enums is added in hns_dsaf_main.h, as belows:
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h 
> b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
> index 5fea226..c917b9a 100644
> --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
> +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
> @@ -40,6 +40,16 @@ struct hns_mac_cb;
>  #define DSAF_DUMP_REGS_NUM 504
>  #define DSAF_STATIC_NUM 28
>  
> +#define DSAF_ROCE_CREDIT_CHN 8
> +#define DSAF_ROCE_CHAN_MODE 3
> +
> +enum dsaf_roce_port_port_mode {
> + DSAF_ROCE_6PORT_MODE,
> + DSAF_ROCE_4PORT_MODE,
> + DSAF_ROCE_2PORT_MODE,
> + DSAF_ROCE_CHAN_MODE_NUM
> +};
> +

These defines are used as index entries into the sl_map and port_map arrays
and do not seem to be related to the actual data.

I suggest you take this code back to the drawing board, redesign it,
clean out unused functions and defines, and resubmit it.
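
One possible direction, purely as an illustration: tie each column of the table to a named mode with designated initializers, so the index and the data are visibly connected. Everything prefixed sketch_ is hypothetical, the values are copied from the port_map rows quoted above, and the assumption that column 0/1/2 means 6-port/4-port/2-port mode is mine, not something the patch states.

```c
enum sketch_port_mode {
	SKETCH_6PORT_MODE,
	SKETCH_4PORT_MODE,
	SKETCH_2PORT_MODE,
	SKETCH_PORT_MODE_NUM
};

#define SKETCH_CREDIT_CHN 8

/* Shorthand so every row names the mode each value belongs to. */
#define SKETCH_ROW(p6, p4, p2) {		\
	[SKETCH_6PORT_MODE] = (p6),		\
	[SKETCH_4PORT_MODE] = (p4),		\
	[SKETCH_2PORT_MODE] = (p2),		\
}

/* One row per credit channel, one named column per port mode. */
static const unsigned int
sketch_port_map[SKETCH_CREDIT_CHN][SKETCH_PORT_MODE_NUM] = {
	SKETCH_ROW(0, 0, 0),
	SKETCH_ROW(1, 0, 0),
	SKETCH_ROW(2, 1, 0),
	SKETCH_ROW(3, 1, 0),
	SKETCH_ROW(4, 2, 1),
	SKETCH_ROW(4, 2, 1),
	SKETCH_ROW(5, 3, 1),
	SKETCH_ROW(5, 3, 1),
};

/* Bounds-checked lookup; returns -1 for an invalid channel or mode. */
static int sketch_port_for(unsigned int chn, enum sketch_port_mode mode)
{
	if (chn >= SKETCH_CREDIT_CHN || mode >= SKETCH_PORT_MODE_NUM)
		return -1;

	return (int)sketch_port_map[chn][mode];
}
```

Sizing the second dimension with the enum's last member, rather than a separate define, keeps the table width and the list of modes from drifting apart.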

Thanks.

> 
> MBR, Kejian
> 
> 
> 
> 
> 
> 


Re: [PATCH 3/4] infiniband: hns: add Hisilicon RoCE support(driver code)

2016-03-14 Thread oulijun
Hi Parav Pandit, thanks for your review.
On 2016/3/4 17:37, Parav Pandit wrote:
> On Fri, Mar 4, 2016 at 2:11 PM, Wei Hu(Xavier)  
> wrote:
>> +
>> +int hns_roce_register_device(struct hns_roce_dev *hr_dev)
>> +{
>> +   int ret;
>> +   struct hns_roce_ib_iboe *iboe = NULL;
>> +   struct ib_device *ib_dev = NULL;
>> +   struct device *dev = &hr_dev->pdev->dev;
>> +
>> +   iboe = &hr_dev->iboe;
>> +
>> +   ib_dev = &hr_dev->ib_dev;
>> +   strlcpy(ib_dev->name, "hisi_%d", IB_DEVICE_NAME_MAX);
>> +
>> +   ib_dev->owner   = THIS_MODULE;
>> +   ib_dev->node_type   = RDMA_NODE_IB_CA;
>> +   ib_dev->dma_device  = dev;
>> +
>> +   ib_dev->phys_port_cnt   = hr_dev->caps.num_ports;
>> +   ib_dev->local_dma_lkey  = hr_dev->caps.reserved_lkey;
>> +   ib_dev->num_comp_vectors= hr_dev->caps.num_comp_vectors;
>> +   ib_dev->uverbs_abi_ver  = 1;
>> +   ib_dev->uverbs_cmd_mask =
>> +   (1ULL << IB_USER_VERBS_CMD_GET_CONTEXT) |
>> +   (1ULL << IB_USER_VERBS_CMD_QUERY_DEVICE) |
>> +   (1ULL << IB_USER_VERBS_CMD_QUERY_PORT) |
>> +   (1ULL << IB_USER_VERBS_CMD_ALLOC_PD) |
>> +   (1ULL << IB_USER_VERBS_CMD_DEALLOC_PD) |
>> +   (1ULL << IB_USER_VERBS_CMD_REG_MR) |
>> +   (1ULL << IB_USER_VERBS_CMD_DEREG_MR) |
>> +   (1ULL << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) |
>> +   (1ULL << IB_USER_VERBS_CMD_CREATE_CQ) |
>> +   (1ULL << IB_USER_VERBS_CMD_DESTROY_CQ) |
>> +   (1ULL << IB_USER_VERBS_CMD_CREATE_QP) |
>> +   (1ULL << IB_USER_VERBS_CMD_MODIFY_QP) |
>> +   (1ULL << IB_USER_VERBS_CMD_QUERY_QP) |
>> +   (1ULL << IB_USER_VERBS_CMD_DESTROY_QP);
>> +
> 
> Since SRQ is not supported in this driver version, can you keep
> remaining code base also to not bother about SRQ specifically
> poll_cq_one, modify_qp, destroy_qp etc?
> SRQ support can come as complete additional patch along with cmd_mask,
> callbacks and rest of the code.
> 
> .
Sorry, I did not see your review in time.
Indeed, SRQ is not supported in the current RoCE driver. I have verified the
RDMA functionality, and it is not affected. We need to analyse your question
carefully; after that, I will reply to it. Is that OK?

thanks
Lijun Ou

> 





Re: [PATCH] b43: Fix memory leaks in b43_bus_dev_ssb_init and b43_bus_dev_bcma_init

2016-03-14 Thread Kalle Valo
Sudip Mukherjee  writes:

> From: Jia-Ju Bai 
>
> The memory allocated by kzalloc in b43_bus_dev_ssb_init and
> b43_bus_dev_bcma_init is not freed.
> This patch fixes the bug by adding kfree in b43_ssb_remove,
> b43_bcma_remove and error handling code of b43_bcma_probe.
>
> Thanks Michael for his suggestion.
>
> Signed-off-by: Jia-Ju Bai 
> Acked-by: Michael Büsch 
> Signed-off-by: Sudip Mukherjee 

If no objections I'm planning to queue this to 4.6-rc2.

-- 
Kalle Valo


Re: [PATCH V2 net 0/4] net: hns: bugs fix about hns driver

2016-03-14 Thread Lisheng011



We will appoint Daode Huang to send these patches.

Thanks.


On 2016/3/12 0:56, David Miller wrote:

This does not work.

I will not allow two sets of people sending me patches in parallel to the
same driver at the same time.

Have one person manage the maintenance of this driver and consolidate the
patch submissions to me.

Thanks.

.






Re: [net-next PATCH iproute2 v2 1/1] tc: introduce IFE action

2016-03-14 Thread Stephen Hemminger
On Wed,  9 Mar 2016 07:04:36 -0500
Jamal Hadi Salim  wrote:

> + fprintf(f, "\t Metadata: ");
> +
> + if (metalist[IFE_META_SKBMARK]) {
> + len = RTA_PAYLOAD(metalist[IFE_META_SKBMARK]);
> + if (len) {
> + __u32 *mmark = 
> RTA_DATA(metalist[IFE_META_SKBMARK]);
> + fprintf(f, "use mark %d ", *mmark);
> + } else
> + fprintf(f, "allow mark ");

This code has diverged a long way from the general rule that the display
format of the ip utilities should match the command format. For example, the
properties shown by "ip route show" match those of "ip route add".

Also, over the last several years, the code in iproute2 has switched from casting
RTA_DATA() everywhere to a cleaner interface, rta_getattr_u32(), more like what
is used in the mnl library.

The code has also gotten deeply indented, creating lots of lines that are too
long.

WARNING: 'doesnt' may be misspelled - perhaps 'doesn't'?
#21: 
then provide a default so the user doesnt have to specify it.

WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per 
line)
#25: 
"Distributing Linux Traffic Control Classifier-Action Subsystem"

WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#143: 
new file mode 100644

ERROR: "foo * bar" should be "foo *bar"
#378: FILE: tc/m_ife.c:231:
+static int print_ife(struct action_util *au, FILE * f, struct rtattr *arg)

WARNING: Missing a blank line after declarations
#384: FILE: tc/m_ife.c:237:
+   int has_optional = 0;
+   SPRINT_BUF(b1);

WARNING: Missing a blank line after declarations
#406: FILE: tc/m_ife.c:259:
+   struct tcf_t *tm = RTA_DATA(tb[TCA_IFE_TM]);
+   print_tm(f, tm);

WARNING: line over 80 characters
#449: FILE: tc/m_ife.c:302:
+   __u32 *mmark = 
RTA_DATA(metalist[IFE_META_SKBMARK]);

WARNING: Missing a blank line after declarations
#450: FILE: tc/m_ife.c:303:
+   __u32 *mmark = 
RTA_DATA(metalist[IFE_META_SKBMARK]);
+   fprintf(f, "use mark %d ", *mmark);

WARNING: line over 80 characters
#458: FILE: tc/m_ife.c:311:
+   __u32 *mhash = 
RTA_DATA(metalist[IFE_META_HASHID]);

WARNING: Missing a blank line after declarations
#459: FILE: tc/m_ife.c:312:
+   __u32 *mhash = 
RTA_DATA(metalist[IFE_META_HASHID]);
+   fprintf(f, "use hash %d ", *mhash);

WARNING: line over 80 characters
#467: FILE: tc/m_ife.c:320:
+   __u32 *mprio = 
RTA_DATA(metalist[IFE_META_PRIO]);

WARNING: Missing a blank line after declarations
#468: FILE: tc/m_ife.c:321:
+   __u32 *mprio = 
RTA_DATA(metalist[IFE_META_PRIO]);
+   fprintf(f, "use prio %d ", *mprio);

total: 1 errors, 11 warnings, 343 lines checked


When will net-next merge with linux-next?

2016-03-14 Thread Dexuan Cui
Hi David,
I have a pending patch of the hv_sock driver, which should go into the
kernel through the net-next tree:
https://lkml.org/lkml/2016/2/14/7

The VMBus side's supporting patches of hv_sock have been in Greg's tree
and linux-next for more than 1 month, but they haven't been in net-next
yet, I suppose this is because of the releasing of 4.5.

Now 4.5 is released. Will you merge with Greg's tree or linux-next?

I read netdev-FAQ.txt, but still don't have a clear idea about how things
work in my case.

Thanks,
-- Dexuan




Re: [PATCH] kcm: fix variable type

2016-03-14 Thread Andrzej Hajda
On 03/11/2016 05:44 PM, David Miller wrote:
> From: Andrzej Hajda 
> Date: Fri, 11 Mar 2016 07:51:15 +0100
>
>> Function skb_splice_bits can return negative values, its result should
>> be assigned to signed variable to allow correct error checking.
>>
>> The problem has been detected using patch
>> scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci.
>>
>> Signed-off-by: Andrzej Hajda 
> Since skb_splice_bits() returns an 'int', that would be a more appropriate
> type to use here.
>
> Thank you.
>
>
On the other hand, kcm_splice_read uses this local variable as its return
value, and its return type is ssize_t.

Digging deeper it looks like:
ssize_t kcm_splice_read(...) returns result of
int skb_splice_bits(...) which returns result of
ssize_t splice_cb(...) callback.

It looks like the code is somewhat inconsistent, but maybe there are other
reasons which I am not aware of.
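
For reference, a minimal user-space sketch of why the variable has to be signed at all (sketch_splice() and the two checkers are made-up stand-ins, not the kcm code): a negative error code stored in an unsigned variable turns into a huge positive value, so a "did we copy anything?" test passes on failure. Both int and ssize_t avoid that; which of the two to use is the consistency question above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Stand-in for a callee that reports errors as negative ints. */
static int sketch_splice(int fail)
{
	return fail ? -EIO : 42;
}

/* Unsigned storage: -EIO wraps to a huge value and looks like success. */
static int sketch_looks_ok_unsigned(int fail)
{
	size_t copied = (size_t)sketch_splice(fail);

	return copied > 0;
}

/* Signed storage: the sign survives, so the error is detected. */
static int sketch_looks_ok_signed(int fail)
{
	ssize_t copied = sketch_splice(fail);

	return copied > 0;
}
```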

Regards
Andrzej






Re: [PATCH iproute2 net-next v4] bridge: mdb: add support for extended router port information

2016-03-14 Thread Stephen Hemminger
On Mon,  7 Mar 2016 11:10:32 +0100
Nikolay Aleksandrov  wrote:

>   rem = RTA_PAYLOAD(attr);
>   for (i = RTA_DATA(attr); RTA_OK(i, rem); i = RTA_NEXT(i, rem)) {
>   port_ifindex = RTA_DATA(i);
> - fprintf(f, "%s ", ll_index_to_name(*port_ifindex));
> + if (show_stats) {
> + struct rtattr *tb[MDBA_ROUTER_PATTR_MAX + 1];
> +
> + parse_rtattr(tb, MDBA_ROUTER_PATTR_MAX,
> +  MDB_RTR_RTA(RTA_DATA(i)),
> +  RTA_PAYLOAD(i) -
> +  RTA_ALIGN(sizeof(*port_ifindex)));
> +
> + fprintf(f, "router ports on %s: %s",
> + ll_index_to_name(brifidx),
> + ll_index_to_name(*port_ifindex));
> + if (tb[MDBA_ROUTER_PATTR_TIMER]) {
> + struct timeval tv;
> + __u32 tval;
> +
> + tval = rta_getattr_u32(
> + tb[MDBA_ROUTER_PATTR_TIMER]);
> + __jiffies_to_tv(&tv, tval);
> + fprintf(f, " %4i.%.2i",
> + (int)tv.tv_sec, (int)tv.tv_usec/1);
> + }

You are having to cut lines short here to fit 80 characters, maybe
good time to make statistics a helper function.



[PATCH v3 1/9] net: arc_emac: make the rockchip emac document more compatible

2016-03-14 Thread Caesar Wang
Add the rk3036 SoC to the binding document, since the emac driver already
supports it.

This patch lists the rk3036/rk3066/rk3188 SoCs as compatible strings in the
rockchip emac document. The same layout will also suit other SoCs in the future.

Signed-off-by: Caesar Wang 
Cc: Rob Herring 
Cc: devicet...@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" 
Cc: Alexander Kochetkov 

---

Changes in v3:
- %s/he/the
- Add the Cc people

Changes in v2:
- change the commit and remove the repeat the name 'rockchip'.

 Documentation/devicetree/bindings/net/emac_rockchip.txt | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/emac_rockchip.txt 
b/Documentation/devicetree/bindings/net/emac_rockchip.txt
index 8dc1c79..05bd7da 100644
--- a/Documentation/devicetree/bindings/net/emac_rockchip.txt
+++ b/Documentation/devicetree/bindings/net/emac_rockchip.txt
@@ -1,8 +1,10 @@
-* ARC EMAC 10/100 Ethernet platform driver for Rockchip Rk3066/RK3188 SoCs
+* ARC EMAC 10/100 Ethernet platform driver for Rockchip RK3036/RK3066/RK3188 
SoCs
 
 Required properties:
-- compatible: Should be "rockchip,rk3066-emac" or "rockchip,rk3188-emac"
-  according to the target SoC.
+- compatible: should be "rockchip,-emac"
+   "rockchip,rk3036-emac": found on RK3036 SoCs
+   "rockchip,rk3066-emac": found on RK3066 SoCs
+   "rockchip,rk3188-emac": found on RK3188 SoCs
 - reg: Address and length of the register set for the device
 - interrupts: Should contain the EMAC interrupts
 - rockchip,grf: phandle to the syscon grf used to control speed and mode
-- 
1.9.1



Re: [PATCH v6 net-next 01/10] misc: sram: add optional ioremap without write combining

2016-03-14 Thread Gregory CLEMENT
Hi Arnd,

I forgot to add you in CC for this patch.
What is your opinion about it?

Gregory
 
 On Mon., Mar 14 2016, Gregory CLEMENT wrote:

> From: Marcin Wojtas 
>
> Some SRAM users may require non-bufferable access to the memory, which is
> impossible, because devm_ioremap_wc() is used for setting sram->virt_base.
>
> This commit adds optional flag 'no-memory-wc', which allow to choose remap
> method, using DT property. Documentation is updated accordingly.
>
> Signed-off-by: Marcin Wojtas 
> ---
>  Documentation/devicetree/bindings/sram/sram.txt | 5 +
>  drivers/misc/sram.c | 5 -
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/sram/sram.txt 
> b/Documentation/devicetree/bindings/sram/sram.txt
> index 42ee9438b771..227e3a341af1 100644
> --- a/Documentation/devicetree/bindings/sram/sram.txt
> +++ b/Documentation/devicetree/bindings/sram/sram.txt
> @@ -25,6 +25,11 @@ Required properties in the sram node:
>  - ranges : standard definition, should translate from local addresses
> within the sram to bus addresses
>  
> +Optional properties in the sram node:
> +
> +- no-memory-wc : the flag indicating, that SRAM memory region has not to
> + be remapped as write combining. WC is used by default.
> +
>  Required properties in the area nodes:
>  
>  - reg : iomem address range, relative to the SRAM range
> diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
> index 736dae715dbf..69cdabea9c03 100644
> --- a/drivers/misc/sram.c
> +++ b/drivers/misc/sram.c
> @@ -360,7 +360,10 @@ static int sram_probe(struct platform_device *pdev)
>   return -EBUSY;
>   }
>  
> - sram->virt_base = devm_ioremap_wc(sram->dev, res->start, size);
> + if (of_property_read_bool(pdev->dev.of_node, "no-memory-wc"))
> + sram->virt_base = devm_ioremap(sram->dev, res->start, size);
> + else
> + sram->virt_base = devm_ioremap_wc(sram->dev, res->start, size);
>   if (IS_ERR(sram->virt_base))
>   return PTR_ERR(sram->virt_base);
>  
> -- 
> 2.5.0
>

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com


Re: [PATCHv3 (net.git) 2/2] stmmac: fix MDIO settings

2016-03-14 Thread Giuseppe CAVALLARO

On 3/14/2016 10:14 AM, Gabriel Fernandez wrote:

Hi Peppe,

Just one remark below


diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 6a52fa1..d2322e9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c


[snip]


+static bool stmmac_dt_phy(struct plat_stmmacenet_data *plat,
+ struct device_node *np, struct device *dev)
+{
+   bool ret = true;
+
+   /* If phy-handle property is passed from DT, use it as the PHY */
+   plat->phy_node = of_parse_phandle(np, "phy-handle", 0);
+   if (plat->phy_node)
+   dev_dbg(dev, "Found phy-handle subnode\n");
+
+   /* If phy-handle is not specified, check if we have a fixed-phy */
+   if (!plat->phy_node && of_phy_is_fixed_link(np)) {
+   if ((of_phy_register_fixed_link(np) < 0))
+   return -ENODEV;
+

stmmac_dt_phy() function should return a Boolean
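
To spell out why this matters, a small user-space sketch (the sketch_* functions are hypothetical stand-ins, not the stmmac code): any non-zero int converted to bool becomes true, so a negative errno returned from a bool function can no longer be told apart from a normal "true" result.

```c
#include <errno.h>
#include <stdbool.h>

/* bool return type: the error code collapses to plain "true". */
static bool sketch_dt_phy_bool(int register_fails)
{
	int err = register_fails ? -ENODEV : 0;

	return err;	/* non-zero int is converted to true */
}

/* int return type: the caller still sees which error occurred. */
static int sketch_dt_phy_int(int register_fails)
{
	return register_fails ? -ENODEV : 0;
}
```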


Thx Gabriel, I will fix return value in V4.

peppe




Best Regards.

Gabriel






[PATCH][net-next] ipv6: replace write lock with read lock in addrconf_permanent_addr

2016-03-14 Thread roy . qing . li
From: Li RongQing 

Nothing in idev is changed, so a read lock is enough.

Signed-off-by: Li RongQing 
---
 net/ipv6/addrconf.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 8c0dab2..d3f0d87 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3228,21 +3228,21 @@ static void addrconf_permanent_addr(struct net_device 
*dev)
if (!idev)
return;
 
-   write_lock_bh(&idev->lock);
+   read_lock_bh(&idev->lock);
 
list_for_each_entry_safe(ifp, tmp, &idev->addr_list, if_list) {
if ((ifp->flags & IFA_F_PERMANENT) &&
fixup_permanent_addr(idev, ifp) < 0) {
-   write_unlock_bh(&idev->lock);
+   read_unlock_bh(&idev->lock);
ipv6_del_addr(ifp);
-   write_lock_bh(&idev->lock);
+   read_lock_bh(&idev->lock);
 
net_info_ratelimited("%s: Failed to add prefix route 
for address %pI6c; dropping\n",
 idev->dev->name, &ifp->addr);
}
}
 
-   write_unlock_bh(&idev->lock);
+   read_unlock_bh(&idev->lock);
 }
 
 static int addrconf_notify(struct notifier_block *this, unsigned long event,
-- 
2.1.4



Re: [PATCH v6 net-next 08/10] net: mvneta: bm: add support for hardware buffer management

2016-03-14 Thread Jesper Dangaard Brouer

On Mon, 14 Mar 2016 09:39:03 +0100 Gregory CLEMENT 
 wrote:

> diff --git a/drivers/net/ethernet/marvell/mvneta.c 
> b/drivers/net/ethernet/marvell/mvneta.c
> index b0ae69f84493..2847c0c291de 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
[...]
> -static void *mvneta_frag_alloc(const struct mvneta_port *pp)
> +void *mvneta_frag_alloc(unsigned int frag_size)
>  {
> - if (likely(pp->frag_size <= PAGE_SIZE))
> - return netdev_alloc_frag(pp->frag_size);
> + if (likely(frag_size <= PAGE_SIZE))
> + return netdev_alloc_frag(frag_size);

(I know you are modifying existing code here.)

Be aware that there is a significant performance advantage of using
napi_alloc_frag() over netdev_alloc_frag().  You obviously can only use
the NAPI call, if you indeed are running in NAPI/BH context.

>   else
> - return kmalloc(pp->frag_size, GFP_ATOMIC);
> + return kmalloc(frag_size, GFP_ATOMIC);
>  }
> +EXPORT_SYMBOL_GPL(mvneta_frag_alloc);
  


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


RE: [v6, 5/5] mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0

2016-03-14 Thread Yangbo Lu
> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: Monday, March 14, 2016 6:26 AM
> To: linuxppc-...@lists.ozlabs.org
> Cc: Yangbo Lu; devicet...@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-ker...@vger.kernel.org; linux-
> c...@vger.kernel.org; linux-...@vger.kernel.org; iommu@lists.linux-
> foundation.org; netdev@vger.kernel.org; linux-...@vger.kernel.org;
> ulf.hans...@linaro.org; Zhao Qiang; Russell King; Bhupesh Sharma; Joerg
> Roedel; Santosh Shilimkar; Scott Wood; Rob Herring; Claudiu Manoil; Kumar
> Gala; Yang-Leo Li; Xiaobo Xie
> Subject: Re: [v6, 5/5] mmc: sdhci-of-esdhc: fix host version for T4240-
> R1.0-R2.0
> 
> On Wednesday 09 March 2016 18:08:51 Yangbo Lu wrote:
> > @@ -567,10 +580,20 @@ static void esdhc_init(struct platform_device
> *pdev, struct sdhci_host *host)
> > struct sdhci_pltfm_host *pltfm_host;
> > struct sdhci_esdhc *esdhc;
> > u16 host_ver;
> > +   u32 svr;
> >
> > pltfm_host = sdhci_priv(host);
> > esdhc = sdhci_pltfm_priv(pltfm_host);
> >
> > +   fsl_guts_init();
> > +   svr = fsl_guts_get_svr();
> > +   if (svr) {
> > +   esdhc->soc_ver = SVR_SOC_VER(svr);
> > +   esdhc->soc_rev = SVR_REV(svr);
> > +   } else {
> > +   dev_err(&pdev->dev, "Failed to get SVR value!\n");
> > +   }
> > +
> 
> This makes the driver non-portable. Better identify the specific
> workarounds based on the compatible string for this device, or add a
> boolean DT property for the quirk.
> 
>   Arnd

[Lu Yangbo-B47093] Hi Arnd, we had a discussion about using DTS back in v1:
https://patchwork.kernel.org/patch/6834221/

We don't have a separate DTS file for each revision of an SoC, and if we did,
we'd constantly have people using the wrong one.
In addition, the device tree is stable ABI, and errata are often discovered
after device trees are deployed.
See the link for details.

So we decided to read the SVR from the device-config/guts MMIO block rather
than using DTS.
Thanks.






[PATCH v3 0/9] arc_emac: fixes the emac issues and cleanup emac drivers

2016-03-14 Thread Caesar Wang
Hi all,
This patch series is based on kernel version 4.5-rc7+.
Linux version 4.5.0-rc7-next-20160311+ (wxt@nb) (...) #45 SMP Sun Mar 13 
16:17:56

Previous versions of the series:
Patch-v1: https://lkml.org/lkml/2016/3/11/209
Patch-v2: https://lkml.org/lkml/2016/3/13/39

Verified on the kylin board running Ubuntu, using my GitHub tree:
https://github.com/Caesar-github/rockchip/tree/kylin/next

All patches in this series pass the Mr.robot builds on
https://github.com/Caesar-github/linux/tree/build-emac-v3

How to test and verify?

You can refer to the following wiki document.
http://rockchip.wikidot.com/linux-develop-guide

bootup log:
[1.264740] rockchip_emac 1020.ethernet: no regulator found
[1.270908] rockchip_emac 1020.ethernet: ARC EMAC detected with id: 
0x7fd02
[1.278362] rockchip_emac 1020.ethernet: IRQ is 29
[1.283747] rockchip_emac 1020.ethernet: MAC address is now 
06:5d:61:c7:39:41
[1.291314] rockchip_emac 1020.ethernet: GPIO lookup for consumer 
phy-reset
[1.291333] rockchip_emac 1020.ethernet: using device tree for GPIO 
lookup
[1.663155] rockchip_emac 1020.ethernet: connected to Generic PHY phy 
with id 0xc816
[8.863448] rockchip_emac 1020.ethernet eth0: Link is Up - 100Mbps/Full 
- flow control off

root@localhost:/# busybox ping www.baidu.com
PING www.baidu.com (14.215.177.38): 56 data bytes
64 bytes from 14.215.177.38: seq=0 ttl=48 time=35.046 ms
64 bytes from 14.215.177.38: seq=1 ttl=48 time=35.095 ms
64 bytes from 14.215.177.38: seq=2 ttl=48 time=34.203 ms
64 bytes from 14.215.177.38: seq=3 ttl=48 time=38.516 ms
...
---

1) This series has 9 patches: (1--->9)
net: arc_emac: make the rockchip emac document more compatible
net: arc_emac: add phy reset is optional for device tree
net: arc_emac: support the phy reset for emac driver
net: arc: trivial: cleanup the emac driver
clk: rockchip: add node-id for rk3036 emac hclk
clk: rockchip: associate the rk3036 HCLK_EMAC clock-id
clk: rockchip: add clock-id for rk3036 emac pll source clock
clk: rockchip: associate SCLK_MAC_PLL and disable reparenting on rk3036
ARM: dts: rockchip: add support emac for RK3036

2) The patches in this series are described below:

Hi Rob, David:
PATCH[1/9-2/9]: >
net: arc_emac: make the rockchip emac document more compatible
net: arc_emac: add phy reset is optional for device tree

These patches add more compatible strings to the rockchip emac document and
add the phy reset properties to the arc_emac document.
---

Hi David
PATCH[3/9]: >
net: arc_emac: support the phy reset for emac driver

The emac did not work on the kylin board because in some cases the clock's
parent changed.
The kylin hardware wires up the phy reset pin, so we should make use of it.
See the discussion of the previous patch at
https://patchwork.kernel.org/patch/8186801/

See also the suggestions from Sergei and Heiko at
https://patchwork.kernel.org/patch/8564571/
---

Hi David
PATCH[4/9]: >
net: arc: trivial: cleanup the emac driver

On first reading the emac driver, I think it needs a cleanup using the
checking scripts.
The changes are trivial, but they make the code more readable.
---

Hi Heiko,Michael,Stephen:
PATCH[5/9-8/9]: > clk: rockchip: rk3036: fix and add node id for emac clock

Four-part from https://patchwork.kernel.org/patch/8564581/
clk: rockchip: add node-id for rk3036 emac hclk
clk: rockchip: associate the rk3036 HCLK_EMAC clock-id
clk: rockchip: add clock-id for rk3036 emac pll source clock
clk: rockchip: associate SCLK_MAC_PLL and disable reparenting on rk3036

Add the clocks needed by the emac on rk3036 SoCs.
---

Hi Heiko:
PATCH[9/9]: >
ARM: dts: rockchip: add support emac for RK3036

Add the main emac information to the rk3036 dts.
---

Thanks for reviewing! :)


Changes in v3:
- %s/he/the
- Add the Cc people
- As Sergei comments, the original name is better, so
  %s/reset-gpios/phy-reset-gpios
- Add the Cc people.
- Fix the build error caused by a missing header include.
- %s/reset/phy-reset to match the device tree.
- Add the Cc people
- Add the Cc people.
- Add the Cc people.
- Add the Cc people.
- Add the Cc people.
- Add the Cc people.
- rename reset-gpio to phy-reset-gpios.
- change the commit.
- remove the pcfg_output_high, that's really not needed for emac.
- Add the Cc people.
- Fixes the 'zhengxing' to 'Xing Zheng'.

Changes in v2:
- change the commit and remove the repeat the name 'rockchip'.
- %s/phy-reset-gpios/reset-gpios
- Following Sergei's and Heiko's comments on the previous version at
  https://patchwork.kernel.org/patch/8564571/.
- Nevermind, add signed-off since Heiko the original patch,
  refer the Heiko's test patch on
  
https://github.com/mmind/linux-rockchip/commit/a943c588783438ff1c508dfa8c79f1709aa5775e
  :)
- As the robot notice the build error since overflow in implicit
  constant conversion.
- rename phy-reset-gpio to reset-gpios.

Caesar Wang (4):
  net: arc_emac: make the rockchip emac document more compatible
  net: arc_emac: add phy reset 

[PATCH v3 3/9] net: arc_emac: support the phy reset for emac driver

2016-03-14 Thread Caesar Wang
This patch adds support for the emac phy reset.

Different boards may require different phy reset duration. Add property
phy-reset-duration for emac driver, so that the boards that need
a longer reset duration can specify it in their device tree.

Signed-off-by: Heiko Stuebner 
Signed-off-by: Caesar Wang 
Cc: "David S. Miller" 
Cc: netdev@vger.kernel.org
Cc: Alexander Kochetkov 
Cc: Sergei Shtylyov 

---

Changes in v3:
- Fix the build error caused by a missing header include.
- %s/reset/phy-reset to match the device tree.
- Add the Cc people

Changes in v2:
- Following Sergei's and Heiko's comments on the previous version at
  https://patchwork.kernel.org/patch/8564571/.
- Nevermind, add signed-off since Heiko the original patch,
  refer the Heiko's test patch on
  
https://github.com/mmind/linux-rockchip/commit/a943c588783438ff1c508dfa8c79f1709aa5775e
  :)

 drivers/net/ethernet/arc/emac.h  |  6 ++
 drivers/net/ethernet/arc/emac_mdio.c | 37 
 2 files changed, 43 insertions(+)

diff --git a/drivers/net/ethernet/arc/emac.h b/drivers/net/ethernet/arc/emac.h
index dae1ac3..1a40403 100644
--- a/drivers/net/ethernet/arc/emac.h
+++ b/drivers/net/ethernet/arc/emac.h
@@ -102,6 +102,11 @@ struct buffer_state {
DEFINE_DMA_UNMAP_LEN(len);
 };
 
+struct arc_emac_mdio_bus_data {
+   struct gpio_desc *reset_gpio;
+   int msec;
+};
+
 /**
  * struct arc_emac_priv - Storage of EMAC's private information.
  * @dev:   Pointer to the current device.
@@ -131,6 +136,7 @@ struct arc_emac_priv {
struct device *dev;
struct phy_device *phy_dev;
struct mii_bus *bus;
+   struct arc_emac_mdio_bus_data bus_data;
 
void __iomem *regs;
struct clk *clk;
diff --git a/drivers/net/ethernet/arc/emac_mdio.c 
b/drivers/net/ethernet/arc/emac_mdio.c
index d5ee986..caf7042 100644
--- a/drivers/net/ethernet/arc/emac_mdio.c
+++ b/drivers/net/ethernet/arc/emac_mdio.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "emac.h"
 
@@ -99,6 +100,25 @@ static int arc_mdio_write(struct mii_bus *bus, int phy_addr,
 }
 
 /**
+ * arc_mdio_reset
+ * @bus: points to the mii_bus structure
+ * Description: reset the MII bus
+ */
+int arc_mdio_reset(struct mii_bus *bus)
+{
+   struct arc_emac_priv *priv = bus->priv;
+   struct arc_emac_mdio_bus_data *data = &priv->bus_data;
+
+   if (data->reset_gpio) {
+   gpiod_set_value_cansleep(data->reset_gpio, 1);
+   msleep(data->msec);
+   gpiod_set_value_cansleep(data->reset_gpio, 0);
+   }
+
+   return 0;
+}
+
+/**
  * arc_mdio_probe - MDIO probe function.
  * @priv:  Pointer to ARC EMAC private data structure.
  *
@@ -109,6 +129,8 @@ static int arc_mdio_write(struct mii_bus *bus, int phy_addr,
  */
 int arc_mdio_probe(struct arc_emac_priv *priv)
 {
+   struct arc_emac_mdio_bus_data *data = &priv->bus_data;
+   struct device_node *np = priv->dev->of_node;
struct mii_bus *bus;
int error;
 
@@ -122,6 +144,21 @@ int arc_mdio_probe(struct arc_emac_priv *priv)
bus->name = "Synopsys MII Bus",
bus->read = &arc_mdio_read;
bus->write = &arc_mdio_write;
+   bus->reset = &arc_mdio_reset;
+
+   /* optional reset-related properties */
+   data->reset_gpio = devm_gpiod_get_optional(priv->dev, "phy-reset",
+  GPIOD_OUT_LOW);
+   if (IS_ERR(data->reset_gpio)) {
+   error = PTR_ERR(data->reset_gpio);
+   dev_err(priv->dev, "Failed to request gpio: %d\n", error);
+   return error;
+   }
+
+   of_property_read_u32(np, "phy-reset-duration", &data->msec);
+   /* A sane reset duration should not be longer than 1s */
+   if (data->msec > 1000)
+   data->msec = 1;
 
snprintf(bus->id, MII_BUS_ID_SIZE, "%s", bus->name);
 
-- 
1.9.1



[PATCH v3 2/9] net: arc_emac: add phy reset is optional for device tree

2016-03-14 Thread Caesar Wang
This patch documents the following optional properties for arc_emac.

1) phy-reset-gpios:
phy-reset-gpios is an optional property for arc emac device tree boot.
Update the binding document to match the driver code.

2) phy-reset-duration:
Different boards may require different PHY reset durations. Add the
phy-reset-duration property so that boards that need a longer reset
duration can specify it in their device tree.

Signed-off-by: Caesar Wang 
Cc: Rob Herring 
Cc: devicet...@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" 
Cc: Sergei Shtylyov 
Cc: Alexander Kochetkov 

---

Changes in v3:
- As Sergei commented, the original name is better, so
  %s/reset-gpios/phy-reset-gpios
- Add the Cc people.

Changes in v2:
- %s/phy-reset-gpios/reset-gpios

 Documentation/devicetree/bindings/net/arc_emac.txt | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/arc_emac.txt 
b/Documentation/devicetree/bindings/net/arc_emac.txt
index a1d71eb..c73a0e9 100644
--- a/Documentation/devicetree/bindings/net/arc_emac.txt
+++ b/Documentation/devicetree/bindings/net/arc_emac.txt
@@ -7,6 +7,13 @@ Required properties:
 - max-speed: see ethernet.txt file in the same directory.
 - phy: see ethernet.txt file in the same directory.
 
+Optional properties:
+- phy-reset-gpios : Should specify the gpio for phy reset
+- phy-reset-duration : Reset duration in milliseconds.  Should be present
+  only if property "phy-reset-gpios" is available.  If the property is
+  missing, the duration defaults to 1 millisecond.  Numbers greater than
+  1000 are invalid and 1 millisecond will be used instead.
+
 Clock handling:
 The clock frequency is needed to calculate and set polling period of EMAC.
 It must be provided by one of:
-- 
1.9.1
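
As a rough illustration of the rule stated in the binding text above (not
the driver's actual code), the effective reset duration could be computed
as in the following C sketch. The function name phy_reset_duration_ms and
the two constants are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
#define PHY_RESET_DEFAULT_MS 1u     /* used when the property is missing or invalid */
#define PHY_RESET_MAX_MS     1000u  /* values above this are treated as invalid */

/*
 * Return the reset duration in milliseconds as the binding text describes it:
 * a missing property or a value greater than 1000 both result in 1 ms.
 */
static uint32_t phy_reset_duration_ms(bool has_prop, uint32_t prop_ms)
{
    if (!has_prop)
        return PHY_RESET_DEFAULT_MS;
    if (prop_ms > PHY_RESET_MAX_MS)
        return PHY_RESET_DEFAULT_MS;
    return prop_ms;
}
```

With this sketch, a device tree value of 10 is used as-is, while a missing
property or a value such as 5000 both fall back to 1 millisecond.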



[PATCH v3 5/9] clk: rockchip: add node-id for rk3036 emac hclk

2016-03-14 Thread Caesar Wang
From: Xing Zheng 

Add the node-id for the emac hclk to the binding header.

Signed-off-by: Xing Zheng 
Signed-off-by: Caesar Wang 
Cc: Xing Zheng 
Cc: Michael Turquette 
Cc: Heiko Stuebner 
Cc: Stephen Boyd 
Cc: linux-...@vger.kernel.org
Cc: linux-rockc...@lists.infradead.org

---

Changes in v3:
- Add the Cc people.

Changes in v2: None

 include/dt-bindings/clock/rk3036-cru.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/dt-bindings/clock/rk3036-cru.h 
b/include/dt-bindings/clock/rk3036-cru.h
index ebc7a7b..3396591 100644
--- a/include/dt-bindings/clock/rk3036-cru.h
+++ b/include/dt-bindings/clock/rk3036-cru.h
@@ -92,6 +92,7 @@
 #define HCLK_SDMMC 456
 #define HCLK_SDIO  457
 #define HCLK_EMMC  459
+#define HCLK_MAC   460
 #define HCLK_I2S   462
 #define HCLK_LCDC  465
 #define HCLK_ROM   467
-- 
1.9.1



[PATCH v3 7/9] clk: rockchip: add clock-id for rk3036 emac pll source clock

2016-03-14 Thread Caesar Wang
From: Xing Zheng 

Suitable PLLs for the emac on the rk3036 are difficult to find
and one of them is the (continuously changing) APLL. So in most
cases it will be necessary to select a PLL manually.
So add a clock-id for it.

Signed-off-by: Xing Zheng 
Signed-off-by: Caesar Wang 
Cc: Xing Zheng 
Cc: Michael Turquette 
Cc: Heiko Stuebner 
Cc: Stephen Boyd 
Cc: linux-...@vger.kernel.org
Cc: linux-rockc...@lists.infradead.org

---

Changes in v3:
- Add the Cc people.

Changes in v2: None

 include/dt-bindings/clock/rk3036-cru.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/dt-bindings/clock/rk3036-cru.h 
b/include/dt-bindings/clock/rk3036-cru.h
index 3396591..de44109 100644
--- a/include/dt-bindings/clock/rk3036-cru.h
+++ b/include/dt-bindings/clock/rk3036-cru.h
@@ -54,6 +54,7 @@
 #define SCLK_PVTM_VIDEO125
 #define SCLK_MAC   151
 #define SCLK_MACREF152
+#define SCLK_MACPLL153
 #define SCLK_SFC   160
 
 /* aclk gates */
-- 
1.9.1



[PATCH net-next v1] tipc: make sure IPv6 header fits in skb headroom

2016-03-14 Thread Richard Alpe
Expand the headroom so that the larger IPv6 header fits. Prior to this
patch, the insufficient headroom caused an skb_under_panic for certain
TIPC packets when using IPv6 UDP bearer(s).

Signed-off-by: Richard Alpe 
Acked-by: Jon Maloy 
---
 net/tipc/udp_media.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index 49b3c2e..6364cff 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -53,7 +53,7 @@
 /* IANA assigned UDP port */
 #define UDP_PORT_DEFAULT   6118
 
-#define UDP_MIN_HEADROOM28
+#define UDP_MIN_HEADROOM48
 
 /**
  * struct udp_media_addr - IP/UDP addressing information
-- 
2.1.4
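
The old and new values of UDP_MIN_HEADROOM are consistent with the fixed
IP and UDP header sizes, assuming no IPv4 options and no IPv6 extension
headers. The C sketch below only shows that arithmetic. The enum names and
the helper udp_bearer_headroom are made up for illustration and are not
part of the TIPC code.

```c
#include <stddef.h>

/* Fixed header sizes in bytes (no IPv4 options, no IPv6 extension headers). */
enum {
    IPV4_HDR_LEN = 20,
    IPV6_HDR_LEN = 40,
    UDP_HDR_LEN  = 8,
};

/* Headroom needed in front of the payload for an IP/UDP encapsulation. */
static size_t udp_bearer_headroom(int is_ipv6)
{
    return (size_t)(is_ipv6 ? IPV6_HDR_LEN : IPV4_HDR_LEN) + UDP_HDR_LEN;
}
```

Under these assumptions the IPv4 case needs 28 bytes and the IPv6 case
needs 48 bytes, which matches the change from 28 to 48 in the hunk above.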



Re: [PATCH net-next 3/3] net: dsa: refine netdev event notifier

2016-03-14 Thread Ido Schimmel
Sun, Mar 13, 2016 at 10:21:34PM IST, vivien.dide...@savoirfairelinux.com wrote:
>Rework the netdev event handler, similar to what the Mellanox Spectrum
>driver does, to easily welcome more events later (for example
>NETDEV_PRECHANGEUPPER) and use netdev helpers (such as
>netif_is_bridge_master).
>
>Signed-off-by: Vivien Didelot 

Acked-by: Ido Schimmel 


[PATCH v6 net-next 02/10] ARM: dts: armada-38x: add buffer manager nodes

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

The Armada 38x network controller supports hardware buffer management (BM).
Since it is now enabled in the mvneta driver, appropriate nodes can be added
to armada-38x.dtsi - for the actual common BM unit (bm@c8000) and its
internal SRAM (bm-bppi), which is used for indirect access to the buffer
pointer ring residing in DRAM.

The pool-to-port mapping, the bm-bppi entry in the 'soc' node's ranges and
optional parameters are supposed to be set in board files.

Signed-off-by: Marcin Wojtas 
Signed-off-by: Gregory CLEMENT 
---
 arch/arm/boot/dts/armada-38x.dtsi | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/armada-38x.dtsi 
b/arch/arm/boot/dts/armada-38x.dtsi
index e8b7f6726772..066a8f06405c 100644
--- a/arch/arm/boot/dts/armada-38x.dtsi
+++ b/arch/arm/boot/dts/armada-38x.dtsi
@@ -540,6 +540,14 @@
status = "disabled";
};
 
+   bm: bm@c8000 {
+   compatible = "marvell,armada-380-neta-bm";
+   reg = <0xc8000 0xac>;
+   clocks = < 13>;
+   internal-mem = <_bppi>;
+   status = "disabled";
+   };
+
sata@e {
compatible = "marvell,armada-380-ahci";
reg = <0xe 0x2000>;
@@ -618,6 +626,17 @@
#size-cells = <1>;
ranges = <0 MBUS_ID(0x09, 0x15) 0 0x800>;
};
+
+   bm_bppi: bm-bppi {
+   compatible = "mmio-sram";
+   reg = ;
+   ranges = <0 MBUS_ID(0x0c, 0x04) 0 0x10>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   clocks = < 13>;
+   no-memory-wc;
+   status = "disabled";
+   };
};
 
clocks {
-- 
2.5.0



Re: userns, netns, and quick physical memory consumption by unprivileged user

2016-03-14 Thread Michal Hocko
On Fri 11-03-16 18:06:59, Yuriy M. Kaminskiy wrote:
[...]
> And also tried with memcg:
>   t=/sys/fs/cgroup/memory/test1;mkdir $t;echo 0 >$t/tasks;
>   echo 48M >$t/memory.limit_in_bytes; su testuser [...]
> and it has not helped at all (rather opposite, it ended up with killed
> init and kernel panic; well, later is pure (un)luck; but point is, memcg
> apparently *CANNOT* curb net/ns allocations).

It seems you were using memcg v1 here. This didn't have the kernel
memory accounting enabled by default. With v2 you get both user and
kernel (well, some subset of it) accounting enabled. Whether we also
account netns-related data structures sufficiently is an open question.
I haven't checked. But it would be worth trying and fixing.

-- 
Michal Hocko
SUSE Labs


pull-request: wireless-drivers-next 2016-03-14

2016-03-14 Thread Kalle Valo
Hi Dave,

I know I'm late now that the merge window opened yesterday, but here's
one more set of patches I would still like to get into 4.6. There isn't
anything controversial so I hope this should still be safe to pull. The
patches have been in linux-next since Friday and I haven't seen any
reports about issues. But if you think it's too late just let me know
and I'll resubmit these for 4.7.

The most notable part here of course is rtl8xxxu with over 100 patches.
As the driver is new and under heavy development I think they are still
ok to take. Otherwise there are mostly fixes, with the exception of
adding a new debugfs file to wl18xx.

Please let me know if you have any problems.

Kalle

The following changes since commit 836856e3bd61d0644e5178a2c1b51d90459e2788:

  wireless: cw1200: use __maybe_unused to hide pm functions_ (2016-03-08 
12:32:52 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git 
tags/wireless-drivers-next-for-davem-2016-03-14

for you to fetch changes up to ccfe1e85322090649d2fae599e55300c1512bf15:

  rtl8xxxu: Temporarily disable 8192eu device init (2016-03-10 15:29:21 +0200)


wireless-drivers patches for 4.6

Major changes:

rtl8xxxu

* add 8723bu support

wl18xx

* add radar_debug_mode debugfs file for DFS testing


Amitkumar Karwar (1):
  mwifiex: Empty Tx queue during suspend

Ayala Beker (1):
  iwlwifi: mvm: update GSCAN capabilities

Chaya Rachel Ivgi (4):
  iwlwifi: mvm: fix unregistration of thermal in some error flows
  iwlwifi: mvm: add ctdp operations to debugfs
  iwlwifi: mvm: add support for async rx handler without hold the mutex
  iwlwifi: mvm: return the cooling state index instead of the budget

Dan Carpenter (1):
  libertas: fix an error code in probe

Eliad Peller (2):
  wlcore: don't WARN_ON in case of existing ROC
  wlcore/wl18xx: add radar_debug_mode handling

Emmanuel Grumbach (4):
  iwlwifi: mvm: avoid panics with thermal device usage
  iwlwifi: mvm: don't let NDPs mess the packet tracking
  iwlwifi: mvm: remove RRM advertisement
  iwlwifi: mvm: adapt the firmware assert log to new firmware

Gregory Greenman (1):
  iwlwifi: pcie: avoid restocks inside rx loop if not emergency

Hui Wang (1):
  brcmfmac: Remove waitqueue_active check

Jakub Sitnicki (5):
  rtl8xxxu: Don't check for illegal offset when reading from efuse
  rtl8xxxu: Skip disabled efuse words early
  rtl8xxxu: rtl8723au: Introduce a pointer to efuse
  rtl8xxxu: rtl8192cu: Introduce a pointer to efuse
  rtl8xxxu: rtl8192eu: Map out EFUSE TX power area

Jes Sorensen (108):
  rtl8xxxu: Add initial code to parse rtl8192eu efuse
  rtl8xxxu: Identify chip vendors correctly
  rtl8xxxu: Use 1024 byte block loads for 8192eu firmware
  rtl8xxxu: Add rtl8192eu_nic.bin to the MODULE_FIRMWARE list
  rtl8xxxu: Implment rtl8192eu_power_on()
  rtl8xxxu: Add rtl8xxxu_auto_llt_table()
  rtl8xxxu: Init page boundaries before starting the firmware
  rtl8xxxu: Init the LLT after we start the firmware
  rtl8xxxu: Fix incorrect test for auto LLT failure
  rtl8xxxu: Kludge to drop incorrect USB OUT EP for 8192EU
  rtl8xxxu: Init REG_HIMR[01] for 8192eu parts
  rtl8xxxu: Initial rtl8723bu chip identification
  rtl8xxxu: Add rtl8723bu_parse_efuse() and 8723bu efuse definition
  rtl8xxxu: Use 1024 byte writes for writing 8723bu firmware
  rtl8xxxu: Only setup USB interrupts for parts which support it
  rtl8xxxu: Add rtl8723b_phy_1t_init_table
  rtl8xxxu: Add rtl8723bu_radioa_1t_init_table
  rtl8xxxu: Add rtl8723bu_phy_init_antenna_selection()
  rtl8xxxu: Add rtl8723b_mac_init_table
  rtl8xxxu: Add 8723by AGC table
  rtl8xxxu: Handle 32 bit mailbox extension regs found on 8723bu/8192eu/8812
  rtl8xxxu: Add some missing register definitions for 8723bu
  rtl8xxxu: Group USB fixups together for all chips
  rtl8xxxu: Add definitions for new generation h2c commands
  rtl8xxxu: rtl8192eu_parse_efuse(): Use a pointer to the struct 
rtl8192eu_efuse
  rtl8xxxu: rtl8723bu_parse_efuse(): Use a pointer to the struct 
rtl8723bu_efuse
  rtl8xxxu: rtl8xxxu_h2c_cmd(): Add size argument
  rtl8xxxu: Do BT_WLAN_CALIBRATION before doing IQK calibration
  rtl8xxxu: Do not overwrite rtl8xxxu_debug for untested chips
  rtl8xxxu: Use correct formatting type to print sizeof()
  rtl8xxxu: Make rtl8xxxu_add_path_on() use device specific init values
  rtl8xxxu: Add a couple of new register definitions
  rtl8xxxu: First stab at adding IQK calibration for 8723bu parts
  rtl8xxxu: Handle S0S1 register in lc_calibrate()
  rtl8xxxu: Do LC calibration before IQK calibration
  rtl8xxxu: Remove backing up certain registers, which was never 

[PATCH v6 net-next 05/10] ARM: dts: armada-xp: enable buffer manager support on Armada XP boards

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

Since the mvneta driver supports hardware buffer management (BM), board
files have to be adjusted accordingly in order to use it. This commit
enables BM on AXP-DB and AXP-GP in the same manner: because the number of
ports on those boards is the same as the number of possible pools, each
port is supposed to use a single pool for all kinds of packets.

Moreover, an appropriate entry is added to the 'soc' node ranges, and the
'bm' and 'bm-bppi' (internal SRAM) nodes are set to "okay" status.

Signed-off-by: Marcin Wojtas 
Signed-off-by: Gregory CLEMENT 
---
 arch/arm/boot/dts/armada-xp-db.dts | 19 ++-
 arch/arm/boot/dts/armada-xp-gp.dts | 19 ++-
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/armada-xp-db.dts 
b/arch/arm/boot/dts/armada-xp-db.dts
index f774101416a5..30657302305d 100644
--- a/arch/arm/boot/dts/armada-xp-db.dts
+++ b/arch/arm/boot/dts/armada-xp-db.dts
@@ -77,7 +77,8 @@
  MBUS_ID(0x01, 0x1d) 0 0 0xfff0 0x10
  MBUS_ID(0x01, 0x2f) 0 0 0xf000 0x100
  MBUS_ID(0x09, 0x09) 0 0 0xf810 0x1
- MBUS_ID(0x09, 0x05) 0 0 0xf811 0x1>;
+ MBUS_ID(0x09, 0x05) 0 0 0xf811 0x1
+ MBUS_ID(0x0c, 0x04) 0 0 0xf120 0x10>;
 
devbus-bootcs {
status = "okay";
@@ -181,21 +182,33 @@
status = "okay";
phy = <>;
phy-mode = "rgmii-id";
+   buffer-manager = <>;
+   bm,pool-long = <0>;
};
ethernet@74000 {
status = "okay";
phy = <>;
phy-mode = "rgmii-id";
+   buffer-manager = <>;
+   bm,pool-long = <1>;
};
ethernet@3 {
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <2>;
};
ethernet@34000 {
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <3>;
+   };
+
+   bm@c {
+   status = "okay";
};
 
mvsdio@d4000 {
@@ -230,5 +243,9 @@
};
};
};
+
+   bm-bppi {
+   status = "okay";
+   };
};
 };
diff --git a/arch/arm/boot/dts/armada-xp-gp.dts 
b/arch/arm/boot/dts/armada-xp-gp.dts
index 4878d7353069..a1ded01d0c07 100644
--- a/arch/arm/boot/dts/armada-xp-gp.dts
+++ b/arch/arm/boot/dts/armada-xp-gp.dts
@@ -96,7 +96,8 @@
  MBUS_ID(0x01, 0x1d) 0 0 0xfff0 0x10
  MBUS_ID(0x01, 0x2f) 0 0 0xf000 0x100
  MBUS_ID(0x09, 0x09) 0 0 0xf810 0x1
- MBUS_ID(0x09, 0x05) 0 0 0xf811 0x1>;
+ MBUS_ID(0x09, 0x05) 0 0 0xf811 0x1
+ MBUS_ID(0x0c, 0x04) 0 0 0xf120 0x10>;
 
devbus-bootcs {
status = "okay";
@@ -196,21 +197,29 @@
status = "okay";
phy = <>;
phy-mode = "qsgmii";
+   buffer-manager = <>;
+   bm,pool-long = <0>;
};
ethernet@74000 {
status = "okay";
phy = <>;
phy-mode = "qsgmii";
+   buffer-manager = <>;
+   bm,pool-long = <1>;
};
ethernet@3 {
status = "okay";
phy = <>;
phy-mode = "qsgmii";
+   buffer-manager = <>;
+   bm,pool-long = <2>;
};
ethernet@34000 {
status = "okay";
phy = <>;
phy-mode = "qsgmii";
+  

[PATCH v6 net-next 03/10] ARM: dts: armada-38x: enable buffer manager support on Armada 38x boards

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

Since the mvneta driver supports hardware buffer management (BM), board
files have to be adjusted accordingly in order to use it. This commit
enables BM on:
* A385-DB-AP - each port has its own pool for long packets and a common
pool for short packets,
* A388-ClearFog - same as above,
* A388-DB - unique 'short' and 'long' pools are mapped to each port,
* A388-GP - same as above.

Moreover, an appropriate entry is added to the 'soc' node ranges, and the
'bm' and 'bm-bppi' (internal SRAM) nodes are set to "okay" status.

[gregory.clem...@free-electrons.com: add support for the ClearFog board]

Signed-off-by: Marcin Wojtas 
Signed-off-by: Gregory CLEMENT 
Acked-by: Russell King 
---
 arch/arm/boot/dts/armada-385-db-ap.dts  | 20 +++-
 arch/arm/boot/dts/armada-388-clearfog.dts   |  6 ++
 arch/arm/boot/dts/armada-388-db.dts | 17 -
 arch/arm/boot/dts/armada-388-gp.dts | 17 -
 arch/arm/boot/dts/armada-38x-solidrun-microsom.dtsi | 15 ++-
 5 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/arch/arm/boot/dts/armada-385-db-ap.dts 
b/arch/arm/boot/dts/armada-385-db-ap.dts
index acd5b1519edb..5f9451be21ff 100644
--- a/arch/arm/boot/dts/armada-385-db-ap.dts
+++ b/arch/arm/boot/dts/armada-385-db-ap.dts
@@ -61,7 +61,8 @@
ranges = ;
+ MBUS_ID(0x09, 0x15) 0 0xf111 0x1
+ MBUS_ID(0x0c, 0x04) 0 0xf120 0x10>;
 
internal-regs {
spi1: spi@10680 {
@@ -138,12 +139,18 @@
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <1>;
+   bm,pool-short = <3>;
};
 
ethernet@34000 {
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <2>;
+   bm,pool-short = <3>;
};
 
ethernet@7 {
@@ -157,6 +164,13 @@
status = "okay";
phy = <>;
phy-mode = "rgmii-id";
+   buffer-manager = <>;
+   bm,pool-long = <0>;
+   bm,pool-short = <3>;
+   };
+
+   bm@c8000 {
+   status = "okay";
};
 
nfc: flash@d {
@@ -178,6 +192,10 @@
};
};
 
+   bm-bppi {
+   status = "okay";
+   };
+
pcie-controller {
status = "okay";
 
diff --git a/arch/arm/boot/dts/armada-388-clearfog.dts 
b/arch/arm/boot/dts/armada-388-clearfog.dts
index c6e180eb3b11..c60206efb583 100644
--- a/arch/arm/boot/dts/armada-388-clearfog.dts
+++ b/arch/arm/boot/dts/armada-388-clearfog.dts
@@ -78,6 +78,9 @@
internal-regs {
ethernet@3 {
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <2>;
+   bm,pool-short = <1>;
status = "okay";
 
fixed-link {
@@ -88,6 +91,9 @@
 
ethernet@34000 {
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <3>;
+   bm,pool-short = <1>;
status = "okay";
 
fixed-link {
diff --git a/arch/arm/boot/dts/armada-388-db.dts 
b/arch/arm/boot/dts/armada-388-db.dts
index ff47af57f091..ea93ed727030 100644
--- a/arch/arm/boot/dts/armada-388-db.dts
+++ b/arch/arm/boot/dts/armada-388-db.dts
@@ -66,7 +66,8 @@
ranges = ;
+ MBUS_ID(0x09, 

[PATCH v6 net-next 04/10] ARM: dts: armada-xp: add buffer manager nodes

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

The Armada XP network controller supports hardware buffer management (BM).
Since it is now enabled in the mvneta driver, appropriate nodes can be added
to armada-xp.dtsi - for the actual common BM unit (bm@c) and its
internal SRAM (bm-bppi), which is used for indirect access to the buffer
pointer ring residing in DRAM.

The pool-to-port mapping, the bm-bppi entry in the 'soc' node's ranges and
optional parameters are supposed to be set in board files.

Signed-off-by: Marcin Wojtas 
Signed-off-by: Gregory CLEMENT 
---
 arch/arm/boot/dts/armada-xp.dtsi | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/armada-xp.dtsi b/arch/arm/boot/dts/armada-xp.dtsi
index be23196829bb..553349c07f28 100644
--- a/arch/arm/boot/dts/armada-xp.dtsi
+++ b/arch/arm/boot/dts/armada-xp.dtsi
@@ -253,6 +253,14 @@
marvell,crypto-sram-size = <0x800>;
};
 
+   bm: bm@c {
+   compatible = "marvell,armada-380-neta-bm";
+   reg = <0xc 0xac>;
+   clocks = < 13>;
+   internal-mem = <_bppi>;
+   status = "disabled";
+   };
+
xor@f0900 {
compatible = "marvell,orion-xor";
reg = <0xF0900 0x100
@@ -291,6 +299,17 @@
#size-cells = <1>;
ranges = <0 MBUS_ID(0x09, 0x05) 0 0x800>;
};
+
+   bm_bppi: bm-bppi {
+   compatible = "mmio-sram";
+   reg = ;
+   ranges = <0 MBUS_ID(0x0c, 0x04) 0 0x10>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   clocks = < 13>;
+   no-memory-wc;
+   status = "disabled";
+   };
};
 
clocks {
-- 
2.5.0



[PATCH v6 net-next 06/10] ARM: dts: armada-xp-openblocks-ax3-4: Add BM support

2016-03-14 Thread Gregory CLEMENT
Allow the OpenBlocks AX3 to use hardware buffer management with mvneta.

Signed-off-by: Gregory CLEMENT 
---
 arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts 
b/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
index a5db17782e08..3aa29a91c7b8 100644
--- a/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
+++ b/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
@@ -67,7 +67,8 @@
  MBUS_ID(0x01, 0x1d) 0 0 0xfff0 0x10
  MBUS_ID(0x01, 0x2f) 0 0 0xf000 0x800
  MBUS_ID(0x09, 0x09) 0 0 0xf810 0x1
- MBUS_ID(0x09, 0x05) 0 0 0xf811 0x1>;
+ MBUS_ID(0x09, 0x05) 0 0 0xf811 0x1
+ MBUS_ID(0x0c, 0x04) 0 0 0xd120 0x10>;
 
devbus-bootcs {
status = "okay";
@@ -176,21 +177,29 @@
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <0>;
};
ethernet@74000 {
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <1>;
};
ethernet@3 {
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <2>;
};
ethernet@34000 {
status = "okay";
phy = <>;
phy-mode = "sgmii";
+   buffer-manager = <>;
+   bm,pool-long = <3>;
};
i2c@11000 {
status = "okay";
@@ -219,6 +228,14 @@
usb@51000 {
status = "okay";
};
+
+   bm@c {
+   status = "okay";
+   };
+   };
+
+   bm-bppi {
+   status = "okay";
};
};
 };
-- 
2.5.0



[PATCH v3 4/9] net: arc: trivial: cleanup the emac driver

2016-03-14 Thread Caesar Wang
This patch makes the driver more readable.

The emac driver produces errors and warnings when checked with
'scripts/checkpatch.pl -f --subjective xxx'.

Let's clean up such trivial details.

Signed-off-by: Caesar Wang 
Cc: Jiri Kosina 
Cc: "David S. Miller" 
Cc: Alexander Kochetkov 
Cc: netdev@vger.kernel.org

---

Changes in v3:
- Add the Cc people.

Changes in v2:
- Address the build error noticed by the robot (overflow in implicit
  constant conversion).

 drivers/net/ethernet/arc/emac.h  | 54 +---
 drivers/net/ethernet/arc/emac_main.c | 35 ++---
 drivers/net/ethernet/arc/emac_mdio.c |  2 +-
 drivers/net/ethernet/arc/emac_rockchip.c | 41 +---
 4 files changed, 75 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/arc/emac.h b/drivers/net/ethernet/arc/emac.h
index 1a40403..ca562bc 100644
--- a/drivers/net/ethernet/arc/emac.h
+++ b/drivers/net/ethernet/arc/emac.h
@@ -14,36 +14,36 @@
 #include 
 
 /* STATUS and ENABLE Register bit masks */
-#define TXINT_MASK (1<<0)  /* Transmit interrupt */
-#define RXINT_MASK (1<<1)  /* Receive interrupt */
-#define ERR_MASK   (1<<2)  /* Error interrupt */
-#define TXCH_MASK  (1<<3)  /* Transmit chaining error interrupt */
-#define MSER_MASK  (1<<4)  /* Missed packet counter error */
-#define RXCR_MASK  (1<<8)  /* RXCRCERR counter rolled over  */
-#define RXFR_MASK  (1<<9)  /* RXFRAMEERR counter rolled over */
-#define RXFL_MASK  (1<<10) /* RXOFLOWERR counter rolled over */
-#define MDIO_MASK  (1<<12) /* MDIO complete interrupt */
-#define TXPL_MASK  (1<<31) /* Force polling of BD by EMAC */
+#define TXINT_MASK (1 << 0)/* Transmit interrupt */
+#define RXINT_MASK (1 << 1)/* Receive interrupt */
+#define ERR_MASK   (1 << 2)/* Error interrupt */
+#define TXCH_MASK  (1 << 3)/* Transmit chaining error interrupt */
+#define MSER_MASK  (1 << 4)/* Missed packet counter error */
+#define RXCR_MASK  (1 << 8)/* RXCRCERR counter rolled over  */
+#define RXFR_MASK  (1 << 9)/* RXFRAMEERR counter rolled over */
+#define RXFL_MASK  (1 << 10)   /* RXOFLOWERR counter rolled over */
+#define MDIO_MASK  (1 << 12)   /* MDIO complete interrupt */
+#define TXPL_MASK  (1 << 31)   /* Force polling of BD by EMAC */
 
 /* CONTROL Register bit masks */
-#define EN_MASK(1<<0)  /* VMAC enable */
-#define TXRN_MASK  (1<<3)  /* TX enable */
-#define RXRN_MASK  (1<<4)  /* RX enable */
-#define DSBC_MASK  (1<<8)  /* Disable receive broadcast */
-#define ENFL_MASK  (1<<10) /* Enable Full-duplex */
-#define PROM_MASK  (1<<11) /* Promiscuous mode */
+#define EN_MASK(1 << 0)/* VMAC enable */
+#define TXRN_MASK  (1 << 3)/* TX enable */
+#define RXRN_MASK  (1 << 4)/* RX enable */
+#define DSBC_MASK  (1 << 8)/* Disable receive broadcast */
+#define ENFL_MASK  (1 << 10)   /* Enable Full-duplex */
+#define PROM_MASK  (1 << 11)   /* Promiscuous mode */
 
 /* Buffer descriptor INFO bit masks */
-#define OWN_MASK   (1<<31) /* 0-CPU owns buffer, 1-EMAC owns buffer */
-#define FIRST_MASK (1<<16) /* First buffer in chain */
-#define LAST_MASK  (1<<17) /* Last buffer in chain */
+#define OWN_MASK   (1 << 31)   /* 0-CPU or 1-EMAC owns buffer */
+#define FIRST_MASK (1 << 16)   /* First buffer in chain */
+#define LAST_MASK  (1 << 17)   /* Last buffer in chain */
 #define LEN_MASK   0x07FF  /* last 11 bits */
-#define CRLS   (1<<21)
-#define DEFR   (1<<22)
-#define DROP   (1<<23)
-#define RTRY   (1<<24)
-#define LTCL   (1<<28)
-#define UFLO   (1<<29)
+#define CRLS   (1 << 21)
+#define DEFR   (1 << 22)
+#define DROP   (1 << 23)
+#define RTRY   (1 << 24)
+#define LTCL   (1 << 28)
+#define UFLO   (1 << 29)
 
 #define FOR_EMAC   OWN_MASK
 #define FOR_CPU0
@@ -66,7 +66,7 @@ enum {
R_MDIO,
 };
 
-#define TX_TIMEOUT (400*HZ/1000)   /* Transmission timeout */
+#define TX_TIMEOUT (400 * HZ / 1000) /* Transmission timeout */
 
 #define ARC_EMAC_NAPI_WEIGHT   40  /* Workload for NAPI */
 
@@ -196,6 +196,7 @@ static inline unsigned int arc_reg_get(struct arc_emac_priv 
*priv, int reg)
 static inline void arc_reg_or(struct arc_emac_priv *priv, int reg, int mask)
 {
unsigned int value = arc_reg_get(priv, reg);
+
arc_reg_set(priv, reg, value | mask);
 }
 
@@ -211,6 +212,7 @@ static inline void arc_reg_or(struct arc_emac_priv *priv, 
int reg, int mask)
 static inline void arc_reg_clr(struct arc_emac_priv *priv, int reg, int mask)
 {
unsigned int value = arc_reg_get(priv, reg);
+
   

[PATCH v3 8/9] clk: rockchip: associate SCLK_MAC_PLL and disable reparenting on rk3036

2016-03-14 Thread Caesar Wang
From: Heiko Stuebner 

The emac needs a constant and very specific rate, but the possible PLL
sources are very limited, so we expect the PLL source to be set manually
per board and don't want it to be changed automatically later.
So add the necessary clock-id and disable reparenting on set_rate calls.

Signed-off-by: Heiko Stuebner 
Cc: Michael Turquette 
Cc: Heiko Stuebner 
Cc: Stephen Boyd 
Cc: linux-...@vger.kernel.org

Signed-off-by: Caesar Wang 
---

Changes in v3:
- Add the Cc people.

Changes in v2: None

 drivers/clk/rockchip/clk-rk3036.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/rockchip/clk-rk3036.c 
b/drivers/clk/rockchip/clk-rk3036.c
index cc66e5f..7cdb2d6 100644
--- a/drivers/clk/rockchip/clk-rk3036.c
+++ b/drivers/clk/rockchip/clk-rk3036.c
@@ -348,7 +348,7 @@ static struct rockchip_clk_branch rk3036_clk_branches[] 
__initdata = {
RK2928_CLKSEL_CON(16), 0, 2, MFLAGS, 2, 5, DFLAGS,
RK2928_CLKGATE_CON(10), 5, GFLAGS),
 
-   COMPOSITE_NOGATE(0, "mac_pll_src", mux_pll_src_3plls_p, 0,
+   COMPOSITE_NOGATE(SCLK_MACPLL, "mac_pll_src", mux_pll_src_3plls_p, 
CLK_SET_RATE_NO_REPARENT,
RK2928_CLKSEL_CON(21), 0, 2, MFLAGS, 9, 5, DFLAGS),
MUX(SCLK_MACREF, "mac_clk_ref", mux_mac_p, CLK_SET_RATE_PARENT,
RK2928_CLKSEL_CON(21), 3, 1, MFLAGS),
-- 
1.9.1



[PATCH v3 9/9] ARM: dts: rockchip: add to support emac for rk3036 SoCs

2016-03-14 Thread Caesar Wang
From: Xing Zheng 

This patch adds the emac device node for rk3036 SoCs.
We need to parent the mac clock to the DPLL, which is able to provide
the accurate 50MHz that mac_ref needs; leaving it on the APLL would
cause instability when cpufreq is working.

Signed-off-by: Xing Zheng 
Signed-off-by: Caesar Wang 
Cc: linux-rockc...@lists.infradead.org
Cc: Xing Zheng 
Cc: Heiko Stuebner 
Cc: linux-arm-ker...@lists.infradead.org

---

Changes in v3:
- rename reset-gpio to phy-reset-gpios.
- change the commit.
- remove the pcfg_output_high, that's really not needed for emac.
- Add the Cc people.
- Fixes the 'zhengxing' to 'Xing Zheng'.

Changes in v2:
- rename phy-reset-gpio to reset-gpios.

 arch/arm/boot/dts/rk3036-evb.dts   | 14 ++
 arch/arm/boot/dts/rk3036-kylin.dts | 14 ++
 arch/arm/boot/dts/rk3036.dtsi  | 39 ++
 3 files changed, 67 insertions(+)

diff --git a/arch/arm/boot/dts/rk3036-evb.dts b/arch/arm/boot/dts/rk3036-evb.dts
index 28a0336..b3d6ec8 100644
--- a/arch/arm/boot/dts/rk3036-evb.dts
+++ b/arch/arm/boot/dts/rk3036-evb.dts
@@ -47,6 +47,20 @@
compatible = "rockchip,rk3036-evb", "rockchip,rk3036";
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_xfer>, <_mdio>;
+   phy = <>;
+   phy-reset-gpios = < 22 GPIO_ACTIVE_LOW>; /* PHY_RST */
+   phy-reset-duration = <10>; /* millisecond */
+
+   status = "okay";
+
+   phy0: ethernet-phy@0 {
+   reg = <0>;
+   };
+};
+
  {
status = "okay";
 
diff --git a/arch/arm/boot/dts/rk3036-kylin.dts 
b/arch/arm/boot/dts/rk3036-kylin.dts
index eb9c979..951f15d 100644
--- a/arch/arm/boot/dts/rk3036-kylin.dts
+++ b/arch/arm/boot/dts/rk3036-kylin.dts
@@ -112,6 +112,20 @@
status = "okay";
 };
 
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_xfer>, <_mdio>;
+   phy = <>;
+   phy-reset-gpios = < 22 GPIO_ACTIVE_LOW>; /* PHY_RST */
+   phy-reset-duration = <10>; /* millisecond */
+
+   status = "okay";
+
+   phy0: ethernet-phy@0 {
+   reg = <0>;
+   };
+};
+
  {
status = "okay";
 };
diff --git a/arch/arm/boot/dts/rk3036.dtsi b/arch/arm/boot/dts/rk3036.dtsi
index 90faa86..5175a2a 100644
--- a/arch/arm/boot/dts/rk3036.dtsi
+++ b/arch/arm/boot/dts/rk3036.dtsi
@@ -223,6 +223,27 @@
status = "disabled";
};
 
+   emac: ethernet@1020 {
+   compatible = "rockchip,rk3036-emac", "snps,arc-emac";
+   reg = <0x1020 0x4000>;
+   interrupts = ;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   rockchip,grf = <>;
+   clocks = < HCLK_MAC>, < SCLK_MACREF>, < SCLK_MAC>;
+   clock-names = "hclk", "macref", "macclk";
+   /*
+* Fix the emac parent clock is DPLL instead of APLL.
+* since that will cause some unstable things if the cpufreq
+* is working. (e.g: the accurate 50MHz what mac_ref need)
+*/
+   assigned-clocks = < SCLK_MACPLL>;
+   assigned-clock-parents = < PLL_DPLL>;
+   max-speed = <100>;
+   phy-mode = "rmii";
+   status = "disabled";
+   };
+
sdmmc: dwmmc@10214000 {
compatible = "rockchip,rk3036-dw-mshc", 
"rockchip,rk3288-dw-mshc";
reg = <0x10214000 0x4000>;
@@ -628,6 +649,24 @@
};
};
 
+   emac {
+   emac_xfer: emac-xfer {
+   rockchip,pins = <2 10 RK_FUNC_1 
_pull_default>, /* crs_dvalid */
+   <2 13 RK_FUNC_1 
_pull_default>, /* tx_en */
+   <2 14 RK_FUNC_1 
_pull_default>, /* mac_clk */
+   <2 15 RK_FUNC_1 
_pull_default>, /* rx_err */
+   <2 16 RK_FUNC_1 
_pull_default>, /* rxd1 */
+   <2 17 RK_FUNC_1 
_pull_default>, /* rxd0 */
+   <2 18 RK_FUNC_1 
_pull_default>, /* txd1 */
+   <2 19 RK_FUNC_1 
_pull_default>; /* txd0 */
+   };
+
+   emac_mdio: emac-mdio {
+   rockchip,pins = <2 12 RK_FUNC_1 
_pull_default>, /* mac_md */
+   <2 25 RK_FUNC_1 
_pull_default>; /* mac_mdclk */
+   };
+   };
+
i2c0 {
i2c0_xfer: i2c0-xfer {
rockchip,pins = <0 0 RK_FUNC_1 _pull_none>,
-- 
1.9.1



[PATCH v3 6/9] clk: rockchip: associate the rk3036 HCLK_EMAC clock-id

2016-03-14 Thread Caesar Wang
From: Xing Zheng 

Associate the new clock id with the clock.

Signed-off-by: Xing Zheng 
Signed-off-by: Caesar Wang 
Cc: Xing Zheng 
Cc: Michael Turquette 
Cc: Heiko Stuebner 
Cc: Stephen Boyd 
Cc: linux-...@vger.kernel.org
Cc: linux-rockc...@lists.infradead.org

---

Changes in v3:
- Add the Cc people.

Changes in v2: None

 drivers/clk/rockchip/clk-rk3036.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/rockchip/clk-rk3036.c 
b/drivers/clk/rockchip/clk-rk3036.c
index 0703c8f..cc66e5f 100644
--- a/drivers/clk/rockchip/clk-rk3036.c
+++ b/drivers/clk/rockchip/clk-rk3036.c
@@ -408,7 +408,7 @@ static struct rockchip_clk_branch rk3036_clk_branches[] 
__initdata = {
GATE(HCLK_OTG1, "hclk_otg1", "hclk_peri", CLK_IGNORE_UNUSED, 
RK2928_CLKGATE_CON(7), 3, GFLAGS),
GATE(HCLK_I2S, "hclk_i2s", "hclk_peri", 0, RK2928_CLKGATE_CON(7), 2, 
GFLAGS),
GATE(0, "hclk_sfc", "hclk_peri", CLK_IGNORE_UNUSED, 
RK2928_CLKGATE_CON(3), 14, GFLAGS),
-   GATE(0, "hclk_mac", "hclk_peri", CLK_IGNORE_UNUSED, 
RK2928_CLKGATE_CON(3), 15, GFLAGS),
+   GATE(HCLK_MAC, "hclk_mac", "hclk_peri", 0, RK2928_CLKGATE_CON(3), 5, 
GFLAGS),
 
/* pclk_peri gates */
GATE(0, "pclk_peri_matrix", "pclk_peri", CLK_IGNORE_UNUSED, 
RK2928_CLKGATE_CON(4), 1, GFLAGS),
-- 
1.9.1



RE: [net-next,iproute2] netconf: add support for ignore route attribute

2016-03-14 Thread 张胜举
> On Mon, 14 Mar 2016 04:55:36 +
> Zhang Shengju  wrote:
> 
> > Add support for ignore_routes_with_linkdown attribute.
> >
> > Signed-off-by: Zhang Shengju 
> > ---
> >  ip/ipnetconf.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/ip/ipnetconf.c b/ip/ipnetconf.c index eca6eee..6fec818
> > 100644
> > --- a/ip/ipnetconf.c
> > +++ b/ip/ipnetconf.c
> > @@ -119,6 +119,10 @@ int print_netconf(const struct sockaddr_nl *who,
> struct rtnl_ctrl_data *ctrl,
> > fprintf(fp, "proxy_neigh %s ",
> > *(int
> *)RTA_DATA(tb[NETCONFA_PROXY_NEIGH])?"on":"off");
> >
> > +   if (tb[NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN])
> > +   fprintf(fp, "ignore_routes_with_linkdown %s ",
> > +   *(int
> >
> +*)RTA_DATA(tb[NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN])?"on":"
> off");
> 
> This is a good idea.
> 
> But the option name is too long, and the code does not follow current best
> practices.
I agree that the name is too long, but I can't think of a shorter one.
Any suggestions? What about "ignore_routes"?

>   1. Lines are too long
>   2. There needs to be whitespace around ? :
>   3. There are helper routines (rte_getattr_XXX) which should be used
>      rather than cast RTE_DATA directly.
> 
> Also, help and man page??
Yes, the man page needs to be updated.
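
To make the review points concrete, here is a minimal sketch of a typed
accessor used instead of a raw cast, with whitespace around the ternary
operator. struct demo_attr, demo_getattr_u32() and demo_onoff() are
hypothetical stand-ins written for this illustration; they are not the
iproute2 API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a netlink attribute: a length plus payload. */
struct demo_attr {
	uint16_t len;            /* payload length in bytes */
	unsigned char data[8];   /* payload bytes */
};

/* Typed accessor: copies the payload out instead of casting the pointer,
 * which avoids alignment and aliasing problems. */
static uint32_t demo_getattr_u32(const struct demo_attr *a)
{
	uint32_t v;

	assert(a->len >= sizeof(v));
	memcpy(&v, a->data, sizeof(v));
	return v;
}

/* Render an on/off flag, with whitespace around the ternary operator. */
static const char *demo_onoff(const struct demo_attr *a)
{
	return demo_getattr_u32(a) ? "on" : "off";
}
```

The accessor keeps the call site short, which also helps with the line
length complaint.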







[PATCH v6 net-next 08/10] net: mvneta: bm: add support for hardware buffer management

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

Buffer manager (BM) is a dedicated hardware unit that can be used by all
ethernet ports of Armada XP and 38x SoCs. It offloads the CPU on the RX
path by sparing DRAM accesses when refilling the buffer pool, by filling
descriptor ring data in hardware, and by improving memory utilization
thanks to HW arbitration that uses 'short' pools for small packets.

Tests performed with an A388 SoC working as a network bridge between two
packet generators showed an increase in the maximum number of processed 64B
packets of ~20k (~555k packets with BM enabled vs ~535k packets without BM).
Also, when pushing 1500B packets at line rate, CPU load decreased from
around 25% without BM to 20% with BM.

BM comprises up to 4 buffer pointer (BP) rings kept in DRAM, which
are called external BP pools - BPPE. Allocating and releasing buffer
pointers (BP) to/from BPPE is performed indirectly by write/read access
to a dedicated internal SRAM, where internal BP pools (BPPI) are placed.
BM hardware controls status of BPPE automatically, as well as assigning
proper buffers to RX descriptors. For more details please refer to
Functional Specification of Armada XP or 38x SoC.

In order to enable support for a separate hardware block, common for all
ports, a new driver has to be implemented ('mvneta_bm'). It initializes the
address space, clocks, registers, SRAM and empty pool structures, and
obtains optional configuration from DT (please refer to the device tree
binding documentation). mvneta_bm also exposes the necessary API to the
mvneta driver, as well as a dedicated structure with BM information
(bm_priv), whose presence is used as a flag indicating BM usage by a port.
The mvneta_bm probe must run before the probe of the ports' driver. If BM
is not used or its probe fails, mvneta falls back to software buffer
management.

The sequence executed in the mvneta_probe function is modified so that the
needed resources are available before a port's BM initialization is done.
According to the port-pool mapping provided by DT, the appropriate registers
are configured and the buffer pools are filled. The RX path is modified
accordingly. Because the hardware allows a wide variety of configuration
options, the following assumptions are made:
* BM can be selectively enabled/disabled per port, based on DT
  configuration
* 'long' pool's single buffer size is tied to port's MTU
* using 'long' pool by port is obligatory and it cannot be shared
* using 'short' pool for smaller packets is optional
* one 'short' pool can be shared among all ports
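
As a rough illustration of the pool assumptions listed above, the
selection rule can be modelled in plain C. This is only a sketch: the real
choice is made by the BM hardware, and struct demo_port_pools,
enum demo_pool and demo_select_pool() are hypothetical names that do not
appear in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port pool configuration; names are illustrative only. */
struct demo_port_pools {
	bool has_short;          /* the 'short' pool is optional */
	size_t short_threshold;  /* packets below this size use the short pool */
	size_t mtu;              /* 'long' pool buffers are sized from the MTU */
};

enum demo_pool { DEMO_POOL_NONE, DEMO_POOL_SHORT, DEMO_POOL_LONG };

/* Model of the rule: small packets go to the optional short pool,
 * everything else up to the MTU goes to the mandatory long pool. */
static enum demo_pool demo_select_pool(const struct demo_port_pools *p,
				       size_t pkt_len)
{
	assert(p != NULL);
	if (pkt_len > p->mtu)
		return DEMO_POOL_NONE;   /* larger than any managed buffer */
	if (p->has_short && pkt_len < p->short_threshold)
		return DEMO_POOL_SHORT;
	return DEMO_POOL_LONG;
}
```

With no short pool configured, every packet up to the MTU lands in the
long pool, which matches the "optional" wording above.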

This commit enables hardware buffer management operation cooperating with
existing mvneta driver. New device tree binding documentation is added and
the one of mvneta is updated accordingly.

[gregory.clem...@free-electrons.com: removed the suspend/resume part]

Signed-off-by: Marcin Wojtas 
Signed-off-by: Gregory CLEMENT 
---
 .../bindings/net/marvell-armada-370-neta.txt   |  19 +-
 .../devicetree/bindings/net/marvell-neta-bm.txt|  49 ++
 drivers/net/ethernet/marvell/Kconfig   |  13 +
 drivers/net/ethernet/marvell/Makefile  |   1 +
 drivers/net/ethernet/marvell/mvneta.c  | 507 +--
 drivers/net/ethernet/marvell/mvneta_bm.c   | 546 +
 drivers/net/ethernet/marvell/mvneta_bm.h   | 189 +++
 7 files changed, 1285 insertions(+), 39 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/marvell-neta-bm.txt
 create mode 100644 drivers/net/ethernet/marvell/mvneta_bm.c
 create mode 100644 drivers/net/ethernet/marvell/mvneta_bm.h

diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt 
b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index d0cb8693963b..73be8970815e 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -18,15 +18,30 @@ Optional properties:
   "core" for core clock and "bus" for the optional bus clock.
 
 
+Optional properties (valid only for Armada XP/38x):
+
+- buffer-manager: a phandle to a buffer manager node. Please refer to
+  Documentation/devicetree/bindings/net/marvell-neta-bm.txt
+- bm,pool-long: ID of a pool, that will accept all packets of a size
+  higher than 'short' pool's threshold (if set) and up to MTU value.
+  Obligatory, when the port is supposed to use hardware
+  buffer management.
+- bm,pool-short: ID of a pool, that will be used for accepting
+  packets of a size lower than given threshold. If not set, the port
+  will use a single 'long' pool for all packets, as defined above.
+
 Example:
 
-ethernet@d007 {
+ethernet@7 {
compatible = "marvell,armada-370-neta";
-   reg = <0xd007 0x2500>;
+   reg = <0x7 0x2500>;
interrupts = <8>;
clocks = <_clk 

[PATCH v6 net-next 09/10] net: add a hardware buffer management helper API

2016-03-14 Thread Gregory CLEMENT
This basic implementation allows code to be shared between drivers using
hardware buffer management. As the code is hardware agnostic, there are
only a few helpers; most of the optimization brought by a HW BM has to be
done at the driver level.
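
For readers who want the shape of the API at a glance, here is a small
user-space model of the refill-plus-callback idea. It is a sketch under
assumptions: malloc()/free() stand in for the kernel allocators, and
struct demo_pool, demo_pool_refill() and demo_count_construct() are
hypothetical names, not the helpers added by this patch.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_pool;
typedef int (*demo_construct_fn)(struct demo_pool *pool, void *buf);

/* User-space model of a buffer pool; fields loosely follow the patch. */
struct demo_pool {
	int size;        /* capacity of the pool */
	int frag_size;   /* size of each buffer */
	int buf_num;     /* buffers currently owned by the "hardware" */
	demo_construct_fn construct;   /* driver hook, takes ownership on success */
	void *priv;
};

/* Allocate one buffer and hand it to the driver hook. Returns 0 or -1. */
static int demo_pool_refill(struct demo_pool *pool)
{
	void *buf;

	assert(pool->construct);
	buf = malloc((size_t)pool->frag_size);
	if (!buf)
		return -1;
	if (pool->construct(pool, buf)) {
		free(buf);   /* the hook rejected the buffer, so we still own it */
		return -1;
	}
	return 0;
}

/* Example hook: count the buffer, then release it, because this model has
 * no hardware ring to store it in. */
static int demo_count_construct(struct demo_pool *pool, void *buf)
{
	if (pool->buf_num >= pool->size)
		return -1;
	pool->buf_num++;
	free(buf);
	return 0;
}
```

The generic part only knows how to allocate and free; everything specific
to the device lives behind the callback.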

Tested-by: Sebastian Careba 
Signed-off-by: Gregory CLEMENT 
---
 include/net/hwbm.h | 28 ++
 net/Kconfig|  3 ++
 net/core/Makefile  |  1 +
 net/core/hwbm.c| 87 ++
 4 files changed, 119 insertions(+)
 create mode 100644 include/net/hwbm.h
 create mode 100644 net/core/hwbm.c

diff --git a/include/net/hwbm.h b/include/net/hwbm.h
new file mode 100644
index ..47d08662501b
--- /dev/null
+++ b/include/net/hwbm.h
@@ -0,0 +1,28 @@
+#ifndef _HWBM_H
+#define _HWBM_H
+
+struct hwbm_pool {
+   /* Capacity of the pool */
+   int size;
+   /* Size of the buffers managed */
+   int frag_size;
+   /* Number of buffers currently used by this pool */
+   int buf_num;
+   /* constructor called during allocation */
+   int (*construct)(struct hwbm_pool *bm_pool, void *buf);
+   /* protect access to the buffer counter */
+   spinlock_t lock;
+   /* private data */
+   void *priv;
+};
+#ifdef CONFIG_HWBM
+void hwbm_buf_free(struct hwbm_pool *bm_pool, void *buf);
+int hwbm_pool_refill(struct hwbm_pool *bm_pool, gfp_t gfp);
+int hwbm_pool_add(struct hwbm_pool *bm_pool, unsigned int buf_num, gfp_t gfp);
+#else
+void hwbm_buf_free(struct hwbm_pool *bm_pool, void *buf) {}
+int hwbm_pool_refill(struct hwbm_pool *bm_pool, gfp_t gfp) { return 0; }
+int hwbm_pool_add(struct hwbm_pool *bm_pool, unsigned int buf_num, gfp_t gfp)
+{ return 0; }
+#endif /* CONFIG_HWBM */
+#endif /* _HWBM_H */
diff --git a/net/Kconfig b/net/Kconfig
index 10640d5f8bee..e13449870d06 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -253,6 +253,9 @@ config XPS
depends on SMP
default y
 
+config HWBM
+   bool
+
 config SOCK_CGROUP_DATA
bool
default n
diff --git a/net/core/Makefile b/net/core/Makefile
index 014422e2561f..d6508c2ddca5 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -25,4 +25,5 @@ obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o
 obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o
 obj-$(CONFIG_LWTUNNEL) += lwtunnel.o
 obj-$(CONFIG_DST_CACHE) += dst_cache.o
+obj-$(CONFIG_HWBM) += hwbm.o
 obj-$(CONFIG_NET_DEVLINK) += devlink.o
diff --git a/net/core/hwbm.c b/net/core/hwbm.c
new file mode 100644
index ..941c28486896
--- /dev/null
+++ b/net/core/hwbm.c
@@ -0,0 +1,87 @@
+/* Support for hardware buffer manager.
+ *
+ * Copyright (C) 2016 Marvell
+ *
+ * Gregory CLEMENT 
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ */
+#include 
+#include 
+#include 
+#include 
+
+void hwbm_buf_free(struct hwbm_pool *bm_pool, void *buf)
+{
+   if (likely(bm_pool->frag_size <= PAGE_SIZE))
+   skb_free_frag(buf);
+   else
+   kfree(buf);
+}
+EXPORT_SYMBOL_GPL(hwbm_buf_free);
+
+/* Refill processing for HW buffer management */
+int hwbm_pool_refill(struct hwbm_pool *bm_pool, gfp_t gfp)
+{
+   int frag_size = bm_pool->frag_size;
+   void *buf;
+
+   if (likely(frag_size <= PAGE_SIZE))
+   buf = netdev_alloc_frag(frag_size);
+   else
+   buf = kmalloc(frag_size, gfp);
+
+   if (!buf)
+   return -ENOMEM;
+
+   if (bm_pool->construct)
+   if (bm_pool->construct(bm_pool, buf)) {
+   hwbm_buf_free(bm_pool, buf);
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(hwbm_pool_refill);
+
+int hwbm_pool_add(struct hwbm_pool *bm_pool, unsigned int buf_num, gfp_t gfp)
+{
+   int err, i;
+   unsigned long flags;
+
+   spin_lock_irqsave(&bm_pool->lock, flags);
+   if (bm_pool->buf_num == bm_pool->size) {
+   pr_warn("pool already filled\n");
+   return bm_pool->buf_num;
+   }
+
+   if (buf_num + bm_pool->buf_num > bm_pool->size) {
+   pr_warn("cannot allocate %d buffers for pool\n",
+   buf_num);
+   return 0;
+   }
+
+   if ((buf_num + bm_pool->buf_num) < bm_pool->buf_num) {
+   pr_warn("Adding %d buffers to the %d current buffers will 
overflow\n",
+   buf_num,  bm_pool->buf_num);
+   return 0;
+   }
+
+   for (i = 0; i < buf_num; i++) {
+   err = hwbm_pool_refill(bm_pool, gfp);
+   if (err < 0)
+   break;
+   }
+
+   /* Update BM driver with number of buffers added to pool */
+   

[PATCH v6 net-next 01/10] misc: sram: add optional ioremap without write combining

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

Some SRAM users may require non-bufferable access to the memory, which is
impossible, because devm_ioremap_wc() is used for setting sram->virt_base.

This commit adds an optional flag, 'no-memory-wc', which allows the remap
method to be chosen via a DT property. Documentation is updated accordingly.
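
The decision itself is tiny; a user-space sketch of it follows. The
property list is a stand-in for a device tree node, and demo_has_prop(),
demo_pick_mapping() and enum demo_map_mode are hypothetical names used
only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum demo_map_mode { DEMO_MAP_WC, DEMO_MAP_UNCACHED };

/* Return true if 'name' appears in a NULL-terminated list of boolean
 * properties (a stand-in for a device tree node). */
static bool demo_has_prop(const char *const *props, const char *name)
{
	assert(props && name);
	for (; *props; props++)
		if (strcmp(*props, name) == 0)
			return true;
	return false;
}

/* Write combining stays the default; the flag opts out of it. */
static enum demo_map_mode demo_pick_mapping(const char *const *props)
{
	return demo_has_prop(props, "no-memory-wc") ? DEMO_MAP_UNCACHED
						     : DEMO_MAP_WC;
}
```

Keeping write combining as the default means existing device trees behave
as before.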

Signed-off-by: Marcin Wojtas 
---
 Documentation/devicetree/bindings/sram/sram.txt | 5 +
 drivers/misc/sram.c | 5 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/sram/sram.txt 
b/Documentation/devicetree/bindings/sram/sram.txt
index 42ee9438b771..227e3a341af1 100644
--- a/Documentation/devicetree/bindings/sram/sram.txt
+++ b/Documentation/devicetree/bindings/sram/sram.txt
@@ -25,6 +25,11 @@ Required properties in the sram node:
 - ranges : standard definition, should translate from local addresses
within the sram to bus addresses
 
+Optional properties in the sram node:
+
+- no-memory-wc : the flag indicating, that SRAM memory region has not to
+ be remapped as write combining. WC is used by default.
+
 Required properties in the area nodes:
 
 - reg : iomem address range, relative to the SRAM range
diff --git a/drivers/misc/sram.c b/drivers/misc/sram.c
index 736dae715dbf..69cdabea9c03 100644
--- a/drivers/misc/sram.c
+++ b/drivers/misc/sram.c
@@ -360,7 +360,10 @@ static int sram_probe(struct platform_device *pdev)
return -EBUSY;
}
 
-   sram->virt_base = devm_ioremap_wc(sram->dev, res->start, size);
+   if (of_property_read_bool(pdev->dev.of_node, "no-memory-wc"))
+   sram->virt_base = devm_ioremap(sram->dev, res->start, size);
+   else
+   sram->virt_base = devm_ioremap_wc(sram->dev, res->start, size);
if (IS_ERR(sram->virt_base))
return PTR_ERR(sram->virt_base);
 
-- 
2.5.0



[PATCH v6 net-next 10/10] net: mvneta: Use the new hwbm framework

2016-03-14 Thread Gregory CLEMENT
Now that the hardware buffer management framework has been introduced,
let's use it.

Tested-by: Sebastian Careba 
Signed-off-by: Gregory CLEMENT 
---
 drivers/net/ethernet/marvell/Kconfig |   1 +
 drivers/net/ethernet/marvell/mvneta.c|  18 +++--
 drivers/net/ethernet/marvell/mvneta_bm.c | 125 ---
 drivers/net/ethernet/marvell/mvneta_bm.h |  17 ++---
 4 files changed, 49 insertions(+), 112 deletions(-)

diff --git a/drivers/net/ethernet/marvell/Kconfig 
b/drivers/net/ethernet/marvell/Kconfig
index ac6605c62f46..62d80fddbe34 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -43,6 +43,7 @@ config MVMDIO
 config MVNETA_BM
tristate "Marvell Armada 38x/XP network interface BM support"
depends on MVNETA
+   select HWBM
---help---
  This driver supports auxiliary block of the network
  interface units in the Marvell ARMADA XP and ARMADA 38x SoC
diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index 2847c0c291de..3d8e7d357ec9 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "mvneta_bm.h"
 #include 
 #include 
@@ -1026,11 +1027,12 @@ static int mvneta_bm_port_init(struct platform_device 
*pdev,
 static void mvneta_bm_update_mtu(struct mvneta_port *pp, int mtu)
 {
struct mvneta_bm_pool *bm_pool = pp->pool_long;
+   struct hwbm_pool *hwbm_pool = &bm_pool->hwbm_pool;
int num;
 
/* Release all buffers from long pool */
mvneta_bm_bufs_free(pp->bm_priv, bm_pool, 1 << pp->id);
-   if (bm_pool->buf_num) {
+   if (hwbm_pool->buf_num) {
WARN(1, "cannot free all buffers in pool %d\n",
 bm_pool->id);
goto bm_mtu_err;
@@ -1038,14 +1040,14 @@ static void mvneta_bm_update_mtu(struct mvneta_port 
*pp, int mtu)
 
bm_pool->pkt_size = MVNETA_RX_PKT_SIZE(mtu);
bm_pool->buf_size = MVNETA_RX_BUF_SIZE(bm_pool->pkt_size);
-   bm_pool->frag_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
- SKB_DATA_ALIGN(MVNETA_RX_BUF_SIZE(bm_pool->pkt_size));
+   hwbm_pool->frag_size = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
+   SKB_DATA_ALIGN(MVNETA_RX_BUF_SIZE(bm_pool->pkt_size));
 
/* Fill entire long pool */
-   num = mvneta_bm_bufs_add(pp->bm_priv, bm_pool, bm_pool->size);
-   if (num != bm_pool->size) {
+   num = hwbm_pool_add(hwbm_pool, hwbm_pool->size, GFP_ATOMIC);
+   if (num != hwbm_pool->size) {
WARN(1, "pool %d: %d of %d allocated\n",
-bm_pool->id, num, bm_pool->size);
+bm_pool->id, num, hwbm_pool->size);
goto bm_mtu_err;
}
mvneta_bm_pool_bufsize_set(pp, bm_pool->buf_size, bm_pool->id);
@@ -2066,14 +2068,14 @@ err_drop_frame:
}
 
/* Refill processing */
-   err = mvneta_bm_pool_refill(pp->bm_priv, bm_pool);
+   err = hwbm_pool_refill(&bm_pool->hwbm_pool, GFP_ATOMIC);
if (err) {
netdev_err(dev, "Linux processing - Can't refill\n");
rxq->missed++;
goto err_drop_frame_ret_pool;
}
 
-   frag_size = bm_pool->frag_size;
+   frag_size = bm_pool->hwbm_pool.frag_size;
 
skb = build_skb(data, frag_size > PAGE_SIZE ? 0 : frag_size);
 
diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c 
b/drivers/net/ethernet/marvell/mvneta_bm.c
index 8c968e7d2d8f..01fccec632ec 100644
--- a/drivers/net/ethernet/marvell/mvneta_bm.c
+++ b/drivers/net/ethernet/marvell/mvneta_bm.c
@@ -10,16 +10,17 @@
  * warranty of any kind, whether express or implied.
  */
 
-#include 
+#include 
 #include 
-#include 
-#include 
-#include 
+#include 
+#include 
 #include 
 #include 
-#include 
+#include 
 #include 
-#include 
+#include 
+#include 
+#include 
 #include "mvneta_bm.h"
 
 #define MVNETA_BM_DRIVER_NAME "mvneta_bm"
@@ -88,17 +89,13 @@ static void mvneta_bm_pool_target_set(struct mvneta_bm 
*priv, int pool_id,
mvneta_bm_write(priv, MVNETA_BM_XBAR_POOL_REG(pool_id), val);
 }
 
-/* Allocate skb for BM pool */
-void *mvneta_buf_alloc(struct mvneta_bm *priv, struct mvneta_bm_pool *bm_pool,
-  dma_addr_t *buf_phys_addr)
+int mvneta_bm_construct(struct hwbm_pool *hwbm_pool, void *buf)
 {
-   void *buf;
+   struct mvneta_bm_pool *bm_pool =
+   (struct mvneta_bm_pool *)hwbm_pool->priv;
+   struct mvneta_bm *priv = bm_pool->priv;
dma_addr_t phys_addr;
 
-   buf = mvneta_frag_alloc(bm_pool->frag_size);
-   if (!buf)
-   return NULL;
-
/* In order to update buf_cookie field of RX 

[PATCH v6 net-next 07/10] bus: mvebu-mbus: provide api for obtaining IO and DRAM window information

2016-03-14 Thread Gregory CLEMENT
From: Marcin Wojtas 

This commit enables finding appropriate mbus window and obtaining its
target id and attribute for given physical address in two separate
routines, both for IO and DRAM windows. This functionality
is needed for Armada XP/38x Network Controller's Buffer Manager and
PnC configuration.
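
As an illustration of the lookup described above, here is a user-space
sketch that finds the window containing a physical address. struct
demo_window and demo_find_window() are hypothetical; the real routines
read the window registers instead of walking an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window descriptor holding what the lookup needs. */
struct demo_window {
	uint64_t base;
	uint64_t size;     /* must be non-zero */
	uint8_t target;
	uint8_t attr;
};

/* Return the index of the window containing addr, or -1 if none does.
 * Comparing the offset against size avoids overflow in base + size and
 * treats base + size itself as outside the window. */
static int demo_find_window(const struct demo_window *w, size_t n,
			    uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(w[i].size != 0);
		if (addr >= w[i].base && addr - w[i].base < w[i].size)
			return (int)i;
	}
	return -1;
}
```

In this sketch the last valid byte of a window is base + size - 1, the
same bound the DRAM routine uses.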

[gregory.clem...@free-electrons.com: Fix size test for
mvebu_mbus_get_dram_win_info]

Signed-off-by: Marcin Wojtas 
[DRAM window information reference in LKv3.10]
Signed-off-by: Evan Wang 
Signed-off-by: Gregory CLEMENT 
---
 drivers/bus/mvebu-mbus.c | 52 
 include/linux/mbus.h |  3 +++
 2 files changed, 55 insertions(+)

diff --git a/drivers/bus/mvebu-mbus.c b/drivers/bus/mvebu-mbus.c
index c43c3d2baf73..c2e52864bb03 100644
--- a/drivers/bus/mvebu-mbus.c
+++ b/drivers/bus/mvebu-mbus.c
@@ -948,6 +948,58 @@ void mvebu_mbus_get_pcie_io_aperture(struct resource *res)
*res = mbus_state.pcie_io_aperture;
 }
 
+int mvebu_mbus_get_dram_win_info(phys_addr_t phyaddr, u8 *target, u8 *attr)
+{
+   const struct mbus_dram_target_info *dram;
+   int i;
+
+   /* Get dram info */
+   dram = mv_mbus_dram_info();
+   if (!dram) {
+   pr_err("missing DRAM information\n");
+   return -ENODEV;
+   }
+
+   /* Try to find matching DRAM window for phyaddr */
+   for (i = 0; i < dram->num_cs; i++) {
+   const struct mbus_dram_window *cs = dram->cs + i;
+
+   if (cs->base <= phyaddr &&
+   phyaddr <= (cs->base + cs->size - 1)) {
+   *target = dram->mbus_dram_target_id;
+   *attr = cs->mbus_attr;
+   return 0;
+   }
+   }
+
+   pr_err("invalid dram address 0x%x\n", phyaddr);
+   return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(mvebu_mbus_get_dram_win_info);
+
+int mvebu_mbus_get_io_win_info(phys_addr_t phyaddr, u32 *size, u8 *target,
+  u8 *attr)
+{
+   int win;
+
+   for (win = 0; win < mbus_state.soc->num_wins; win++) {
+   u64 wbase;
+   int enabled;
+
+   mvebu_mbus_read_window(&mbus_state, win, &enabled, &wbase,
+  size, target, attr, NULL);
+
+   if (!enabled)
+   continue;
+
+   if (wbase <= phyaddr && phyaddr <= wbase + *size)
+   return win;
+   }
+
+   return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(mvebu_mbus_get_io_win_info);
+
 static __init int mvebu_mbus_debugfs_init(void)
 {
struct mvebu_mbus_state *s = &mbus_state;
diff --git a/include/linux/mbus.h b/include/linux/mbus.h
index 1f7bc630d225..ea34a867caa0 100644
--- a/include/linux/mbus.h
+++ b/include/linux/mbus.h
@@ -69,6 +69,9 @@ static inline const struct mbus_dram_target_info 
*mv_mbus_dram_info_nooverlap(vo
 int mvebu_mbus_save_cpu_target(u32 *store_addr);
 void mvebu_mbus_get_pcie_mem_aperture(struct resource *res);
 void mvebu_mbus_get_pcie_io_aperture(struct resource *res);
+int mvebu_mbus_get_dram_win_info(phys_addr_t phyaddr, u8 *target, u8 *attr);
+int mvebu_mbus_get_io_win_info(phys_addr_t phyaddr, u32 *size, u8 *target,
+  u8 *attr);
 int mvebu_mbus_add_window_remap_by_id(unsigned int target,
  unsigned int attribute,
  phys_addr_t base, size_t size,
-- 
2.5.0



[PATCH v6 net-next 00/10] API set for HW Buffer management

2016-03-14 Thread Gregory CLEMENT
Hi,

This is the sixth version of the API set for HW Buffer management (that was
initially submitted here:
http://thread.gmane.org/gmane.linux.kernel/2125152).

This version is just a rebase onto the latest net-next. I also added
the Tested-by tag from Sebastian Careba: "The patch set applies
successfully and it works well, no more Samba issues any longer".

For the record, in the previous versions I made the following changes:
v4 -> v5:
- Add a field with the size of the buffers of the pool. It allows fixing
  some misused sizes in the mvneta_bm code when using the new framework.

- Add a new patch from Marcin for sram, allowing non-bufferable access
  to the memory to be requested. It was needed for the hardware buffer
  management of the mvneta.

- Fix the build issue reported by the 0-day builder when building the
  drivers as modules.

v3 -> v4
- Fix build issue when HWBM is not selected

v2 -> v3
- Make a HWBM and a SWBM version of the mvneta_rx() function in order
  to reduce the conditional code. Kept a condition inside
  mvneta_poll because specializing this function would have meant
  duplicating 95% of the code.

- Put back the register_netdev() call at the end of the mvneta_probe()
  function. In order to have a unique ID for each port, just used a
  global variable in the driver.

- Added a fix from Marcin in the "net: mvneta: bm: add support for
  hardware buffer management" patch: "when dropping packets, only
  buffer pointers passed from BM to descriptors have to be returned to
  the pool. In submitted version after closing the port and
  mvneta_rxq_deinit(), it was very likely that a lot of fake buffers
  are added to the pool, because all descriptors took part in
  iteration."

- Removed the select MVNETA_BM from the Kconfig; this leaves users the
  choice of whether or not to use it.

v1 -> v2
- The hardware buffer management helpers are no longer built by default
  and now depend on a hidden config symbol which has to be selected
  by the driver if needed
- The hwbm_pool_refill() and hwbm_pool_add() now receive a gfp_t as
  argument allowing the caller to specify the flag it needs.
- buf_num is now tested to ensure there is no wrapping
- A spinlock has been added to protect the hwbm_pool_add() function in
  SMP or irq context.
- used pr_warn instead of pr_debug in case of errors.
- fixed the mvneta implementation by returning the buffer to the pool
  in various places instead of ignoring it.
- Squashed "bus: mvenus-mbus: Fix size test for
   mvebu_mbus_get_dram_win_info" into bus: mvebu-mbus: provide api for
   obtaining IO and DRAM window information.
- Added my Signed-off-by on all the patches as submitter of the series.
- Renamed the dts patches with the pattern "ARM: dts: platform:"
- Removed the patch "ARM: mvebu: enable SRAM support in
  mvebu_v7_defconfig" of this series and already applied it
- Modified the order of the patches.
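
Two of the v1 -> v2 items (the wrap test on buf_num and the lock around
adding buffers) can be sketched together in user space. This is a model
under assumptions, with a pthread mutex standing in for the spinlock;
struct demo_counted_pool and demo_pool_reserve() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

struct demo_counted_pool {
	pthread_mutex_t lock;
	int size;       /* capacity */
	int buf_num;    /* buffers currently in the pool */
};

/* Try to account for 'count' more buffers. Returns the number added.
 * Every path leaves through the single unlock at the end. */
static int demo_pool_reserve(struct demo_counted_pool *p, unsigned int count)
{
	int added = 0;

	pthread_mutex_lock(&p->lock);
	assert(p->buf_num >= 0 && p->buf_num <= p->size);

	/* Compare against the remaining room instead of summing first,
	 * so the check itself cannot wrap. */
	if (count > (unsigned int)(p->size - p->buf_num))
		goto out;

	p->buf_num += (int)count;
	added = (int)count;
out:
	pthread_mutex_unlock(&p->lock);
	return added;
}
```

Routing the error paths through one label keeps the lock and unlock
calls paired, whichever check fails.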

To ease testing, the branch mvneta-BM-framework-v6 is
available at g...@github.com:MISL-EBU-System-SW/mainline-public.git.

Thanks,

Gregory


Gregory CLEMENT (3):
  ARM: dts: armada-xp-openblocks-ax3-4: Add BM support
  net: add a hardware buffer management helper API
  net: mvneta: Use the new hwbm framework

Marcin Wojtas (7):
  misc: sram: add optional ioremap without write combining
  ARM: dts: armada-38x: add buffer manager nodes
  ARM: dts: armada-38x: enable buffer manager support on Armada 38x
boards
  ARM: dts: armada-xp: add buffer manager nodes
  ARM: dts: armada-xp: enable buffer manager support on Armada XP boards
  bus: mvebu-mbus: provide api for obtaining IO and DRAM window
information
  net: mvneta: bm: add support for hardware buffer management

 .../bindings/net/marvell-armada-370-neta.txt   |  19 +-
 .../devicetree/bindings/net/marvell-neta-bm.txt|  49 ++
 Documentation/devicetree/bindings/sram/sram.txt|   5 +
 arch/arm/boot/dts/armada-385-db-ap.dts |  20 +-
 arch/arm/boot/dts/armada-388-clearfog.dts  |   6 +
 arch/arm/boot/dts/armada-388-db.dts|  17 +-
 arch/arm/boot/dts/armada-388-gp.dts|  17 +-
 .../arm/boot/dts/armada-38x-solidrun-microsom.dtsi |  15 +-
 arch/arm/boot/dts/armada-38x.dtsi  |  19 +
 arch/arm/boot/dts/armada-xp-db.dts |  19 +-
 arch/arm/boot/dts/armada-xp-gp.dts |  19 +-
 arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts   |  19 +-
 arch/arm/boot/dts/armada-xp.dtsi   |  19 +
 drivers/bus/mvebu-mbus.c   |  52 +++
 drivers/misc/sram.c|   5 +-
 drivers/net/ethernet/marvell/Kconfig   |  14 +
 drivers/net/ethernet/marvell/Makefile  |   1 +
 drivers/net/ethernet/marvell/mvneta.c  | 509 +++--
 drivers/net/ethernet/marvell/mvneta_bm.c   | 487 
 drivers/net/ethernet/marvell/mvneta_bm.h   | 182 
 

Re: [PATCHv3 (net.git) 2/2] stmmac: fix MDIO settings

2016-03-14 Thread Gabriel Fernandez
Hi Peppe,

Just one remark below

> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c 
> b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> index 6a52fa1..d2322e9 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c

[snip]

> +static bool stmmac_dt_phy(struct plat_stmmacenet_data *plat,
> + struct device_node *np, struct device *dev)
> +{
> +   bool ret = true;
> +
> +   /* If phy-handle property is passed from DT, use it as the PHY */
> +   plat->phy_node = of_parse_phandle(np, "phy-handle", 0);
> +   if (plat->phy_node)
> +   dev_dbg(dev, "Found phy-handle subnode\n");
> +
> +   /* If phy-handle is not specified, check if we have a fixed-phy */
> +   if (!plat->phy_node && of_phy_is_fixed_link(np)) {
> +   if ((of_phy_register_fixed_link(np) < 0))
> +   return -ENODEV;
> +
The stmmac_dt_phy() function is declared bool, so it should return a
Boolean here rather than -ENODEV.
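
To show why this matters, here is a small stand-alone sketch. The
function names are hypothetical; the second function is just one possible
way to keep both the error code and the Boolean result, not a proposal
for the actual fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the problem: any non-zero value, including a negative errno,
 * converts to true when returned from a bool function. */
static bool demo_returns_errno_as_bool(void)
{
	return -ENODEV;
}

/* One way to keep both pieces of information: an int status plus a
 * bool out-parameter. */
static int demo_lookup(bool registration_ok, bool *found)
{
	assert(found);
	if (!registration_ok)
		return -ENODEV;
	*found = true;
	return 0;
}
```

So the caller would see "true" on the failure path, which is unlikely to
be what was intended.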


Best Regards.

Gabriel


[PATCH iproute2 net-next v5] bridge: mdb: add support for extended router port information

2016-03-14 Thread Nikolay Aleksandrov
Recently a new temp router port mode was added and with it the dumped
information was extended similar to how mdb entries were done. This
patch adds support to dump the new information by using the "-s" switch.
Example:
$ bridge -d -s mdb show
dev br0 port eth1 grp ff02::1:ffbf:5716 temp 234.39
dev br0 port eth1 grp 239.0.0.2 temp  97.17
dev br0 port eth1 grp 239.0.0.3 temp 105.36
router ports on br0: eth1    0.00 permanent
router ports on br0: eth2  254.87 temp

It also updates the bridge man page.
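
For reference, the timer column in the example output can be reproduced
with a small stand-alone sketch. It assumes the fractional part is printed
in hundredths of a second, as the sample output suggests, and takes the
tick rate as a parameter; demo_format_timer() is a hypothetical helper,
not an iproute2 function.

```c
#include <assert.h>
#include <stdio.h>

/* Format a timer given in ticks as " SSSS.hh" (seconds and hundredths).
 * ticks_per_sec is an assumption of this sketch; iproute2 obtains the
 * real rate at run time. */
static int demo_format_timer(char *out, size_t len, unsigned int ticks,
			     unsigned int ticks_per_sec)
{
	unsigned int sec, usec;

	assert(ticks_per_sec > 0 && ticks_per_sec <= 1000000);
	sec = ticks / ticks_per_sec;
	usec = (ticks % ticks_per_sec) * (1000000 / ticks_per_sec);
	/* usec / 10000 yields hundredths of a second, 0..99 */
	return snprintf(out, len, " %4i.%.2i", (int)sec, (int)(usec / 10000));
}
```

The leading space plus the 4-wide seconds field is what keeps the port
name and the timer in separate columns.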

Signed-off-by: Nikolay Aleksandrov 
---
v5: pull router port stats in their own function, rebase and retest
v4: minor optimization, parse new attributes only when show_stats has been
specified
v3: make is_temp_mcast_rtr bool and return the result directly as per
Stephen's comment
v2: print the values only if they were provided by the kernel, otherwise
we might print "permanent" for a temporary entry on an older kernel.

 bridge/br_common.h |  3 +++
 bridge/mdb.c   | 53 -
 man/man8/bridge.8  |  6 +-
 3 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/bridge/br_common.h b/bridge/br_common.h
index 41eb0dc38293..5ea45c9e654d 100644
--- a/bridge/br_common.h
+++ b/bridge/br_common.h
@@ -1,6 +1,9 @@
 #define MDB_RTA(r) \
((struct rtattr *)(((char *)(r)) + RTA_ALIGN(sizeof(struct br_mdb_entry))))
 
+#define MDB_RTR_RTA(r) \
+   ((struct rtattr *)(((char *)(r)) + RTA_ALIGN(sizeof(__u32))))
+
 extern int print_linkinfo(const struct sockaddr_nl *who,
  struct nlmsghdr *n,
  void *arg);
diff --git a/bridge/mdb.c b/bridge/mdb.c
index 600596c94969..97da4dc98f23 100644
--- a/bridge/mdb.c
+++ b/bridge/mdb.c
@@ -33,19 +33,56 @@ static void usage(void)
exit(-1);
 }
 
-static void br_print_router_ports(FILE *f, struct rtattr *attr)
+static bool is_temp_mcast_rtr(__u8 type)
+{
+   return type == MDB_RTR_TYPE_TEMP_QUERY || type == MDB_RTR_TYPE_TEMP;
+}
+
+static void __print_router_port_stats(FILE *f, struct rtattr *pattr)
+{
+   struct rtattr *tb[MDBA_ROUTER_PATTR_MAX + 1];
+   struct timeval tv;
+   __u8 type;
+
+   parse_rtattr(tb, MDBA_ROUTER_PATTR_MAX, MDB_RTR_RTA(RTA_DATA(pattr)),
+RTA_PAYLOAD(pattr) - RTA_ALIGN(sizeof(uint32_t)));
+   if (tb[MDBA_ROUTER_PATTR_TIMER]) {
+   __jiffies_to_tv(&tv,
+   rta_getattr_u32(tb[MDBA_ROUTER_PATTR_TIMER]));
+   fprintf(f, " %4i.%.2i",
+   (int)tv.tv_sec, (int)tv.tv_usec/10000);
+   }
+   if (tb[MDBA_ROUTER_PATTR_TYPE]) {
+   type = rta_getattr_u8(tb[MDBA_ROUTER_PATTR_TYPE]);
+   fprintf(f, " %s",
+   is_temp_mcast_rtr(type) ? "temp" : "permanent");
+   }
+}
+
+static void br_print_router_ports(FILE *f, struct rtattr *attr, __u32 brifidx)
 {
uint32_t *port_ifindex;
struct rtattr *i;
int rem;
 
+   if (!show_stats)
+   fprintf(f, "router ports on %s: ", ll_index_to_name(brifidx));
+
rem = RTA_PAYLOAD(attr);
for (i = RTA_DATA(attr); RTA_OK(i, rem); i = RTA_NEXT(i, rem)) {
port_ifindex = RTA_DATA(i);
-   fprintf(f, "%s ", ll_index_to_name(*port_ifindex));
+   if (show_stats) {
+   fprintf(f, "router ports on %s: %s",
+   ll_index_to_name(brifidx),
+   ll_index_to_name(*port_ifindex));
+   __print_router_port_stats(f, i);
+   fprintf(f, "\n");
+   } else {
+   fprintf(f, "%s ", ll_index_to_name(*port_ifindex));
+   }
}
-
-   fprintf(f, "\n");
+   if (!show_stats)
+   fprintf(f, "\n");
 }
 
 static void print_mdb_entry(FILE *f, int ifindex, struct br_mdb_entry *e,
@@ -127,11 +164,9 @@ int print_mdb(const struct sockaddr_nl *who, struct 
nlmsghdr *n, void *arg)
 
if (tb[MDBA_ROUTER]) {
if (n->nlmsg_type == RTM_GETMDB) {
-   if (show_details) {
-   fprintf(fp, "router ports on %s: ",
-   ll_index_to_name(r->ifindex));
-   br_print_router_ports(fp, tb[MDBA_ROUTER]);
-   }
+   if (show_details)
+   br_print_router_ports(fp, tb[MDBA_ROUTER],
+ r->ifindex);
} else {
uint32_t *port_ifindex;
 
diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
index 0e98edf4762f..08e8a5bf5a08 100644
--- a/man/man8/bridge.8
+++ b/man/man8/bridge.8
@@ -119,6 +119,10 @@ is given multiple times, the amount of information 
increases.
 As a rule, the information is statistics or some time values.
 
 .TP

[PATCH v2 1/2] net: thunderx: Set receive buffer page usage count in bulk

2016-03-14 Thread sunil . kovvuri
From: Sunil Goutham 

Instead of calling get_page() for every receive buffer carved out
of a page, set the page's usage count at the end, to reduce the number of
atomic calls.
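
The idea can be sketched in user space with C11 atomics. This is only a
model: struct demo_page, struct demo_rx_state, demo_note_buffer() and
demo_flush_refs() are hypothetical names, and an atomic_int stands in for
the page reference count.

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_page {
	atomic_int refcount;
};

struct demo_rx_state {
	struct demo_page *page;   /* page currently being carved up */
	unsigned int pending;     /* references not yet added to the page */
};

/* Record one more buffer carved from the current page; no atomic op. */
static void demo_note_buffer(struct demo_rx_state *s)
{
	assert(s->page);
	s->pending++;
}

/* Apply all pending references with a single atomic add. */
static void demo_flush_refs(struct demo_rx_state *s)
{
	if (!s->pending || !s->page)
		return;
	atomic_fetch_add(&s->page->refcount, (int)s->pending);
	s->pending = 0;
}
```

The flush has to run before the page pointer is replaced and before the
buffers become visible to the consumer, otherwise the count would be
too low for a while.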

Signed-off-by: Sunil Goutham 
---
 drivers/net/ethernet/cavium/thunder/nic.h  |1 +
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |   31 ++-
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h 
b/drivers/net/ethernet/cavium/thunder/nic.h
index 092f097..872b22d 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -294,6 +294,7 @@ struct nicvf {
u32 speed;
struct page *rb_page;
u32 rb_page_offset;
+   u16 rb_pageref;
boolrb_alloc_fail;
boolrb_work_scheduled;
struct delayed_work rbdr_work;
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c 
b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index 0dd1abf..fa05e34 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -18,6 +18,15 @@
 #include "q_struct.h"
 #include "nicvf_queues.h"
 
+static void nicvf_get_page(struct nicvf *nic)
+{
+   if (!nic->rb_pageref || !nic->rb_page)
+   return;
+
+   atomic_add(nic->rb_pageref, &nic->rb_page->_count);
+   nic->rb_pageref = 0;
+}
+
 /* Poll a register for a specific value */
 static int nicvf_poll_reg(struct nicvf *nic, int qidx,
  u64 reg, int bit_pos, int bits, int val)
@@ -81,16 +90,15 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, 
gfp_t gfp,
int order = (PAGE_SIZE <= 4096) ?  PAGE_ALLOC_COSTLY_ORDER : 0;
 
/* Check if request can be accomodated in previous allocated page */
-   if (nic->rb_page) {
-   if ((nic->rb_page_offset + buf_len + buf_len) >
-   (PAGE_SIZE << order)) {
-   nic->rb_page = NULL;
-   } else {
-   nic->rb_page_offset += buf_len;
-   get_page(nic->rb_page);
-   }
+   if (nic->rb_page &&
+   ((nic->rb_page_offset + buf_len) < (PAGE_SIZE << order))) {
+   nic->rb_pageref++;
+   goto ret;
}
 
+   nicvf_get_page(nic);
+   nic->rb_page = NULL;
+
/* Allocate a new page */
if (!nic->rb_page) {
nic->rb_page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN,
@@ -102,7 +110,9 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, 
gfp_t gfp,
nic->rb_page_offset = 0;
}
 
+ret:
*rbuf = (u64 *)((u64)page_address(nic->rb_page) + nic->rb_page_offset);
+   nic->rb_page_offset += buf_len;
 
return 0;
 }
@@ -158,6 +168,9 @@ static int  nicvf_init_rbdr(struct nicvf *nic, struct rbdr 
*rbdr,
desc = GET_RBDR_DESC(rbdr, idx);
desc->buf_addr = virt_to_phys(rbuf) >> NICVF_RCV_BUF_ALIGN;
}
+
+   nicvf_get_page(nic);
+
return 0;
 }
 
@@ -241,6 +254,8 @@ refill:
new_rb++;
}
 
+   nicvf_get_page(nic);
+
/* make sure all memory stores are done before ringing doorbell */
smp_wmb();
 
-- 
1.7.1
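
The batching idea in the patch above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: `demo_page`, `demo_rb_state`, `demo_take_buffer()` and `demo_flush_refs()` are hypothetical names, and C11 `<stdatomic.h>` stands in for the kernel's atomic_add() on the page count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the reference count matters. */
struct demo_page {
	atomic_int refcount;
};

/* Hypothetical stand-in for the receive-buffer fields of struct nicvf. */
struct demo_rb_state {
	struct demo_page *page;     /* page currently being carved up */
	unsigned int pending_refs;  /* references counted but not yet published */
};

/* Count one more buffer carved out of the current page; no atomic op here. */
static void demo_take_buffer(struct demo_rb_state *st)
{
	assert(st->page != NULL);
	st->pending_refs++;
}

/* Publish all deferred references with a single atomic add. */
static void demo_flush_refs(struct demo_rb_state *st)
{
	if (!st->pending_refs || !st->page)
		return;

	atomic_fetch_add(&st->page->refcount, (int)st->pending_refs);
	st->pending_refs = 0;
}
```

A caller would invoke demo_take_buffer() once per buffer and demo_flush_refs() once per refill batch, or before switching to a new page, so the number of atomic operations tracks batches rather than buffers.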



[PATCH v2 2/2] net: thunderx: Adjust nicvf structure to reduce cache misses

2016-03-14 Thread sunil . kovvuri
From: Sunil Goutham 

Adjust the nicvf structure so that all elements used in hot paths
such as napi and xmit fall into the same cache line. This reduces
the number of cache misses and results in a ~2% increase in the
number of packets handled per core.

Also convert elements declared with the :1 bitfield notation to bool,
to be consistent with the other element definitions.

Signed-off-by: Sunil Goutham 
---
 drivers/net/ethernet/cavium/thunder/nic.h |   52 
 1 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nic.h 
b/drivers/net/ethernet/cavium/thunder/nic.h
index 872b22d..83025bb 100644
--- a/drivers/net/ethernet/cavium/thunder/nic.h
+++ b/drivers/net/ethernet/cavium/thunder/nic.h
@@ -272,46 +272,54 @@ struct nicvf {
struct nicvf*pnicvf;
struct net_device   *netdev;
struct pci_dev  *pdev;
+   void __iomem*reg_base;
+   struct queue_set*qs;
+   struct nicvf_cq_poll*napi[8];
u8  vf_id;
-   u8  node;
-   u8  tns_mode:1;
-   u8  sqs_mode:1;
-   u8  loopback_supported:1;
+   u8  sqs_id;
+   boolsqs_mode;
boolhw_tso;
-   u16 mtu;
-   struct queue_set*qs;
+
+   /* Receive buffer alloc */
+   u32 rb_page_offset;
+   u16 rb_pageref;
+   boolrb_alloc_fail;
+   boolrb_work_scheduled;
+   struct page *rb_page;
+   struct delayed_work rbdr_work;
+   struct tasklet_struct   rbdr_task;
+
+   /* Secondary Qset */
+   u8  sqs_count;
 #defineMAX_SQS_PER_VF_SINGLE_NODE  5
 #defineMAX_SQS_PER_VF  11
-   u8  sqs_id;
-   u8  sqs_count; /* Secondary Qset count */
struct nicvf*snicvf[MAX_SQS_PER_VF];
+
+   /* Queue count */
u8  rx_queues;
u8  tx_queues;
u8  max_queues;
-   void __iomem*reg_base;
+
+   u8  node;
+   u8  cpi_alg;
+   u16 mtu;
boollink_up;
u8  duplex;
u32 speed;
-   struct page *rb_page;
-   u32 rb_page_offset;
-   u16 rb_pageref;
-   boolrb_alloc_fail;
-   boolrb_work_scheduled;
-   struct delayed_work rbdr_work;
-   struct tasklet_struct   rbdr_task;
-   struct tasklet_struct   qs_err_task;
-   struct tasklet_struct   cq_task;
-   struct nicvf_cq_poll*napi[8];
+   booltns_mode;
+   boolloopback_supported;
struct nicvf_rss_info   rss_info;
-   u8  cpi_alg;
+   struct tasklet_struct   qs_err_task;
+   struct work_struct  reset_task;
+
/* Interrupt coalescing settings */
u32 cq_coalesce_usecs;
-
u32 msg_enable;
+
+   /* Stats */
struct nicvf_hw_stats   hw_stats;
struct nicvf_drv_stats  drv_stats;
struct bgx_statsbgx_stats;
-   struct work_struct  reset_task;
 
/* MSI-X  */
boolmsix_enabled;
-- 
1.7.1
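
One way to check a layout like the one above is to compare field offsets against the cache line size. The sketch below is not taken from the driver: `demo_nic` is a hypothetical, much smaller structure, the 64-byte line size is an assumption, and the check only holds if the structure itself starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical structure: hot-path fields first, rarely used fields last. */
struct demo_nic {
	/* hot path */
	void *reg_base;
	void *qs;
	unsigned char vf_id;
	unsigned char sqs_id;
	/* cold path */
	char name[128];
	unsigned int msg_enable;
};

/* Index of the cache line that holds a given byte offset. */
static size_t demo_line_of(size_t offset)
{
	return offset / DEMO_CACHE_LINE;
}

/* Non-zero if both offsets land in the same cache line. */
static int demo_same_line(size_t off_a, size_t off_b)
{
	return demo_line_of(off_a) == demo_line_of(off_b);
}

/* Fail the build if the last hot field leaves the first cache line. */
static_assert(offsetof(struct demo_nic, sqs_id) < DEMO_CACHE_LINE,
	      "hot fields must fit in the first cache line");
```

For kernel structures, a tool such as pahole reports the same information from the debug info without touching the source.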



Re: [PATCH v3 0/8] arm64: rockchip: Initial GeekBox enablement

2016-03-14 Thread Tomeu Vizoso
On 11 March 2016 at 10:09, Giuseppe CAVALLARO <peppe.cavall...@st.com> wrote:
> On 3/10/2016 5:47 PM, Dinh Nguyen wrote:
>>
>> On Thu, Mar 10, 2016 at 3:13 AM, Giuseppe CAVALLARO
>> <peppe.cavall...@st.com> wrote:
>>>
>>> On 3/9/2016 5:31 PM, Dinh Nguyen wrote:
>>>>
>>>>
>>>> On Wed, Mar 9, 2016 at 8:53 AM, Giuseppe CAVALLARO
>>>> <peppe.cavall...@st.com> wrote:
>>>>>
>>>>>
>>>>> Hi Tomeu, Dinh, Andreas
>>>>>
>>>>> I need a sum and help from you to go ahead on the
>>>>> tx timeout.
>>>>>
>>>>> The "stmmac: MDIO fixes" seems to be the candidate to
>>>>> fix the phy connection and I will send the V2 asap (Andreas' comment).
>>>>>
>>>>> So, supposing the probe is ok and phy is connected,
>>>>> I need your input ...
>>>>>
>>>>>Tomeu: after reverting the 0e80bdc9a72d (stmmac: first frame
>>>>>   prep at the end of xmit routine) the network is
>>>>>   not stable and there is a timeout after a while.
>>>>>   The box has 3.50 with normal desc settings.
>>>>>
>>>>>Dinh: the network is ok, I wonder if you can share a boot
>>>>>  log just to understand if the normal or enhanced
>>>>>  descriptors are used.
>>>>>
>>>>
>>>> Here it is:
>>>
>>>
>>> ...
>>>>
>>>>
>>>> [0.850523] stmmac - user ID: 0x10, Synopsys ID: 0x37
>>>> [0.855570]  Ring mode enabled
>>>> [0.858611]  DMA HW capability register supported
>>>> [0.863128]  Enhanced/Alternate descriptors
>>>> [0.867482]  Enabled extended descriptors
>>>> [0.871482]  RX Checksum Offload Engine supported (type 2)
>>>> [0.876948]  TX Checksum insertion supported
>>>> [0.881204]  Enable RX Mitigation via HW Watchdog Timer
>>>> [0.886863] socfpga-dwmac ff702000.ethernet eth0: No MDIO subnode
>>>> found
>>>> [0.899090] libphy: stmmac: probed
>>>> [0.902484] eth0: PHY ID 00221611 at 4 IRQ POLL (stmmac-0:04) active
>>>
>>>
>>>
>>> Thx Dinh, so you are using the Enhanced/Alternate descriptors
>>> I am debugging on my side on a setup with normal descriptors, I let you
>>> know
>>>
>>
>> Doesn't the printout "Enhanced/Alternate descriptors"  mean that I'm using
>> Enhanced/Alternate descriptors?
>
>
> yes this means that you have the Databook 3.70a and, from the HW
> capability register, the driver will use the Enhanced/Alternate
> descriptors. This is the same HW I am using on my side where the
> stmmac is working fine.
>
> In the case where it is failing on net-next, although on Databook 3.50a,
> the HW  capability register says that there is no enhanced descriptors
> and the driver uses the normal ones.
>
> Tomeu, I kindly ask you to try the patch attached. I found a bug on Tx
> path for normal descriptors. Please let me know if this help.
> Also let me know if we actually need to revert the 0e80bdc9a72d.

Hi Peppe,

with that patch I don't see any difference at all in my setup.

So to be clear, with these commits on top of next-20160314, I still
get the hang during boot:

209afef6f0cd ARM: dts: rockchip: Add mdio node to ethernet node
2315acc6cf7f Revert "stmmac: first frame prep at the end of xmit routine"
b5e08e810c63 stmmac: fix tx prepare for normal desc
37c15a31d850 i2c: immediately mark ourselves as registered
4342eec3c5a2 Add linux-next specific files for 20160314

[   27.521026] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:303
dev_watchdog+0x284/0x288
[   27.529460] NETDEV WATCHDOG: eth0 (rk_gmac-dwmac): transmit queue 0 timed out

https://git.collabora.com/cgit/user/tomeu/linux.git/log/?h=broken-eth-on-rock2

> I am trying to find some HW where I can test the normal descriptors to
> speed up the tests on my side directly.

Maybe get your tree in kernelci.org? I'm not sure if it's currently
doing any nfsroot boots, though.

Regards,

Tomeu

> Let me know and thx in advance.
>
> Regards,
> Peppe
>
>>
>> Dinh
>>
>


[PATCH] netfilter: nf_conntrack: consolidate lock/unlock into unlock_wait

2016-03-14 Thread Nicholas Mc Guire
The spin_lock()/spin_unlock() pair only synchronizes on
nf_conntrack_locks_all_lock, which is equivalent to
spin_unlock_wait(), but the latter should be more efficient.

Signed-off-by: Nicholas Mc Guire 
---

Patch was compile tested with: x86_64_defconfig (implies CONFIG_NETFILTER=y)
Simple run test on x86_64 with a few trivial ipfilter rules active.

Patch is against linux-next (localversion-next is next-20160311)

 net/netfilter/nf_conntrack_core.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index f60b4fd..afde5f5 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -74,8 +74,7 @@ void nf_conntrack_lock(spinlock_t *lock) __acquires(lock)
spin_lock(lock);
while (unlikely(nf_conntrack_locks_all)) {
spin_unlock(lock);
-   spin_lock(&nf_conntrack_locks_all_lock);
-   spin_unlock(&nf_conntrack_locks_all_lock);
+   spin_unlock_wait(&nf_conntrack_locks_all_lock);
spin_lock(lock);
}
 }
@@ -121,8 +120,7 @@ static void nf_conntrack_all_lock(void)
nf_conntrack_locks_all = true;
 
for (i = 0; i < CONNTRACK_LOCKS; i++) {
-   spin_lock(&nf_conntrack_locks[i]);
-   spin_unlock(&nf_conntrack_locks[i]);
+   spin_unlock_wait(&nf_conntrack_locks[i]);
}
 }
 
-- 
2.1.4



Re: Generic TSO

2016-03-14 Thread Edward Cree
On 14/03/16 10:26, Edward Cree wrote:
> On 12/03/16 05:40, Alexander Duyck wrote:
>> Well that is the thing.  Before we can actually start tinkering with
>> the outer header we probably need to make sure we set the DF bit and
>> that it would be honored on the outer headers for IPv4.  I don't
>> believe any of the tunnels are currently doing that so repeating the
>> IP ID would be the worst possible scenario until that is resolved
>> since VXLAN tunneled frames can be fragmented while TCP frames cannot
>> so we really shouldn't be repeating IP IDs for the outer headers.
> So how do we progress with that?  I'm presuming it's not as simple as
> just patching the tunnel drivers to set DF if the inner packet has it,
> as that could break existing setups.  (I've heard that "but they're
> already broken anyway" is not usually an acceptable argument.)  Some
> sort of configuration option on the tunnel (like we do with udpcsum)?
...and immediately I find out it already exists.  (I guess I should have
looked there first!)
From drivers/net/vxlan.c:2001:
> else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
> df = htons(IP_DF);

-Ed


Re: Generic TSO

2016-03-14 Thread Edward Cree
On 12/03/16 05:40, Alexander Duyck wrote:
> Well that is the thing.  Before we can actually start tinkering with
> the outer header we probably need to make sure we set the DF bit and
> that it would be honored on the outer headers for IPv4.  I don't
> believe any of the tunnels are currently doing that so repeating the
> IP ID would be the worst possible scenario until that is resolved
> since VXLAN tunneled frames can be fragmented while TCP frames cannot
> so we really shouldn't be repeating IP IDs for the outer headers.
So how do we progress with that?  I'm presuming it's not as simple as
just patching the tunnel drivers to set DF if the inner packet has it,
as that could break existing setups.  (I've heard that "but they're
already broken anyway" is not usually an acceptable argument.)  Some
sort of configuration option on the tunnel (like we do with udpcsum)?

Fortunately, with the design I'm currently planning on, a tunnel
driver could just set a flag in the SKB to say "unsafe for generic-
TSO", and we'd just send out the first segment normally and fall
back to regular software segmentation.

-Ed


Re: [PATCH v6 net-next 09/10] net: add a hardware buffer management helper API

2016-03-14 Thread Jesper Dangaard Brouer

I've not fully understood the hardware support part.

But I do think this generalization is very interesting work, and I would
like to cooperate, if my use-case can fit into this. My use-case is in
the extreme 100Gbit/s area.

There is some potential for performance improvements if the API is
designed from the start to distinguish between being called from NAPI
context (BH disabled) and outside NAPI context.

See: netdev_alloc_frag() vs napi_alloc_frag().

Nitpicks inlined below...


On Mon, 14 Mar 2016 09:39:04 +0100
Gregory CLEMENT  wrote:

> This basic implementation allows sharing code between drivers using
> hardware buffer management. As the code is hardware agnostic, there are
> few helpers; most of the optimization brought by a HW BM has to be
> done at driver level.
> 
> Tested-by: Sebastian Careba 
> Signed-off-by: Gregory CLEMENT 
> ---
>  include/net/hwbm.h | 28 ++
>  net/Kconfig|  3 ++
>  net/core/Makefile  |  1 +
>  net/core/hwbm.c| 87 
> ++
>  4 files changed, 119 insertions(+)
>  create mode 100644 include/net/hwbm.h
>  create mode 100644 net/core/hwbm.c
> 
> diff --git a/include/net/hwbm.h b/include/net/hwbm.h
> new file mode 100644
> index ..47d08662501b
> --- /dev/null
> +++ b/include/net/hwbm.h
> @@ -0,0 +1,28 @@
> +#ifndef _HWBM_H
> +#define _HWBM_H
> +
> +struct hwbm_pool {
> + /* Capacity of the pool */
> + int size;
> + /* Size of the buffers managed */
> + int frag_size;
> + /* Number of buffers currently used by this pool */
> + int buf_num;
> + /* constructor called during alocation */

alocation -> allocation

> + int (*construct)(struct hwbm_pool *bm_pool, void *buf);
> + /* protect acces to the buffer counter*/

acces -> access
Space after "counter"

> + spinlock_t lock;
> + /* private data */
> + void *priv;
> +};
> +#ifdef CONFIG_HWBM
> +void hwbm_buf_free(struct hwbm_pool *bm_pool, void *buf);
> +int hwbm_pool_refill(struct hwbm_pool *bm_pool, gfp_t gfp);
> +int hwbm_pool_add(struct hwbm_pool *bm_pool, unsigned int buf_num, gfp_t 
> gfp);
> +#else
> +void hwbm_buf_free(struct hwbm_pool *bm_pool, void *buf) {}
> +int hwbm_pool_refill(struct hwbm_pool *bm_pool, gfp_t gfp) { return 0; }
> +int hwbm_pool_add(struct hwbm_pool *bm_pool, unsigned int buf_num, gfp_t gfp)
> +{ return 0; }
> +#endif /* CONFIG_HWBM */
> +#endif /* _HWBM_H */
> diff --git a/net/Kconfig b/net/Kconfig
> index 10640d5f8bee..e13449870d06 100644
> --- a/net/Kconfig
> +++ b/net/Kconfig
> @@ -253,6 +253,9 @@ config XPS
>   depends on SMP
>   default y
>  
> +config HWBM
> +   bool
> +
>  config SOCK_CGROUP_DATA
>   bool
>   default n
> diff --git a/net/core/Makefile b/net/core/Makefile
> index 014422e2561f..d6508c2ddca5 100644
> --- a/net/core/Makefile
> +++ b/net/core/Makefile
> @@ -25,4 +25,5 @@ obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o
>  obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o
>  obj-$(CONFIG_LWTUNNEL) += lwtunnel.o
>  obj-$(CONFIG_DST_CACHE) += dst_cache.o
> +obj-$(CONFIG_HWBM) += hwbm.o
>  obj-$(CONFIG_NET_DEVLINK) += devlink.o
> diff --git a/net/core/hwbm.c b/net/core/hwbm.c
> new file mode 100644
> index ..941c28486896
> --- /dev/null
> +++ b/net/core/hwbm.c
> @@ -0,0 +1,87 @@
> +/* Support for hardware buffer manager.
> + *
> + * Copyright (C) 2016 Marvell
> + *
> + * Gregory CLEMENT 
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +void hwbm_buf_free(struct hwbm_pool *bm_pool, void *buf)
> +{
> + if (likely(bm_pool->frag_size <= PAGE_SIZE))
> + skb_free_frag(buf);
> + else
> + kfree(buf);
> +}
> +EXPORT_SYMBOL_GPL(hwbm_buf_free);
> +
> +/* Refill processing for HW buffer management */
> +int hwbm_pool_refill(struct hwbm_pool *bm_pool, gfp_t gfp)
> +{
> + int frag_size = bm_pool->frag_size;
> + void *buf;
> +
> + if (likely(frag_size <= PAGE_SIZE))
> + buf = netdev_alloc_frag(frag_size);

If we know the NAPI-context, there is a performance potential in
netdev_alloc_frag() vs napi_alloc_frag().

> + else
> + buf = kmalloc(frag_size, gfp);
> +
> + if (!buf)
> + return -ENOMEM;
> +
> + if (bm_pool->construct)
> + if (bm_pool->construct(bm_pool, buf)) {
> + hwbm_buf_free(bm_pool, buf);
> + return -ENOMEM;
> + }

Why don't we refill more objects? and do so with a bulk of memory
objects?  The "refill" name just led me to believe that the 

[PATCH v2 0/2] net: thunderx: Performance enhancement changes

2016-03-14 Thread sunil . kovvuri
From: Sunil Goutham 

Below patches attempts to improve performance by reducing
no of atomic operations while allocating new receive buffers
and reducing cache misses by adjusting nicvf structure elements.

Changes from v1:
 No changes, resubmitting afresh as per David's suggestion.

Sunil Goutham (2):
  net: thunderx: Set receive buffer page usage count in bulk
  net: thunderx: Adjust nicvf structure to reduce cache misses

 drivers/net/ethernet/cavium/thunder/nic.h  |   51 
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c |   31 +---
 2 files changed, 53 insertions(+), 29 deletions(-)



Re: [PATCH 1/2 net v3.16]r8169: Not enable/disable bus mastering when is enabled on BIOS

2016-03-14 Thread Michal Kubecek
On Sun, Mar 13, 2016 at 11:32:48AM +0200, Corcodel Marian wrote:
> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> b/drivers/net/ethernet/realtek/r8169.c
> index 02aec96..ec555e7 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -754,6 +754,7 @@ struct rtl8169_private {
>   struct timer_list timer;
>   u16 cp_cmd;
>   bool pcie;
> + bool bios_support;

You shouldn't base your patches on earlier patches that haven't been
accepted yet (unless they are part of the same series).

Michal Kubecek



[PATCH 1/1] net: Fix use after free in the recvmmsg exit path

2016-03-14 Thread Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo 

The syzkaller fuzzer hit the following use-after-free:

  Call Trace:
   [] __asan_report_load8_noabort+0x3e/0x40 
mm/kasan/report.c:295
   [] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
   [< inline >] SYSC_recvmmsg net/socket.c:2281
   [] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
   [] entry_SYSCALL_64_fastpath+0x16/0x7a
  arch/x86/entry/entry_64.S:185

And, as Dmitry rightly assessed, that is because we can drop the
reference and then touch it: when the underlying recvmsg calls return
some packets and then hit an error, recvmmsg sets sock->sk->sk_err
after the reference has been dropped. Oops, fix it.

Reported-and-Tested-by: Dmitry Vyukov 
Cc: Alexander Potapenko 
Cc: Eric Dumazet 
Cc: Kostya Serebryany 
Cc: Sasha Levin 
Fixes: a2e2725541fa ("net: Introduce recvmmsg socket syscall")
http://lkml.kernel.org/r/20160122211644.gc2...@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 net/socket.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/net/socket.c b/net/socket.c
index c044d1e8508c..db13ae893dce 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2240,31 +2240,31 @@ int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, 
unsigned int vlen,
cond_resched();
}
 
-out_put:
-   fput_light(sock->file, fput_needed);
-
if (err == 0)
-   return datagrams;
+   goto out_put;
 
-   if (datagrams != 0) {
+   if (datagrams == 0) {
+   datagrams = err;
+   goto out_put;
+   }
+
+   /*
+* We may return less entries than requested (vlen) if the
+* sock is non block and there aren't enough datagrams...
+*/
+   if (err != -EAGAIN) {
/*
-* We may return less entries than requested (vlen) if the
-* sock is non block and there aren't enough datagrams...
+* ... or  if recvmsg returns an error after we
+* received some datagrams, where we record the
+* error to return on the next call or if the
+* app asks about it using getsockopt(SO_ERROR).
 */
-   if (err != -EAGAIN) {
-   /*
-* ... or  if recvmsg returns an error after we
-* received some datagrams, where we record the
-* error to return on the next call or if the
-* app asks about it using getsockopt(SO_ERROR).
-*/
-   sock->sk->sk_err = -err;
-   }
-
-   return datagrams;
+   sock->sk->sk_err = -err;
}
+out_put:
+   fput_light(sock->file, fput_needed);
 
-   return err;
+   return datagrams;
 }
 
 SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg,
-- 
2.5.0
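
The exit-path decisions in the patch above can be modelled as a small pure function. This is a hedged sketch, not the kernel code: `demo_result` and `demo_recvmmsg_exit()` are hypothetical names, and the socket is reduced to a single "recorded error" field. In the real function the important point is that all of this happens before fput_light() drops the reference.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical summary of what the exit path decides. */
struct demo_result {
	int ret;           /* value handed back to the caller */
	int recorded_err;  /* positive errno stored on the socket, or 0 */
};

/*
 * datagrams: number of messages received so far (>= 0)
 * err:       0 or a negative errno from the last recvmsg call
 */
static struct demo_result demo_recvmmsg_exit(int datagrams, int err)
{
	struct demo_result r = { datagrams, 0 };

	assert(datagrams >= 0 && err <= 0);

	/* No error: report how many datagrams were received. */
	if (err == 0)
		return r;

	/* Error before anything was received: report the error itself. */
	if (datagrams == 0) {
		r.ret = err;
		return r;
	}

	/*
	 * Error after some datagrams: report the datagrams now and, unless
	 * the error is just -EAGAIN, remember it for the next call.
	 */
	if (err != -EAGAIN)
		r.recorded_err = -err;

	return r;
}
```

Keeping the decision logic free of side effects makes it easy to see that every branch can run while the socket reference is still held.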



[PATCH 0/1] recvmmsg use-after-free fix

2016-03-14 Thread Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo 

Hi David,

Please consider applying,

- Arnaldo

Arnaldo Carvalho de Melo (1):
  net: Fix use after free in the recvmmsg exit path

 net/socket.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

-- 
2.5.0



[PATCH 3/5] ethtool: introduce new ioctl for per queue setting

2016-03-14 Thread kan . liang
From: Kan Liang 

Introduce a new ioctl for setting per-queue parameters.
Users can apply commands to specific queues by setting SUB_COMMAND and
queue_mask, as in the following command:

 ethtool --set-perqueue-command DEVNAME [queue_mask %x] SUB_COMMAND

If queue_mask is not set, the SUB_COMMAND will be applied to all queues.

The following patches will enable SUB_COMMANDs for per queue setting.

Signed-off-by: Kan Liang 
---
 ethtool.8.in |  19 
 ethtool.c| 100 +++
 2 files changed, 119 insertions(+)

diff --git a/ethtool.8.in b/ethtool.8.in
index 009711d..26d01cb 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -339,6 +339,13 @@ ethtool \- query or control network driver and hardware 
settings
 .B2 tx-lpi on off
 .BN tx-timer
 .BN advertise
+.HP
+.B ethtool \-\-set\-perqueue\-command
+.I devname
+.RB [ queue_mask
+.IR %x ]
+.I sub_command
+.RB ...
 .
 .\" Adjust lines (i.e. full justification) and hyphenate.
 .ad
@@ -920,6 +927,18 @@ Values are as for
 Sets the amount of time the device should stay in idle mode prior to asserting
 its Tx LPI (in microseconds). This has meaning only when Tx LPI is enabled.
 .RE
+.TP
+.B \-\-set\-perqueue\-command
+Sets sub command to specific queues.
+.RS 4
+.TP
+.B queue_mask %x
+Sets the specific queues which the sub command is applied to.
+If queue_mask is not set, the sub command will be applied to all queues.
+.TP
+.B sub_command
+Sets the sub command.
+.RE
 .SH BUGS
 Not supported (in part or whole) on all network drivers.
 .SH AUTHOR
diff --git a/ethtool.c b/ethtool.c
index 86724a2..ba741f0 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -4037,6 +4037,8 @@ static int do_seee(struct cmd_context *ctx)
return 0;
 }
 
+static int do_perqueue(struct cmd_context *ctx);
+
 #ifndef TEST_ETHTOOL
 int send_ioctl(struct cmd_context *ctx, void *cmd)
 {
@@ -4196,6 +4198,8 @@ static const struct option {
  " [ advertise %x ]\n"
  " [ tx-lpi on|off ]\n"
  " [ tx-timer %d ]\n"},
+   { "--set-perqueue-command", 1, do_perqueue, "Set per queue command",
+ " [queue_mask %x] SUB_COMMAND\n"},
{ "-h|--help", 0, show_usage, "Show this help" },
{ "--version", 0, do_version, "Show version number" },
{}
@@ -4247,6 +4251,102 @@ static int find_option(int argc, char **argp)
return -1;
 }
 
+static int set_queue_mask(u32 *queue_mask, char *str)
+{
+   int len = strlen(str);
+   int index = __KERNEL_DIV_ROUND_UP(len * 4, 32);
+   char tmp[9];
+   char *end = str + len;
+   int i, num;
+   __u32 mask;
+   int n_queues = 0;
+
+   if (len > MAX_NUM_QUEUE)
+   return -EINVAL;
+
+   for (i = 0; i < index; i++) {
+   num = end - str;
+
+   if (num >= 8) {
+   end -= 8;
+   num = 8;
+   } else {
+   end = str;
+   }
+   strncpy(tmp, end, num);
+   tmp[num] = '\0';
+
+   queue_mask[i] = strtoul(tmp, NULL, 16);
+
+   mask = queue_mask[i];
+   while (mask > 0) {
+   if (mask & 0x1)
+   n_queues++;
+   mask = mask >> 1;
+   }
+   }
+
+   return n_queues;
+}
+
+#define MAX(x, y) (x > y ? x : y)
+
+static int find_max_num_queues(struct cmd_context *ctx)
+{
+   struct ethtool_channels echannels;
+
+   echannels.cmd = ETHTOOL_GCHANNELS;
+   if (send_ioctl(ctx, &echannels))
+   return -1;
+
+   return MAX(MAX(echannels.rx_count, echannels.tx_count), 
echannels.combined_count);
+}
+
+static int do_perqueue(struct cmd_context *ctx)
+{
+   __u32 queue_mask[__KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32)] = {0};
+   int i, n_queues = 0;
+
+   if (ctx->argc == 0)
+   exit_bad_args();
+
+   /*
+* The sub commands will be applied to
+* all queues if no queue_mask set
+*/
+   if (strncmp(*ctx->argp, "queue_mask", 10)) {
+   n_queues = find_max_num_queues(ctx);
+   if (n_queues < 0) {
+   perror("Cannot get number of queues");
+   return -EFAULT;
+   }
+   for (i = 0; i < n_queues / 32; i++)
+   queue_mask[i] = ~0;
+   queue_mask[i] = (1 << (n_queues - i * 32)) - 1;
+   fprintf(stdout, "The sub commands will be applied"
+   " to all %d queues\n", n_queues);
+   } else {
+   ctx->argc--;
+   ctx->argp++;
+   n_queues = set_queue_mask(queue_mask, *ctx->argp);
+   if (n_queues < 0) {
+   perror("Invalid queue mask");
+   return n_queues;
+   }
+   ctx->argc--;
+   

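The mask parsing done by set_queue_mask() above can be sketched as a standalone function. This is an illustration under assumptions, not the ethtool code: `demo_parse_queue_mask()` and `DEMO_MAX_WORDS` are hypothetical names, the explicit handling of a leading "0x" and the rejection of non-hex characters are additions, and the least significant 32 bits are stored in word 0 as in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_WORDS 4 /* assumed capacity: 4 * 32 = 128 queues */

/*
 * Parse a hex string into 32-bit words, least significant word first.
 * Returns the number of set bits (selected queues), or -1 on bad input.
 */
static int demo_parse_queue_mask(const char *str, uint32_t *words)
{
	const char *end;
	size_t len;
	int i = 0, n_queues = 0;

	assert(str != NULL && words != NULL);

	/* Accept an optional "0x" prefix. */
	if (str[0] == '0' && (str[1] == 'x' || str[1] == 'X'))
		str += 2;

	len = strlen(str);
	if (len == 0 || len > DEMO_MAX_WORDS * 8)
		return -1;
	if (strspn(str, "0123456789abcdefABCDEF") != len)
		return -1;

	memset(words, 0, DEMO_MAX_WORDS * sizeof(*words));

	/* Walk the string from the right, eight hex digits at a time. */
	end = str + len;
	while (end > str) {
		size_t num = (size_t)(end - str) >= 8 ? 8 : (size_t)(end - str);
		char tmp[9];
		uint32_t mask;

		end -= num;
		memcpy(tmp, end, num);
		tmp[num] = '\0';
		words[i] = (uint32_t)strtoul(tmp, NULL, 16);

		/* Count the queues selected in this word. */
		for (mask = words[i]; mask; mask >>= 1)
			n_queues += (int)(mask & 1);
		i++;
	}

	return n_queues;
}
```

With this sketch, the string "0x11" selects queues 0 and 4, and a nine-digit mask such as "100000001" spills into the second word.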
[PATCH 4/5] ethtool: support per queue sub command --show-coalesce

2016-03-14 Thread kan . liang
From: Kan Liang 

Get the coalesce settings of all masked queues from the kernel and dump
them one by one.

Example:

 $ sudo ./ethtool --set-perqueue-command eth5 queue_mask 0x11
   --show-coalesce
 Queue: 0
 Adaptive RX: off  TX: off
 stats-block-usecs: 0
 sample-interval: 0
 pkt-rate-low: 0
 pkt-rate-high: 0

 rx-usecs: 222
 rx-frames: 0
 rx-usecs-irq: 0
 rx-frames-irq: 256

 tx-usecs: 222
 tx-frames: 0
 tx-usecs-irq: 0
 tx-frames-irq: 256

 rx-usecs-low: 0
 rx-frame-low: 0
 tx-usecs-low: 0
 tx-frame-low: 0

 rx-usecs-high: 0
 rx-frame-high: 0
 tx-usecs-high: 0
 tx-frame-high: 0

 Queue: 4
 Adaptive RX: off  TX: off
 stats-block-usecs: 0
 sample-interval: 0
 pkt-rate-low: 0
 pkt-rate-high: 0

 rx-usecs: 222
 rx-frames: 0
 rx-usecs-irq: 0
 rx-frames-irq: 256

 tx-usecs: 222
 tx-frames: 0
 tx-usecs-irq: 0
 tx-frames-irq: 256

 rx-usecs-low: 0
 rx-frame-low: 0
 tx-usecs-low: 0
 tx-frame-low: 0

 rx-usecs-high: 0
 rx-frame-high: 0
 tx-usecs-high: 0
 tx-frame-high: 0

Signed-off-by: Kan Liang 
---
 ethtool.8.in |  2 +-
 ethtool.c| 62 ++--
 2 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/ethtool.8.in b/ethtool.8.in
index 26d01cb..210ec8c 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -937,7 +937,7 @@ Sets the specific queues which the sub command is applied 
to.
 If queue_mask is not set, the sub command will be applied to all queues.
 .TP
 .B sub_command
-Sets the sub command.
+Sets the sub command. The supported sub commands include --show-coalesce.
 .RE
 .SH BUGS
 Not supported (in part or whole) on all network drivers.
diff --git a/ethtool.c b/ethtool.c
index ba741f0..a966bf8 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1219,6 +1219,29 @@ static int dump_coalesce(const struct ethtool_coalesce 
*ecoal)
return 0;
 }
 
+void dump_per_queue_coalesce(struct ethtool_per_queue_op *per_queue_opt,
+__u32 *queue_mask)
+{
+   char *addr;
+   int i;
+
+   addr = (char *)per_queue_opt + sizeof(*per_queue_opt);
+   for (i = 0; i < __KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32); i++) {
+   int queue = i * 32;
+   __u32 mask = queue_mask[i];
+
+   while (mask > 0) {
+   if (mask & 0x1) {
+   fprintf(stdout, "Queue: %d\n", queue);
+   dump_coalesce((struct ethtool_coalesce *)addr);
+   addr += sizeof(struct ethtool_coalesce);
+   }
+   mask = mask >> 1;
+   queue++;
+   }
+   }
+}
+
 struct feature_state {
u32 off_flags;
struct ethtool_gfeatures features;
@@ -4198,7 +4221,8 @@ static const struct option {
  " [ advertise %x ]\n"
  " [ tx-lpi on|off ]\n"
  " [ tx-timer %d ]\n"},
-   { "--set-perqueue-command", 1, do_perqueue, "Set per queue command",
+   { "--set-perqueue-command", 1, do_perqueue, "Set per queue command. "
+ "The supported sub commands include --show-coalesce",
  " [queue_mask %x] SUB_COMMAND\n"},
{ "-h|--help", 0, show_usage, "Show this help" },
{ "--version", 0, do_version, "Show version number" },
@@ -4302,8 +4326,31 @@ static int find_max_num_queues(struct cmd_context *ctx)
return MAX(MAX(echannels.rx_count, echannels.tx_count), 
echannels.combined_count);
 }
 
+static struct ethtool_per_queue_op *
+get_per_queue_coalesce(struct cmd_context *ctx,
+  __u32 *queue_mask, int n_queues)
+{
+   struct ethtool_per_queue_op *per_queue_opt;
+
+   per_queue_opt = malloc(sizeof(*per_queue_opt) + n_queues * 
sizeof(struct ethtool_coalesce));
+   if (!per_queue_opt)
+   return NULL;
+
+   memcpy(per_queue_opt->queue_mask, queue_mask, 
__KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32) * sizeof(__u32));
+   per_queue_opt->cmd = ETHTOOL_PERQUEUE;
+   per_queue_opt->sub_command = ETHTOOL_GCOALESCE;
+   if (send_ioctl(ctx, per_queue_opt)) {
+   free(per_queue_opt);
+   perror("Cannot get device per queue parameters");
+   return NULL;
+   }
+
+   return per_queue_opt;
+}
+
 static int do_perqueue(struct cmd_context *ctx)
 {
+   struct ethtool_per_queue_op *per_queue_opt;
__u32 queue_mask[__KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32)] = {0};
int i, n_queues = 0;
 
@@ -4342,7 +4389,18 @@ static int do_perqueue(struct cmd_context *ctx)
if (i < 0)
exit_bad_args();
 
-   /* no sub_command support yet */
+   if (strstr(args[i].opts, "--show-coalesce") != NULL) {
+   per_queue_opt = get_per_queue_coalesce(ctx, queue_mask, 
n_queues);
+   if (per_queue_opt == NULL) {
+   perror("Cannot get device per queue parameters");
+   return 

Re: [PATCH 3/3] net: mediatek: check device_reset return code

2016-03-14 Thread David Miller
From: Arnd Bergmann 
Date: Mon, 14 Mar 2016 15:07:12 +0100

> The device_reset() function may fail, so we have to check
> its return value, e.g. to make deferred probing work correctly.
> gcc warns about it because of the warn_unused_result attribute:
> 
> drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_probe':
> drivers/net/ethernet/mediatek/mtk_eth_soc.c:1679:2: error: ignoring return 
> value of 'device_reset', declared with attribute warn_unused_result 
> [-Werror=unused-result]
> 
> This adds the trivial error check to propagate the return value
> to the generic platform device probe code.
> 
> Signed-off-by: Arnd Bergmann 

Applied.


Re: [PATCH 2/3] net: mediatek: remove incorrect dma_mask assignment

2016-03-14 Thread David Miller
From: Arnd Bergmann 
Date: Mon, 14 Mar 2016 15:07:11 +0100

> Device drivers should not mess with the DMA mask directly,
> but instead call dma_set_mask() etc if needed.
> 
> In case of the mtk_eth_soc driver, the mask already gets set
> correctly when the device is created, and setting it again
> is against the documented API.
> 
> This removes the incorrect setting.
> 
> Signed-off-by: Arnd Bergmann 

Applied.


Re: [PATCH net-next 1/2] rtnetlink: add new RTM_GETSTATS message to dump link stats

2016-03-14 Thread roopa
On 3/14/16, 7:51 AM, Jiri Pirko wrote:
> Sun, Mar 13, 2016 at 02:56:25AM CET, ro...@cumulusnetworks.com wrote:
>> From: Roopa Prabhu 
>>
>> This patch adds a new RTM_GETSTATS message to query link stats via netlink
>> from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
>> returns a lot more than just stats and is expensive in some cases when
>> frequent polling for stats from userspace is a common operation.
>>
>> RTM_GETSTATS is an attempt to provide a light weight netlink message
>> to explicitly query only link stats from the kernel on an interface.
>> The idea is to also keep it extensible so that new kinds of stats can be
>> added to it in the future.
>>
>> This patch adds the following attribute for NETDEV stats:
>> struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
>>[IFLA_STATS_LINK64]  = { .len = sizeof(struct rtnl_link_stats64) },
>> };
>>
>> This patch also allows for af family stats (an example af stats for IPV6
>> is available with the second patch in the series).
>>
>> Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
>> a single interface or all interfaces with NLM_F_DUMP.
>>
>> Future possible new types of stat attributes:
>> - IFLA_MPLS_STATS  (nested. for mpls/mdev stats)
>> - IFLA_EXTENDED_STATS (nested. extended software netdev stats like bridge,
>>  vlan, vxlan etc)
>> - IFLA_EXTENDED_HW_STATS (nested. extended hardware stats which are
>>  available via ethtool today)
>>
>> This patch also declares a filter mask for all stat attributes.
>> User has to provide a mask of stats attributes to query. This will be
>> specified in a new hdr 'struct if_stats_msg' for stats messages.
>>
>> Without any attributes in the filter_mask, no stats will be returned.
>>
>> This patch has been tested with modified iproute2 ifstat.
>>
>> Suggested-by: Jamal Hadi Salim 
>> Signed-off-by: Roopa Prabhu 
>> ---
>> include/net/rtnetlink.h|   5 ++
>> include/uapi/linux/if_link.h   |  19 
>> include/uapi/linux/rtnetlink.h |   7 ++
>> net/core/rtnetlink.c   | 200 +
>> 4 files changed, 231 insertions(+)
>>
>> diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
>> index 2f87c1b..fa68158 100644
>> --- a/include/net/rtnetlink.h
>> +++ b/include/net/rtnetlink.h
>> @@ -131,6 +131,11 @@ struct rtnl_af_ops {
>>  const struct nlattr *attr);
>>  int (*set_link_af)(struct net_device *dev,
>> const struct nlattr *attr);
>> +size_t  (*get_link_af_stats_size)(const struct net_device *dev,
>> +  u32 filter_mask);
>> +int (*fill_link_af_stats)(struct sk_buff *skb,
>> +  const struct net_device *dev,
>> +  u32 filter_mask);
>> };
>>
>> void __rtnl_af_unregister(struct rtnl_af_ops *ops);
>> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>> index 249eef9..0840f3e 100644
>> --- a/include/uapi/linux/if_link.h
>> +++ b/include/uapi/linux/if_link.h
>> @@ -741,4 +741,23 @@ enum {
>>
>> #define IFLA_HSR_MAX (__IFLA_HSR_MAX - 1)
>>
>> +/* STATS section */
>> +
>> +struct if_stats_msg {
>> +__u8  family;
>> +__u32 ifindex;
>> +__u32 filter_mask;
> This limits future extension to only 32 groups of stats. I can imagine
> that more than that can be added, easily.
I thought about that, but it is going to be a while before we run out of the
u32. Most of the other stats will be nested, like per-logical-interface stats
or per-hw stats. If we do run out of bits, in the future we could add a netlink
attribute for an extended filter mask to carry more bits (similar to
IFLA_EXT_MASK).
I did also start with just having an IFLA_STATS_EXT_MASK-like attribute,
but since no stats are dumped by default, having a way to easily specify the
mask in the hdr will be easier on apps. And this will again be a u32 anyway.


> Why don't you use nested attribute IFLA_STATS_FILTER with flag attributes
> for every type? That would be easily extendable.
A u8 for each stats selector seems like overkill.
> Using netlink header struct for this does not look correct to me.
> In the past, this was done a lot of times and turned out to be a problem later.
>
>
I started with not adding it, but the rtnetlink rcv handler looks for family
in the hdr. Hence all of the messages have a struct header with family as
the first field (you can correct me if you find that it is not necessary).
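
A minimal sketch of the header-plus-filter-mask idea discussed in this
reply, written as standalone userspace C. All names here (stats_req_hdr,
STATS_FILTER_BIT, the attribute ids) are hypothetical stand-ins rather
than the identifiers from the patch, and the "attribute id N maps to bit
N-1" convention is an assumption made only for illustration. It shows why
a u32 mask caps the number of selectable stats groups at 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute ids; 0 is reserved as "unspecified". */
enum {
	STATS_UNSPEC,
	STATS_LINK64,
	STATS_INET6,
	__STATS_MAX
};
#define STATS_MAX (__STATS_MAX - 1)

/* Assumed convention: attribute id N is selected by bit N-1 of the mask. */
#define STATS_FILTER_BIT(attr) (UINT32_C(1) << ((attr) - 1))

/* Stand-in for the request header: family first, then ifindex and mask. */
struct stats_req_hdr {
	uint8_t  family;
	uint32_t ifindex;
	uint32_t filter_mask;
};

/* True if the requester asked for this group; ids beyond 32 cannot fit. */
static bool stats_requested(const struct stats_req_hdr *hdr, int attr)
{
	if (attr <= STATS_UNSPEC || attr > 32)
		return false;
	return (hdr->filter_mask & STATS_FILTER_BIT(attr)) != 0;
}
```

With an empty mask nothing is selected, which matches the "no stats are
dumped by default" behaviour described in the patch.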




Re: [PATCH 1/3] net: mediatek: use dma_addr_t correctly

2016-03-14 Thread John Crispin


On 14/03/2016 15:07, Arnd Bergmann wrote:
> dma_alloc_coherent() expects a dma_addr_t pointer as its argument,
> not an 'unsigned int', and gcc correctly warns about broken
> code in the mtk_init_fq_dma function:
> 
> drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_init_fq_dma':
> drivers/net/ethernet/mediatek/mtk_eth_soc.c:463:13: error: passing argument 3 
> of 'dma_alloc_coherent' from incompatible pointer type 
> [-Werror=incompatible-pointer-types]
> 
> This changes the type of the local variable to dma_addr_t.
> 
> Signed-off-by: Arnd Bergmann 

thanks for the fixes



> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index ba3afa5d4640..3e42204adfe5 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -453,7 +453,7 @@ static inline void mtk_rx_get_desc(struct mtk_rx_dma *rxd,
>  /* the qdma core needs scratch memory to be setup */
>  static int mtk_init_fq_dma(struct mtk_eth *eth)
>  {
> - unsigned int phy_ring_head, phy_ring_tail;
> + dma_addr_t phy_ring_head, phy_ring_tail;
>   int cnt = MTK_DMA_SIZE;
>   dma_addr_t dma_addr;
>   int i;
> 
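
A small standalone sketch of why the variable type matters here. It is
not driver code: sketch_dma_addr_t and sketch_alloc_coherent() are
made-up stand-ins, and the 64-bit width of the address type is an
assumption (the real dma_addr_t width depends on the kernel
configuration). The point is only that an address returned through a
pointer to a wider type does not fit in a 32-bit variable.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: DMA addresses are 64 bits wide. */
typedef uint64_t sketch_dma_addr_t;

/*
 * Stand-in allocator: returns a buffer and reports a (made-up) bus
 * address above 4 GiB through 'handle', as a 64-bit capable device might.
 */
static void *sketch_alloc_coherent(size_t size, sketch_dma_addr_t *handle)
{
	static unsigned char backing[64];

	if (size > sizeof(backing))
		return NULL;
	*handle = UINT64_C(0x100000000) + 0x1000;
	return backing;
}
```

Passing the address of a 32-bit variable as 'handle' would be the same
incompatible-pointer mistake gcc reported, and truncating the result to
32 bits would lose the upper half of the address.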


Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-14 Thread Sowmini Varadhan

In any case, to wrap up this thread.

I managed to set this up with sysctl. End result gives me a tunable
per netns (which modparam would not), and thanks to the RDS reconnect
infra, can be done dynamically at any time, not just at startup. 
And it is more compact than a daemon-y solution.

I'll send out the patches later this week after some more cleanup
and testing. 

--Sowmini

However, it would still be nice to know exactly what distribution
issues come out of modparam.




[PATCH net-next v6] tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In

2016-03-14 Thread Martin KaFai Lau
Per RFC4898, they count segments sent/received
containing a positive length data segment (that includes
retransmission segments carrying data).  Unlike
tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments
carrying no data (e.g. pure ack).

The patch also updates the segs_in in tcp_fastopen_add_skb()
so that segs_in >= data_segs_in property is kept.

Together with retransmission data, tcpi_data_segs_out
gives a better signal on the rxmit rate.

v6: Rebase on the latest net-next

v5: Eric pointed out that checking skb->len is still needed in
tcp_fastopen_add_skb() because skb can carry a FIN without data.
Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in()
helper is used.  Comment is added to the fastopen case to explain why
segs_in has to be reset and tcp_segs_in() has to be called before
__skb_pull().

v4: Add comment to the changes in tcp_fastopen_add_skb()
and also add remark on this case in the commit message.

v3: Add const modifier to the skb parameter in tcp_segs_in()

v2: Rework based on recent fix by Eric:
commit a9d99ce28ed3 ("tcp: fix tcpi_segs_in after connection establishment")

Signed-off-by: Martin KaFai Lau 
Cc: Chris Rapier 
Cc: Eric Dumazet 
Cc: Marcelo Ricardo Leitner 
Cc: Neal Cardwell 
Cc: Yuchung Cheng 
Acked-by: Eric Dumazet 
---
 include/linux/tcp.h  |  6 ++
 include/net/tcp.h| 10 ++
 include/uapi/linux/tcp.h |  2 ++
 net/ipv4/tcp.c   |  2 ++
 net/ipv4/tcp_fastopen.c  |  8 
 net/ipv4/tcp_ipv4.c  |  2 +-
 net/ipv4/tcp_minisocks.c |  2 +-
 net/ipv4/tcp_output.c|  4 +++-
 net/ipv6/tcp_ipv6.c  |  2 +-
 9 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index bcbf51d..7be9b12 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -158,6 +158,9 @@ struct tcp_sock {
u32 segs_in;/* RFC4898 tcpEStatsPerfSegsIn
 * total number of segments in.
 */
+   u32 data_segs_in;   /* RFC4898 tcpEStatsPerfDataSegsIn
+* total number of data segments in.
+*/
u32 rcv_nxt;/* What we want to receive next */
u32 copied_seq; /* Head of yet unread data  */
u32 rcv_wup;/* rcv_nxt on last window update sent   */
@@ -165,6 +168,9 @@ struct tcp_sock {
u32 segs_out;   /* RFC4898 tcpEStatsPerfSegsOut
 * The total number of segments sent.
 */
+   u32 data_segs_out;  /* RFC4898 tcpEStatsPerfDataSegsOut
+* total number of data segments sent.
+*/
u64 bytes_acked;/* RFC4898 tcpEStatsAppHCThruOctetsAcked
 * sum(delta(snd_una)), or how many bytes
 * were acked.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0302636..c8dbd29 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1840,4 +1840,14 @@ static inline int tcp_inq(struct sock *sk)
return answ;
 }
 
+static inline void tcp_segs_in(struct tcp_sock *tp, const struct sk_buff *skb)
+{
+   u16 segs_in;
+
+   segs_in = max_t(u16, 1, skb_shinfo(skb)->gso_segs);
+   tp->segs_in += segs_in;
+   if (skb->len > tcp_hdrlen(skb))
+   tp->data_segs_in += segs_in;
+}
+
 #endif /* _TCP_H */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index fe95446..53e8e3f 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -199,6 +199,8 @@ struct tcp_info {
 
__u32   tcpi_notsent_bytes;
__u32   tcpi_min_rtt;
+   __u32   tcpi_data_segs_in;  /* RFC4898 tcpEStatsDataSegsIn */
+   __u32   tcpi_data_segs_out; /* RFC4898 tcpEStatsDataSegsOut */
 };
 
 /* for TCP_MD5SIG socket option */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index a265f00..992b310 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2715,6 +2715,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
info->tcpi_notsent_bytes = max(0, notsent_bytes);
 
info->tcpi_min_rtt = tcp_min_rtt(tp);
+   info->tcpi_data_segs_in = tp->data_segs_in;
+   info->tcpi_data_segs_out = tp->data_segs_out;
 }
 EXPORT_SYMBOL_GPL(tcp_get_info);
 
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index fdb286d..4fc0061 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -140,6 +140,14 @@ void tcp_fastopen_add_skb(struct sock *sk, struct sk_buff 
*skb)
return;
 
skb_dst_drop(skb);
+   /* segs_in has been initialized to 1 in tcp_create_openreq_child().
+* Hence, reset segs_in to 0 before calling tcp_segs_in()
+* to 

Re: [PATCH next v2 0/7] Introduce l3_dev pointer for L3 processing

2016-03-14 Thread Cong Wang
On Sun, Mar 13, 2016 at 8:53 PM, David Miller  wrote:
>
> Please stop pretending that this device switching is ok, it's not.

+1
This is what I have been complaining about since v1...


Re: [PATCH net-next 08/13] net/mlx5e: Add fragmented memory support for RX multi packet WQE

2016-03-14 Thread Saeed Mahameed
On Fri, Mar 11, 2016 at 9:58 PM, Eric Dumazet  wrote:

>> I totally agree with this, we should have reported  skb->truesize +=
>> (consumed strides)*(stride size).
>> but again this is not as critical as you think, in the worst case
>> skb->truesize will be off by 127B at most.
>
> Ouch. really you are completely wrong.

It is just a matter of perspective, quoting:
http://vger.kernel.org/~davem/skb_sk.html
"This is the total of how large a data buffer we allocated for the
packet, plus the size of 'struct sk_buff' itself."

as explained more than once, a page used in ConnectX4 MPWQE approach
can be used for more than one packet, according to the above
documentation and many other examples in the kernel, each packet will
report as much data buffer as it used from that page, and we allocated
for that packet: #strides * stridesize from that page, (common sense).

It is really uncalled-for to report, for each SKB, skb->truesize +=
PAGE_SIZE for the same shared reusable page, as we did in here and as
other drivers already do.

It is just ridiculous to report PAGE_SIZE for an SKB that used only 128B
when the other parts of that page are being either reused by HW or
reported back to the stack, and we already did the truesize accounting
on their parts.

It seems to me that reporting PAGE_SIZE * (#SKBs pointing to that page)
for all of those SKBs is just a big lie, and it is an abuse of
skb->truesize to protect against special/rare cases like the OOO issue,
for which I can suggest a handful of solutions (out of this thread's
scope) without the need for device drivers to lie about the actual
truesize.
Think about it: if SKBs share the same page, then SUM(SKBs->truesize) =
PAGE_SIZE.

And suppose you are right: why not just remove the truesize param
from skb_add_rx_frag, and explicitly do skb->truesize +=
PAGE_SIZE, hardcoded inside that function? Or rename the truesize
param to pageorder?

>
> If one skb has a fragment of a page, and sits in a queue for a long
> time, it really uses a full page, because the remaining part of the page
> is not reusable. Only kmalloc(128) can deal with that idea of allowing
> other parts of the page being 'freed and reusable'
This concern was also true before this series for other drivers in the
kernel that use pages for fragmented SKBs; none of them report
PAGE_SIZE as SKB->truesize, as their pages are reusable.

>
> It is trivial for an attacker to make sure the host will consume one
> page + sk_buff + skb->head = 4096 + 256 + 512, by specially sending out
> of order packets on TCP flows.
we can do special accounting for ooo like issues in the stack (maybe
count page references and sum up page sizes as you suggest), device
drivers shouldn't have special handling/accounting to protect against
such cases.
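
The two accounting policies argued over in this thread can be compared
with a small standalone sketch. The 128-byte stride and 4096-byte page
are taken from the numbers quoted above; the function names and the
SKETCH_ constants are hypothetical, and neither function is taken from
any driver.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_STRIDE_SIZE 128u

/* Per-stride policy: charge only the strides the packet consumed. */
static size_t truesize_per_stride(size_t pkt_len)
{
	size_t strides = (pkt_len + SKETCH_STRIDE_SIZE - 1) / SKETCH_STRIDE_SIZE;

	return strides * SKETCH_STRIDE_SIZE;
}

/* Per-page policy: charge every page the packet keeps pinned. */
static size_t truesize_per_page(size_t pkt_len)
{
	size_t pages = (pkt_len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	return pages * SKETCH_PAGE_SIZE;
}
```

Under the per-stride policy the charge exceeds the packet length by at
most 127 bytes, which is the bound mentioned in the quoted text. Under
the per-page policy a 64-byte packet is charged a full 4096 bytes, which
reflects the objection that a single queued fragment can keep the whole
page from being freed.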


Re: [PATCH v2 net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-14 Thread Alexander Duyck
On Mon, Mar 14, 2016 at 10:46 AM, Sowmini Varadhan
 wrote:
>
> For LLC based protocols like lldp, stp etc., the ethernet header
> is an 802.3 header with a h_proto that is not 0x800, 0x86dd, or
> even 0x806.  In this world, the skb_network_header() points at
> the DSAP/SSAP/..  and is not likely to be NET_IP_ALIGNed in
> ixgbe_atr().
>
> With LLC, drivers are not likely to correctly find IPVERSION,
> or "6", at hdr.ipv4->version, but will instead just needlessly
> trigger an unaligned access. (IPv4/IPv6 over LLC is almost never
> implemented).
>
> The unaligned access is thus avoidable: bail out quickly after
> examining first->protocol.
>
> Signed-off-by: Sowmini Varadhan 
> ---
> v2: Alexander Duyck comments.
>
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |5 +
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
> b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 4d6223d..b25e603 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -7574,6 +7574,11 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
> if (!ring->atr_sample_rate)
> return;
>
> +   if (first->protocol != htons(ETH_P_IP) &&
> +   first->protocol != htons(ETH_P_IPV6) &&
> +   first->protocol != htons(ETH_P_ARP))
> +   return;
> +

One other thing I forgot to mention is that we don't support ARP so
that check could be dropped.  The ATR code only supports IPv4 or IPv6
with TCP.

- Alex
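
A standalone sketch of the early-exit check with the ARP case dropped,
as suggested in this reply. It is not the driver code: atr_can_sample()
and the SKETCH_ constants are hypothetical names, and the EtherType
values are the standard ones for IPv4 and IPv6. The protocol field is
assumed to be in network byte order, as in the quoted hunk.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ETH_P_IP   0x0800	/* IPv4 EtherType */
#define SKETCH_ETH_P_IPV6 0x86DD	/* IPv6 EtherType */

/*
 * Return true only for protocols whose network header is worth
 * inspecting; anything else (LLC frames, ARP, ...) bails out before
 * the header is touched.
 */
static bool atr_can_sample(uint16_t protocol_be)
{
	return protocol_be == htons(SKETCH_ETH_P_IP) ||
	       protocol_be == htons(SKETCH_ETH_P_IPV6);
}
```

An 802.3 frame carries a length rather than an EtherType in that field,
so it falls through to the "do not sample" case without any access to
the possibly unaligned network header.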


[PATCH] ath5k: Change led pin configuration for compaq c700 laptop

2016-03-14 Thread Joseph Salisbury
BugLink: http://bugs.launchpad.net/bugs/972604

Commit 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin 
configuration for compaq c700 laptop") added a pin configuration for the Compaq 
c700 laptop.  However, the polarity of the led pin is reversed.  It should be 
red for wifi off and blue for wifi on, but it is the opposite.  This bug was 
reported in the following bug report: 
http://pad.lv/972604


Fixes: 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin 
configuration for compaq c700 laptop")

Signed-off-by: Joseph Salisbury 
Cc: sta...@vger.kernel.org

---
 drivers/net/wireless/ath/ath5k/led.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath5k/led.c 
b/drivers/net/wireless/ath/ath5k/led.c
index 803030f..6a2a168 100644
--- a/drivers/net/wireless/ath/ath5k/led.c
+++ b/drivers/net/wireless/ath/ath5k/led.c
@@ -77,7 +77,7 @@ static const struct pci_device_id ath5k_led_devices[] = {
/* HP Compaq CQ60-206US (ddregg...@jumptv.com) */
{ ATH_SDEVICE(PCI_VENDOR_ID_HP, 0x0137a), ATH_LED(3, 1) },
/* HP Compaq C700 (nitrous...@gmail.com) */
-   { ATH_SDEVICE(PCI_VENDOR_ID_HP, 0x0137b), ATH_LED(3, 1) },
+   { ATH_SDEVICE(PCI_VENDOR_ID_HP, 0x0137b), ATH_LED(3, 0) },
/* LiteOn AR5BXB63 (mag...@salug.it) */
{ ATH_SDEVICE(PCI_VENDOR_ID_ATHEROS, 0x3067), ATH_LED(3, 0) },
/* IBM-specific AR5212 (all others) */
-- 
1.9.1



Re: [PATCH v2 2/3] of_mdio: use IS_ERR_OR_NULL()

2016-03-14 Thread Sergei Shtylyov

On 03/14/2016 01:22 AM, Arnd Bergmann wrote:


>> IS_ERR_OR_NULL() is open coded in of_mdiobus_register_phy(), so just call
>> it directly...
>>
>> Signed-off-by: Sergei Shtylyov 
>> Reviewed-by: Florian Fainelli 
>>
>> ---
>> Changes in version 2:
>> - removed the of_mdiobus_register_device() hunk;
>> - added the "Reviewed-by:" tag.
>>
>>   drivers/of/of_mdio.c |2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> Index: net-next/drivers/of/of_mdio.c
>> ===
>> --- net-next.orig/drivers/of/of_mdio.c
>> +++ net-next/drivers/of/of_mdio.c
>> @@ -56,7 +56,7 @@ static int of_mdiobus_register_phy(struc
>>  phy = phy_device_create(mdio, addr, phy_id, 0, NULL);
>>  else
>>  phy = get_phy_device(mdio, addr, is_c45);
>> -   if (!phy || IS_ERR(phy))
>> +   if (IS_ERR_OR_NULL(phy))
>>  return 1;
>>
>>  rc = irq_of_parse_and_map(child, 0);
>
> IS_ERR_OR_NULL() usually indicates that the code is wrong, or that
> an API has been misdesigned.
>
> Can you clarify in the changelog which one it is,

   The second. :-)

> or (better) change
> it so that 'phy' in this function is either NULL or IS_ERR() in case
> of an error but not both?

   I guess you meant to change get_phy_device() to not return NULL (and so
only return the error codes, not both), am I correct?

> I think moving the 'if (problem) return 1' into the two if/else
> is a correct transformation that also makes it very clear what
> is going on here.

   Can be done too... it's not that I have much time for that many respins
though. It all started with 1 little patch (this one). :-)

> Arnd


MBR, Sergei
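
For readers unfamiliar with the error-pointer convention being debated,
here is a userspace re-creation of the idea. The sketch_ names are
hypothetical; the kernel's own helpers live in include/linux/err.h, and
this sketch only assumes the usual scheme where the top 4095 addresses
encode negative errno values.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* A pointer in the top SKETCH_MAX_ERRNO addresses is an encoded error. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* The combined test: the function may fail in two different ways. */
static inline bool sketch_is_err_or_null(const void *ptr)
{
	return ptr == NULL || sketch_is_err(ptr);
}
```

Note that NULL is not an encoded error, so a caller that has to test for
both is dealing with an API that reports failure in two ways. That is
the design smell raised in the review above.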



Re: [PATCH net] ppp: ensure file->private_data can't be overridden

2016-03-14 Thread Guillaume Nault
On Fri, Mar 11, 2016 at 02:42:16PM -0500, David Miller wrote:
> From: Guillaume Nault 
> Date: Tue, 8 Mar 2016 20:14:30 +0100
> 
> > Lock ppp_mutex and check that file->private_data is NULL before
> > executing any action in ppp_unattached_ioctl().
> > The test done by ppp_ioctl() can't be relied upon, because
> > file->private_data may have been updated meanwhile. In which case
> > ppp_unattached_ioctl() will override file->private_data and mess up
> > reference counters or lose the pointer to the previously allocated PPP unit.
> > 
> > In case the test fails, -ENOTTY is returned, just like if ppp_ioctl()
> > had rejected the ioctl in the first place.
> > 
> > Signed-off-by: Guillaume Nault 
> 
> If this thing can disappear on us, then we need to make the entirety
> of ppp_ioctl() run with the mutex held to fix this properly.
> 
> Otherwise ->private_data could go NULL on us meanwhile as well.
> 
> We should hold the mutex, to stabilize the value of ->private_data.

Actually, only ppp_release() can reset ->private_data to NULL. Beyond
closing the file's last reference, the only way to trigger it is
to run the PPPIOCDETACH ioctl. But even then, ppp_release() isn't
called if the file has more than one reference.
So ->private_data should never go NULL from under another user.

As for setting ->private_data to non-NULL value, this is exclusively
handled by ppp_unattached_ioctl(). Since the ppp_mutex is held at the
beginning of the function, calls are serialised, but one may still
overwrite ->private_data and leak the memory previously pointed to.
By testing ->private_data with ppp_mutex held, this patch fixes this
issue, and ->private_data is now guaranteed to remain constant after
it's been set.

Testing ->private_data without lock in ppp_ioctl() before calling
ppp_unattached_ioctl() is fine, because either ->private_data is
not NULL and thus is stable, or it is and ppp_unattached_ioctl()
takes care of not overriding ->private_data, should its value get
modified before taking the mutex.


I considered moving ppp_mutex up to cover the entirety of ppp_ioctl()
too, but finally chose to handle everything in ppp_unattached_ioctl()
because that's where the problem really stands.
ppp_ioctl() takes the mutex for historical reasons (semi-automatic BKL
removal) and there are several places where holding ppp_mutex seems
unnecessary (e.g. for PPPIOCDETACH). So I felt the right direction was
to move ppp_mutex further down rather than moving it up to cover the
entirety of ppp_ioctl().

In particular, with regard to adding rtnetlink handlers for PPP (which
is the objective that led to those PPP fixes), holding ppp_mutex for
too long is a problem. An rtnetlink handler would run under protection
of the rtnl mutex, and would need to grab ppp_mutex too (unless we
don't associate the PPP unit fd to the net device in the .newlink
callback).
But currently the PPPIOCNEWUNIT ioctl holds ppp_mutex before taking the
rtnl mutex (in ppp_create_interface()). In this context moving
ppp_mutex up to ppp_ioctl() makes things more difficult because what's
required is, on the contrary, moving it further down so that it gets
held after the rtnl mutex.
However I'd agree that such consideration shouldn't come into play for
fixes on net. It weighed a bit in my decision to not push ppp_mutex
up though.
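
The "re-test the pointer with the mutex held" pattern described above
can be sketched in userspace C with a pthread mutex. This is not the PPP
code: sketch_file, sketch_mutex and sketch_attach() are hypothetical
names, and only the shape of the check is meant to match the
description.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the open file whose private pointer is set once. */
struct sketch_file {
	void *private_data;
};

static pthread_mutex_t sketch_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Attach 'unit' to 'file' unless something is already attached.
 * The caller may have seen private_data == NULL without the lock;
 * that observation can be stale, so it is repeated under the mutex.
 */
static int sketch_attach(struct sketch_file *file, void *unit)
{
	int err = 0;

	pthread_mutex_lock(&sketch_mutex);
	if (file->private_data != NULL)
		err = -ENOTTY;		/* lost the race: do not overwrite */
	else
		file->private_data = unit;
	pthread_mutex_unlock(&sketch_mutex);

	return err;
}
```

Because the pointer only ever goes from NULL to non-NULL under the
mutex, a reader that sees a non-NULL value without the lock can rely on
it staying put, which is the argument made for keeping the unlocked test
in the outer ioctl path.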


Re: [PATCH v2 net-next] ixgbe: Avoid unaligned access in ixgbe_atr() for LLC packets

2016-03-14 Thread Sowmini Varadhan
On (03/14/16 10:55), Alexander Duyck wrote:
> 
> One other thing I forgot to mention is that we don't support ARP so
> that check could be dropped.  The ATR code only supports IPv4 or IPv6
> with TCP.

I did notice that, but I left it in place because (a) it comes down
the stack with the NET_IP_ALIGNment and (b) ARP is only sent over
Ethernet II (there is no LLC SAP for ARP, which is a big reason
why ipv4 is not sent over llc, despite rfc 1042).

I figured it would not hurt to pass it down, in case we decide
to do something clever with it in the future.

--Sowmini



[PATCH] Documentation: networking: phy.txt: Add missing functions

2016-03-14 Thread Florian Fainelli
Some new development in PHYLIB added new function pointers to the struct
phy_driver, document these.

Signed-off-by: Florian Fainelli 
---
 Documentation/networking/phy.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt
index e839e7efc835..7ab9404a8412 100644
--- a/Documentation/networking/phy.txt
+++ b/Documentation/networking/phy.txt
@@ -267,13 +267,23 @@ Writing a PHY driver
config_intr: Enable or disable interrupts
remove: Does any driver take-down
ts_info: Queries about the HW timestamping status
+   match_phy_device: used for Clause 45 capable PHYs to match devices
+   in package and ensure they are compatible
hwtstamp: Set the PHY HW timestamping configuration
rxtstamp: Requests a receive timestamp at the PHY level for a 'skb'
txtsamp: Requests a transmit timestamp at the PHY level for a 'skb'
set_wol: Enable Wake-on-LAN at the PHY level
get_wol: Get the Wake-on-LAN status at the PHY level
+   link_change_notify: called to inform the core is about to change the
+   link state, can be used to work around bogus PHY between state changes
read_mmd_indirect: Read PHY MMD indirect register
write_mmd_indirect: Write PHY MMD indirect register
+   module_info: Get the size and type of an EEPROM contained in an plug-in
+   module
+   module_eeprom: Get EEPROM information of a plug-in module
+   get_sset_count: Get number of strings sets that get_strings will count
+   get_strings: Get strings from requested objects (statistics)
+   get_stats: Get the extended statistics from the PHY device
 
  Of these, only config_aneg and read_status are required to be
  assigned by the driver code.  The rest are optional.  Also, it is
-- 
2.1.0



Re: [PATCH next v2 0/7] Introduce l3_dev pointer for L3 processing

2016-03-14 Thread Mahesh Bandewar
On Sun, Mar 13, 2016 at 8:53 PM, David Miller  wrote:
> From: Mahesh Bandewar 
> Date: Sun, 13 Mar 2016 19:29:58 -0700
>
>> On Sun, Mar 13, 2016 at 6:50 PM, David Miller  wrote:
>>> It doesn't matter whether doing so or not makes sense.
>>>
>>> You're going to have to find a way to do both, and also I'm concerned
>>> about how you're leaking the source namespace's "stuff" into the
>>> destination's.  That's very worrisome to me.
>>
>> If we add a new mode (e.g. L3s) and preserve current mode as it is,
>> then that should address your first concern.
>
> Also, I don't want all of this device translation stuff all over the
> place.
>
I could add skb->dev. Is that OK? Then none of this translation / helper-stuff
is required. I'm definitely open for suggestions.

> Furthermore, when you walk across the ns boundary, that old device has
> to disappear.  That's why that is the device assigned to skb->dev.
>
The layer boundaries are not that well maintained. We do check for the xfrm
policies in L4 and expect the skb->dev pointing to the L3 device. So unless we
have a way to derive a L3 dev from skb->dev, I don't think xfrm will work.
Unless some Xfrm-expert asserts that this is not needed.

> Please stop pretending that this device switching is ok, it's not.


Re: [PATCH net-next] rds-tcp: Add module parameters to control sndbuf/rcvbuf size of RDS-TCP socket

2016-03-14 Thread Tom Herbert
On Fri, Mar 11, 2016 at 8:39 PM, Sowmini Varadhan
 wrote:
> On (03/11/16 20:07), Tom Herbert wrote:
>
>> You are describing your deployment of RDS, not kernel implementation.
>> What I see is a very rigid implementation that would make it hard for
>> many us to ever even consider deploying. The abilities of applications
>> to tune TCP connections is well understood, very prevalent, and really
>> fundamental in making TCP based datacenters at large scale.
>
> sorry, historically, OS DDI/DKI's for kernel modules have always
> had some way to set up startup parameters at module load time.  And
> clusters have been around for a while.
>
> So let's just focus on the technical question around module config here,
> which involves more than TCP sockets, btw.
>
>>  Any way, it was just an idea... ;-)
>
> Thank you.
>
> Moving on,
>
>> Maybe add one module parameter that indicates the module should just
>> load but not start, configure whatever is needed via netlink, and then
>> send one more netlink command to start operations.
>
> Even that needs an extra daemon.
>
> Without getting into the vast number of questions that it raises (such
> as every module with startup params now needs a uspace counterpart?
> modprobe -r behavior, namespace behavior? Why netlink for every kernel
> module? etc)..
>
Most modules of any significant complexity are managed via netlink,
and so all of your questions have likely already been answered. Yes,
netlink assumes userspace configuration tools ("ip" configures many
different networking modules). There are simple interfaces to
create/delete a module's netlink hooks when module is added/removed.
Netlink operates in the context of network namespaces. netlink is
preferred since it is far more extensible and generic for
configuration than sysctl, module params, etc.

Tom

> One module parameter is as much a "distribution management"
> problem as 10 of them, yes? I hope you see that I dont need that module
> param and daemon baggage- I can just use  sysctl to set up all my params,
> including one bit for module_can_start_now to achieve the same thing.
>
> But it is still more than the handful of lines of code in my patch,
> so it would be nice to understand what is the "distribution" issue.
>
> Stepping back, how do we make sysctl fully namespace friendly?
>
> btw setting kernel socket keepalive params via sysctl is not a problem
> to implement at all, if it ever shows up as a requirement for customers.
>
> --Sowmini
>


RE: [PATCH v2 4/4] ethtool: support setting default Rx flow indirection table

2016-03-14 Thread Keller, Jacob E
> -----Original Message-----
> From: Ben Hutchings [mailto:b...@decadent.org.uk]
> Sent: Sunday, March 13, 2016 9:25 AM
> To: Keller, Jacob E 
> Cc: netdev 
> Subject: Re: [PATCH v2 4/4] ethtool: support setting default Rx flow
> indirection table
> 
> On Tue, 2016-02-16 at 21:22 +, Keller, Jacob E wrote:
> 
> > Signed-off-by: Jacob Keller 
> > ---
> >
> > Not sure if there is a mailing list for this, I sent this to the netdev
> > list but forgot to Cc you on the ethtool change.
> 
> I haven't been keeping up with netdev for a long time, but I have
> recently set up filtering by subject so I can keep up with just the
> ethtool-related messages.  Still, patches for the ethtool command
> should always be explicitly sent to me.
> 
> > Dave applied the
> > network core patches, but they're more or less useless unless we
> > actually have the ability to request default setting using ethtool
> > (which I extended to support "default" here)
> 
> The patch was mangled (word-wrapped and modified white-space) in this
> message, so I took the version in
> .
> 
> [...]
> > @@ -3332,7 +3335,7 @@ static int do_srxfh(struct cmd_context *ctx)
> >     u32 entry_size = sizeof(rss_head.rss_config[0]);
> >     u32 num_weights = 0;
> >
> > -   if (ctx->argc < 2)
> > +   if (ctx->argc < 1)
> >     exit_bad_args();
> [...]
> 
> This means we might continue without having the required parameter
> after "equal", "weight" or "hkey".  But, having said that, since we're
> only checking once before running the loop, we're already failing to
> validate that properly.
> 
> I've applied this, but could you please send another patch that adds
> checks on ctx->argc within the loop and test cases in test-cmdline.c?
> 
> Ben.
> 

Yes. Not sure how the patch got broken for you here, as I sent it using 
git-send-email. I will send the proposed fix above.

Thanks,
Jake



Re: [PATCH next v2 0/7] Introduce l3_dev pointer for L3 processing

2016-03-14 Thread Cong Wang
On Sun, Mar 13, 2016 at 5:01 PM, Mahesh Bandewar  wrote:
>>> If I understand correctly (and as Cong already said), information is
>>> leaking between netns during the input phase. On the tx side,
>>> skb_scrub_packet() is called, but not on the rx side. I think it's
>>> wrong. There should be an explicit boundary.
>>
>> That is not what I am complaining about.
>>
>> I dislike the trick of switching skb->dev pointer with skb->dev->l3_dev.
>> This is not how we switch netns, nor the way how netns works.
>>
> How it is different from what we are doing currently?
>
> Current: Use skb->dev for L3 processing and derive netns from skb->dev
> Proposal: use skb->dev->l3_dev for L3 processing and derive netns from
> skb->dev->l3_dev


If you ever read the part you quote below, you will have the answer.


>
>> Look at veth pair or dev_change_net_namespace(), each time when we
>> switch netns, we need to do a full reregistration or a full reentrance, we
>> never just switch some pointers to switch netns. This is why I said it breaks
>> isolation.

^ You miss this part.


>>
>> Also, it is ugly to hide such a ipvlan-specific pointer for half of the RX
>> code path.
> I think I have already mentioned, I'm adding RX code now and later
> I'll add TX code to use l3_dev to make it symmetric. This way all L3
> (Tx/Rx) will use this device reference always.
You are trying to convince me by telling me you will add more ugly code??
Seriously??


Re: [PATCH 3/4] infiniband: hns: add Hisilicon RoCE support(driver code)

2016-03-14 Thread Parav Pandit
>>
>> Since SRQ is not supported in this driver version, can you keep
>> remaining code base also to not bother about SRQ specifically
>> poll_cq_one, modify_qp, destroy_qp etc?
>> SRQ support can come as complete additional patch along with cmd_mask,
>> callbacks and rest of the code.
>>
>> .
> Sorry, I see your review in time.
> Sure, SRQ is not supported in the current roce driver. I have verified the
> RDMA functionality, and it is not affected. For your question, we need to
> analyse it carefully; after that, I will reply to your concern, is that ok?

Yes. No problem.


Re: [PATCH 1/5] mlx4: add missing braces in verify_qp_parameters

2016-03-14 Thread Leon Romanovsky
On Mon, Mar 14, 2016 at 03:18:34PM +0100, Arnd Bergmann wrote:
> The implementation of QP paravirtualization back in linux-3.7 included
> some code that looks very dubious, and gcc-6 has grown smart enough
> to warn about it:
> 
> drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 
> 'verify_qp_parameters':
> drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3154:5: error: 
> statement is indented as if it were guarded by... 
> [-Werror=misleading-indentation]
>  if (optpar & MLX4_QP_OPTPAR_ALT_ADDR_PATH) {
>  ^~
> drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3144:4: note: ...this 
> 'if' clause, but it is not
> if (slave != mlx4_master_func_num(dev))
> 
> From looking at the context, I'm reasonably sure that the indentation
> is correct but that it should have contained curly braces from the
> start, as the update_gid() function in the same patch correctly does.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 54679e148287 ("mlx4: Implement QP paravirtualization and maintain 
> phys_pkey_cache for smp_snoop")

Thanks, looks good.
Reviewed-by: Leon Romanovsky 
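
For readers following along, here is a minimal sketch of why the braces matter. It is not the mlx4 code: the flag names and both functions are hypothetical stand-ins, and the unbraced variant is indented the way the compiler parses it rather than the way it was written in the driver.

```c
/* Hypothetical flag names for illustration; not the mlx4 definitions. */
#define EX_OPT_PRIMARY 0x1u
#define EX_OPT_ALT     0x2u

/*
 * Unbraced form, indented the way the compiler parses it: the outer
 * 'if' guards only the first inner 'if'.
 */
static int ex_verify_unbraced(int is_slave, unsigned int optpar)
{
	if (is_slave)
		if (optpar & EX_OPT_PRIMARY)
			return -1;
	if (optpar & EX_OPT_ALT)	/* reached even when is_slave == 0 */
		return -2;
	return 0;
}

/* Braced form: both checks apply only when is_slave is set. */
static int ex_verify_braced(int is_slave, unsigned int optpar)
{
	if (is_slave) {
		if (optpar & EX_OPT_PRIMARY)
			return -1;
		if (optpar & EX_OPT_ALT)
			return -2;
	}
	return 0;
}
```

With is_slave == 0 and only the ALT flag set, the unbraced variant still rejects the request while the braced one accepts it, which is the behavioural difference the patch is about.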

> ---
>  drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c 
> b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> index 25ce1b030a00..cd9b2b28df88 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
> @@ -3141,7 +3141,7 @@ static int verify_qp_parameters(struct mlx4_dev *dev,
>   case QP_TRANS_RTS2RTS:
>   case QP_TRANS_SQD2SQD:
>   case QP_TRANS_SQD2RTS:
> - if (slave != mlx4_master_func_num(dev))
> + if (slave != mlx4_master_func_num(dev)) {
>   if (optpar & MLX4_QP_OPTPAR_PRIMARY_ADDR_PATH) {
>   port = (qp_ctx->pri_path.sched_queue >> 
> 6 & 1) + 1;
>   if (dev->caps.port_mask[port] != 
> MLX4_PORT_TYPE_IB)
> @@ -3160,6 +3160,7 @@ static int verify_qp_parameters(struct mlx4_dev *dev,
>   if (qp_ctx->alt_path.mgid_index >= 
> num_gids)
>   return -EINVAL;
>   }
> + }
>   break;
>   default:
>   break;
> -- 
> 2.7.0
> 


Re: [v6, 5/5] mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0

2016-03-14 Thread Scott Wood
On 03/14/2016 02:29 AM, Yangbo Lu wrote:
>> -Original Message-
>> From: Arnd Bergmann [mailto:a...@arndb.de]
>> Sent: Monday, March 14, 2016 6:26 AM
>> To: linuxppc-...@lists.ozlabs.org
>> Cc: Yangbo Lu; devicet...@vger.kernel.org; linux-arm-
>> ker...@lists.infradead.org; linux-ker...@vger.kernel.org; linux-
>> c...@vger.kernel.org; linux-...@vger.kernel.org; iommu@lists.linux-
>> foundation.org; netdev@vger.kernel.org; linux-...@vger.kernel.org;
>> ulf.hans...@linaro.org; Zhao Qiang; Russell King; Bhupesh Sharma; Joerg
>> Roedel; Santosh Shilimkar; Scott Wood; Rob Herring; Claudiu Manoil; Kumar
>> Gala; Yang-Leo Li; Xiaobo Xie
>> Subject: Re: [v6, 5/5] mmc: sdhci-of-esdhc: fix host version for T4240-
>> R1.0-R2.0
>>
>> On Wednesday 09 March 2016 18:08:51 Yangbo Lu wrote:
>>> @@ -567,10 +580,20 @@ static void esdhc_init(struct platform_device
>> *pdev, struct sdhci_host *host)
>>> struct sdhci_pltfm_host *pltfm_host;
>>> struct sdhci_esdhc *esdhc;
>>> u16 host_ver;
>>> +   u32 svr;
>>>
>>> pltfm_host = sdhci_priv(host);
>>> esdhc = sdhci_pltfm_priv(pltfm_host);
>>>
>>> +   fsl_guts_init();
>>> +   svr = fsl_guts_get_svr();
>>> +   if (svr) {
>>> +   esdhc->soc_ver = SVR_SOC_VER(svr);
>>> +   esdhc->soc_rev = SVR_REV(svr);
>>> +   } else {
>>> +   dev_err(&pdev->dev, "Failed to get SVR value!\n");
>>> +   }
>>> +
>>
>> This makes the driver non-portable. Better identify the specific
>> workarounds based on the compatible string for this device, or add a
>> boolean DT property for the quirk.
>>
>>  Arnd
> 
> [Lu Yangbo-B47093] Hi Arnd, we did have a discussion about using DTS in v1 
> before.
> https://patchwork.kernel.org/patch/6834221/
> 
> We don’t have a separate DTS file for each revision of an SOC and if we did, 
> we'd constantly have people using the wrong one.
> In addition, the device tree is stable ABI and errata are often discovered 
> after device trees are deployed.
> See the link for details.
> 
> So we decided to read the SVR from the device-config/guts MMIO block rather 
> than using DTS.
> Thanks.

Also note that this driver is already only for fsl-specific hardware,
and it will still work even if fsl_guts doesn't find anything to bind to
-- it just wouldn't be able to detect errata based on SVR in that case.

-Scott
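
As a rough illustration of the approach discussed above (detect errata from the SVR at run time, and carry on without them if no SVR is available), here is a small sketch. Everything in it is an assumption made for the example: the EX_ macros, the bit layout, the SoC identifier and the quirk structure are hypothetical and are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout for this sketch only: the low 8 bits hold the revision
 * (major in the high nibble, minor in the low nibble) and the bits above
 * identify the SoC. Real code should use the platform headers.
 */
#define EX_SVR_REV(svr)     ((svr) & 0xFFu)
#define EX_SVR_SOC_VER(svr) ((svr) >> 8)

/* Hypothetical SoC identifier, made up for the example. */
#define EX_SOC_FOO 0x123456u

struct ex_quirks {
	bool fix_host_version;
};

/* svr == 0 models "no SVR available": no quirks are detected. */
static struct ex_quirks ex_detect_quirks(uint32_t svr)
{
	struct ex_quirks q = { .fix_host_version = false };

	if (svr == 0)
		return q;

	/* Apply the workaround to revisions 1.0 and 2.0 of the example SoC. */
	if (EX_SVR_SOC_VER(svr) == EX_SOC_FOO &&
	    (EX_SVR_REV(svr) == 0x10 || EX_SVR_REV(svr) == 0x20))
		q.fix_host_version = true;

	return q;
}
```

The point of the design is that a missing SVR only disables erratum detection; it does not prevent the driver from probing.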



[PATCH 5/5] ethtool: support per queue sub command --coalesce

2016-03-14 Thread kan . liang
From: Kan Liang 

This patch sets coalesce parameters per queue in much the same way as
do_scoalesce. It reads the current settings, changes them, and writes them
back to the kernel for each masked queue.

Example:

 $ sudo ./ethtool --set-perqueue-command eth5 queue_mask 0x1 --coalesce
 rx-usecs 10 tx-usecs 5
 $ sudo ./ethtool --set-perqueue-command eth5 queue_mask 0x1
 --show-coalesce

 Queue: 0
 Adaptive RX: on  TX: on
 stats-block-usecs: 0
 sample-interval: 0
 pkt-rate-low: 0
 pkt-rate-high: 0

 rx-usecs: 10
 rx-frames: 0
 rx-usecs-irq: 0
 rx-frames-irq: 256

 tx-usecs: 5
 tx-frames: 0
 tx-usecs-irq: 0
 tx-frames-irq: 256

 rx-usecs-low: 0
 rx-frame-low: 0
 tx-usecs-low: 0
 tx-frame-low: 0

 rx-usecs-high: 0
 rx-frame-high: 0
 tx-usecs-high: 0
 tx-frame-high: 0

Signed-off-by: Kan Liang 
---
 ethtool.8.in |  2 +-
 ethtool.c| 58 +-
 2 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/ethtool.8.in b/ethtool.8.in
index 210ec8c..0e42180 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -937,7 +937,7 @@ Sets the specific queues which the sub command is applied to.
 If queue_mask is not set, the sub command will be applied to all queues.
 .TP
 .B sub_command
-Sets the sub command. The supported sub commands include --show-coalesce.
+Sets the sub command. The supported sub commands include --show-coalesce and --coalesce.
 .RE
 .SH BUGS
 Not supported (in part or whole) on all network drivers.
diff --git a/ethtool.c b/ethtool.c
index a966bf8..55ba26c 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -4222,7 +4222,7 @@ static const struct option {
  " [ tx-lpi on|off ]\n"
  " [ tx-timer %d ]\n"},
{ "--set-perqueue-command", 1, do_perqueue, "Set per queue command. "
- "The supported sub commands include --show-coalesce",
+ "The supported sub commands include --show-coalesce, --coalesce",
  " [queue_mask %x] SUB_COMMAND\n"},
{ "-h|--help", 0, show_usage, "Show this help" },
{ "--version", 0, do_version, "Show version number" },
@@ -4348,6 +4348,52 @@ get_per_queue_coalesce(struct cmd_context *ctx,
return per_queue_opt;
 }
 
+static void __set_per_queue_coalesce(int queue)
+{
+   int changed = 0;
+
+   do_generic_set(cmdline_coalesce, ARRAY_SIZE(cmdline_coalesce),
+  &changed);
+
+   if (!changed)
+   fprintf(stderr, "Queue %d, no coalesce parameters changed\n", queue);
+}
+
+static void set_per_queue_coalesce(struct cmd_context *ctx,
+  struct ethtool_per_queue_op *per_queue_opt)
+{
+   __u32 *queue_mask = per_queue_opt->queue_mask;
+   char *addr = (char *)per_queue_opt + sizeof(*per_queue_opt);
+   int gcoalesce_changed = 0;
+   int i;
+
+   parse_generic_cmdline(ctx, &gcoalesce_changed,
+ cmdline_coalesce, ARRAY_SIZE(cmdline_coalesce));
+
+   for (i = 0; i < __KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32); i++) {
+   int queue = i * 32;
+   __u32 mask = queue_mask[i];
+
+   while (mask > 0) {
+   if (mask & 0x1) {
+   memcpy(&s_ecoal, addr, sizeof(struct ethtool_coalesce));
+   __set_per_queue_coalesce(queue);
+   memcpy(addr, &s_ecoal, sizeof(struct ethtool_coalesce));
+   addr += sizeof(struct ethtool_coalesce);
+   }
+   mask = mask >> 1;
+   queue++;
+   }
+   }
+
+   per_queue_opt->cmd = ETHTOOL_PERQUEUE;
+   per_queue_opt->sub_command = ETHTOOL_SCOALESCE;
+
+   if (send_ioctl(ctx, per_queue_opt))
+   perror("Cannot set device per queue parameters");
+
+}
+
 static int do_perqueue(struct cmd_context *ctx)
 {
struct ethtool_per_queue_op *per_queue_opt;
@@ -4397,6 +4443,16 @@ static int do_perqueue(struct cmd_context *ctx)
}
dump_per_queue_coalesce(per_queue_opt, queue_mask);
free(per_queue_opt);
+   } else if (strstr(args[i].opts, "--coalesce") != NULL) {
+   ctx->argc--;
+   ctx->argp++;
+   per_queue_opt = get_per_queue_coalesce(ctx, queue_mask, n_queues);
+   if (per_queue_opt == NULL) {
+   perror("Cannot get device per queue parameters");
+   return -EFAULT;
+   }
+   set_per_queue_coalesce(ctx, per_queue_opt);
+   free(per_queue_opt);
} else {
perror("The subcommand is not supported yet");
return -EOPNOTSUPP;
-- 
2.5.0
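
The queue walk in set_per_queue_coalesce() can be tried out in isolation. The sketch below is a standalone approximation, not ethtool code: ex_masked_queues() and its output-array interface are hypothetical, and it only collects the queue indices selected by an array of 32-bit mask words instead of copying coalesce settings.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk an array of 32-bit mask words and store the index of every set
 * bit in 'out' (at most out_cap entries). Returns the number stored.
 * Bit b of word i selects queue i * 32 + b.
 */
static size_t ex_masked_queues(const uint32_t *mask_words, size_t n_words,
			       unsigned int *out, size_t out_cap)
{
	size_t found = 0;

	for (size_t i = 0; i < n_words; i++) {
		unsigned int queue = (unsigned int)(i * 32);
		uint32_t mask = mask_words[i];

		/* Shift the word right until no selected queues remain. */
		while (mask > 0) {
			if ((mask & 0x1) && found < out_cap)
				out[found++] = queue;
			mask >>= 1;
			queue++;
		}
	}

	return found;
}
```

For example, the mask words { 0x5, 0x1 } select queues 0, 2 and 32. In the patch, each selected queue consumes one struct ethtool_coalesce from the buffer that follows the per-queue header, which is why addr advances only when a bit is set.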



[PATCH 2/5] ethtool: move cmdline_coalesce out of do_scoalesce

2016-03-14 Thread kan . liang
From: Kan Liang 

Move cmdline_coalesce out of do_scoalesce so that it can be shared with
other functions.
No behavior change.

Signed-off-by: Kan Liang 
---
 ethtool.c | 147 +++---
 1 file changed, 74 insertions(+), 73 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index bd0583c..86724a2 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1883,85 +1883,86 @@ static int do_gcoalesce(struct cmd_context *ctx)
return 0;
 }
 
+static struct ethtool_coalesce s_ecoal;
+static s32 coal_stats_wanted = -1;
+static int coal_adaptive_rx_wanted = -1;
+static int coal_adaptive_tx_wanted = -1;
+static s32 coal_sample_rate_wanted = -1;
+static s32 coal_pkt_rate_low_wanted = -1;
+static s32 coal_pkt_rate_high_wanted = -1;
+static s32 coal_rx_usec_wanted = -1;
+static s32 coal_rx_frames_wanted = -1;
+static s32 coal_rx_usec_irq_wanted = -1;
+static s32 coal_rx_frames_irq_wanted = -1;
+static s32 coal_tx_usec_wanted = -1;
+static s32 coal_tx_frames_wanted = -1;
+static s32 coal_tx_usec_irq_wanted = -1;
+static s32 coal_tx_frames_irq_wanted = -1;
+static s32 coal_rx_usec_low_wanted = -1;
+static s32 coal_rx_frames_low_wanted = -1;
+static s32 coal_tx_usec_low_wanted = -1;
+static s32 coal_tx_frames_low_wanted = -1;
+static s32 coal_rx_usec_high_wanted = -1;
+static s32 coal_rx_frames_high_wanted = -1;
+static s32 coal_tx_usec_high_wanted = -1;
+static s32 coal_tx_frames_high_wanted = -1;
+
+static struct cmdline_info cmdline_coalesce[] = {
+   { "adaptive-rx", CMDL_BOOL, &coal_adaptive_rx_wanted,
+ &s_ecoal.use_adaptive_rx_coalesce },
+   { "adaptive-tx", CMDL_BOOL, &coal_adaptive_tx_wanted,
+ &s_ecoal.use_adaptive_tx_coalesce },
+   { "sample-interval", CMDL_S32, &coal_sample_rate_wanted,
+ &s_ecoal.rate_sample_interval },
+   { "stats-block-usecs", CMDL_S32, &coal_stats_wanted,
+ &s_ecoal.stats_block_coalesce_usecs },
+   { "pkt-rate-low", CMDL_S32, &coal_pkt_rate_low_wanted,
+ &s_ecoal.pkt_rate_low },
+   { "pkt-rate-high", CMDL_S32, &coal_pkt_rate_high_wanted,
+ &s_ecoal.pkt_rate_high },
+   { "rx-usecs", CMDL_S32, &coal_rx_usec_wanted,
+ &s_ecoal.rx_coalesce_usecs },
+   { "rx-frames", CMDL_S32, &coal_rx_frames_wanted,
+ &s_ecoal.rx_max_coalesced_frames },
+   { "rx-usecs-irq", CMDL_S32, &coal_rx_usec_irq_wanted,
+ &s_ecoal.rx_coalesce_usecs_irq },
+   { "rx-frames-irq", CMDL_S32, &coal_rx_frames_irq_wanted,
+ &s_ecoal.rx_max_coalesced_frames_irq },
+   { "tx-usecs", CMDL_S32, &coal_tx_usec_wanted,
+ &s_ecoal.tx_coalesce_usecs },
+   { "tx-frames", CMDL_S32, &coal_tx_frames_wanted,
+ &s_ecoal.tx_max_coalesced_frames },
+   { "tx-usecs-irq", CMDL_S32, &coal_tx_usec_irq_wanted,
+ &s_ecoal.tx_coalesce_usecs_irq },
+   { "tx-frames-irq", CMDL_S32, &coal_tx_frames_irq_wanted,
+ &s_ecoal.tx_max_coalesced_frames_irq },
+   { "rx-usecs-low", CMDL_S32, &coal_rx_usec_low_wanted,
+ &s_ecoal.rx_coalesce_usecs_low },
+   { "rx-frames-low", CMDL_S32, &coal_rx_frames_low_wanted,
+ &s_ecoal.rx_max_coalesced_frames_low },
+   { "tx-usecs-low", CMDL_S32, &coal_tx_usec_low_wanted,
+ &s_ecoal.tx_coalesce_usecs_low },
+   { "tx-frames-low", CMDL_S32, &coal_tx_frames_low_wanted,
+ &s_ecoal.tx_max_coalesced_frames_low },
+   { "rx-usecs-high", CMDL_S32, &coal_rx_usec_high_wanted,
+ &s_ecoal.rx_coalesce_usecs_high },
+   { "rx-frames-high", CMDL_S32, &coal_rx_frames_high_wanted,
+ &s_ecoal.rx_max_coalesced_frames_high },
+   { "tx-usecs-high", CMDL_S32, &coal_tx_usec_high_wanted,
+ &s_ecoal.tx_coalesce_usecs_high },
+   { "tx-frames-high", CMDL_S32, &coal_tx_frames_high_wanted,
+ &s_ecoal.tx_max_coalesced_frames_high },
+};
 static int do_scoalesce(struct cmd_context *ctx)
 {
-   struct ethtool_coalesce ecoal;
int gcoalesce_changed = 0;
-   s32 coal_stats_wanted = -1;
-   int coal_adaptive_rx_wanted = -1;
-   int coal_adaptive_tx_wanted = -1;
-   s32 coal_sample_rate_wanted = -1;
-   s32 coal_pkt_rate_low_wanted = -1;
-   s32 coal_pkt_rate_high_wanted = -1;
-   s32 coal_rx_usec_wanted = -1;
-   s32 coal_rx_frames_wanted = -1;
-   s32 coal_rx_usec_irq_wanted = -1;
-   s32 coal_rx_frames_irq_wanted = -1;
-   s32 coal_tx_usec_wanted = -1;
-   s32 coal_tx_frames_wanted = -1;
-   s32 coal_tx_usec_irq_wanted = -1;
-   s32 coal_tx_frames_irq_wanted = -1;
-   s32 coal_rx_usec_low_wanted = -1;
-   s32 coal_rx_frames_low_wanted = -1;
-   s32 coal_tx_usec_low_wanted = -1;
-   s32 coal_tx_frames_low_wanted = -1;
-   s32 coal_rx_usec_high_wanted = -1;
-   s32 coal_rx_frames_high_wanted = -1;
-   s32 coal_tx_usec_high_wanted = -1;
-   s32 coal_tx_frames_high_wanted = -1;
-   struct cmdline_info cmdline_coalesce[] = {
-   { "adaptive-rx", CMDL_BOOL, &coal_adaptive_rx_wanted,
- &ecoal.use_adaptive_rx_coalesce },
-   { 

[PATCH 1/5] ethtool: move option parsing related codes into function

2016-03-14 Thread kan . liang
From: Kan Liang 

Move option parsing code into find_option function.
No behavior changes.

Signed-off-by: Kan Liang 
---
 ethtool.c | 49 +++--
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/ethtool.c b/ethtool.c
index 0cd0d4f..bd0583c 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -4223,6 +4223,29 @@ static int show_usage(struct cmd_context *ctx)
return 0;
 }
 
+static int find_option(int argc, char **argp)
+{
+   const char *opt;
+   size_t len;
+   int k;
+
+   for (k = 0; args[k].opts; k++) {
+   opt = args[k].opts;
+   for (;;) {
+   len = strcspn(opt, "|");
+   if (strncmp(*argp, opt, len) == 0 &&
+   (*argp)[len] == 0)
+   return k;
+
+   if (opt[len] == 0)
+   break;
+   opt += len + 1;
+   }
+   }
+
+   return -1;
+}
+
 int main(int argc, char **argp)
 {
int (*func)(struct cmd_context *);
@@ -4240,24 +4263,14 @@ int main(int argc, char **argp)
 */
if (argc == 0)
exit_bad_args();
-   for (k = 0; args[k].opts; k++) {
-   const char *opt;
-   size_t len;
-   opt = args[k].opts;
-   for (;;) {
-   len = strcspn(opt, "|");
-   if (strncmp(*argp, opt, len) == 0 &&
-   (*argp)[len] == 0) {
-   argp++;
-   argc--;
-   func = args[k].func;
-   want_device = args[k].want_device;
-   goto opt_found;
-   }
-   if (opt[len] == 0)
-   break;
-   opt += len + 1;
-   }
+
+   k = find_option(argc, argp);
+   if (k >= 0) {
+   argp++;
+   argc--;
+   func = args[k].func;
+   want_device = args[k].want_device;
+   goto opt_found;
}
if ((*argp)[0] == '-')
exit_bad_args();
-- 
2.5.0
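
The alias matching that find_option() performs can be exercised on its own. The following is a minimal sketch under assumptions: struct ex_option, ex_matches() and ex_find_option() are hypothetical names, and the table holds only the '|'-separated option strings rather than ethtool's full args[] entries.

```c
#include <string.h>

struct ex_option {
	const char *opts;	/* '|'-separated aliases, e.g. "-h|--help" */
};

/* Return 1 if arg is exactly one of the '|'-separated names in opts. */
static int ex_matches(const char *opts, const char *arg)
{
	for (;;) {
		size_t len = strcspn(opts, "|");

		/* Same prefix and arg ends right there: exact match. */
		if (strncmp(arg, opts, len) == 0 && arg[len] == '\0')
			return 1;
		if (opts[len] == '\0')
			return 0;
		opts += len + 1;	/* skip past the '|' */
	}
}

/* Return the index of the matching entry, or -1 if there is none. */
static int ex_find_option(const struct ex_option *table, const char *arg)
{
	for (int k = 0; table[k].opts; k++)
		if (ex_matches(table[k].opts, arg))
			return k;
	return -1;
}
```

Because the first table entry has index 0, a caller of a lookup shaped like this has to treat any result >= 0 as a hit and only -1 as "not found".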


