date:20150821

[PATCH, net-next]r8169:Disable interrupts.

2015-08-21 Thread Corcodel Marian

Disable interrupts when closing the interface.
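
As a rough illustration of what "mask and ack" means for a NIC, here is a toy model in plain C. It is not the r8169 code: the register layout, the write-1-to-clear behaviour and every toy_* name are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file standing in for the NIC's MMIO window. */
struct toy_nic {
	uint16_t intr_mask;   /* events that are allowed to raise the IRQ */
	uint16_t intr_status; /* pending events, modelled as write-1-to-clear */
};

/* Acknowledge events: every 1 bit written clears that pending event. */
static void toy_ack_events(struct toy_nic *nic, uint16_t bits)
{
	nic->intr_status &= (uint16_t)~bits;
}

/* Mask first so nothing new can fire, then acknowledge what is pending. */
static void toy_irq_mask_and_ack(struct toy_nic *nic)
{
	assert(nic != NULL);
	nic->intr_mask = 0;
	toy_ack_events(nic, 0xffff);
}

/* Nonzero if the modelled device would currently assert its IRQ line. */
static int toy_irq_asserted(const struct toy_nic *nic)
{
	return (nic->intr_status & nic->intr_mask) != 0;
}
```

In this model, masking before acknowledging keeps a freshly latched event from raising the line between the two steps.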


Signed-off-by: Corcodel Marian corcodel.mar...@gmail.com

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index 6cd7226..ea461fe 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7548,6 +7548,7 @@ static int rtl8169_close(struct net_device *dev)
 
/* Update counters before going down */
rtl8169_update_counters(dev);
+   rtl8169_irq_mask_and_ack(tp);
 
rtl_lock_work(tp);
clear_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags);
-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 net-next] netfilter: ipset: Fixing unnamed union init

2015-08-21 Thread Elad Raz

Following up on the post proposed by Vinson Lee [1], this patch fixes
compilation issues found with gcc 4.4.7. Initializing the .cidr field of an
unnamed union causes a compilation error in gcc 4.4.x.

[1] https://lkml.org/lkml/2015/7/5/74
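
The workaround can be sketched outside the kernel as follows. The struct is a simplified, hypothetical stand-in for the ipset element types (toy_elem and toy_elem_init are not kernel names). Instead of a designated initializer that names a member of an unnamed union, the element is zero-initialized and the members are assigned in a helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOST_MASK 32

/* Simplified element with two unnamed unions (illustrative layout only). */
struct toy_elem {
	union {
		uint32_t ip[2];
		uint64_t ipcmp;
	};
	uint8_t nomatch;
	union {
		uint8_t cidr[2];
		uint16_t ccmp;
	};
};

/*
 * Assign the union members after the fact. This avoids an initializer of
 * the form { .cidr = { HOST_MASK, HOST_MASK } }, which is the construct
 * reported to fail with gcc 4.4.x.
 */
static void toy_elem_init(struct toy_elem *e)
{
	assert(e != NULL);
	e->cidr[0] = HOST_MASK;
	e->cidr[1] = HOST_MASK;
}
```

A caller would declare the element zero-initialized and then call toy_elem_init(&e) before first use, which is the same shape as the hunks in the patch.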

Signed-off-by: Elad Raz el...@mellanox.com
---
 net/netfilter/ipset/ip_set_hash_netnet.c | 20 ++--
 net/netfilter/ipset/ip_set_hash_netportnet.c | 20 ++--
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c 
b/net/netfilter/ipset/ip_set_hash_netnet.c
index 3c862c0..a93dfeb 100644
--- a/net/netfilter/ipset/ip_set_hash_netnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netnet.c
@@ -131,6 +131,13 @@ hash_netnet4_data_next(struct hash_netnet4_elem *next,
 #define HOST_MASK  32
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netnet4_init(struct hash_netnet4_elem *e)
+{
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
  const struct xt_action_param *par,
@@ -160,7 +167,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 {
const struct hash_netnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
-   struct hash_netnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+   struct hash_netnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 ip = 0, ip_to = 0, last;
u32 ip2 = 0, ip2_from = 0, ip2_to = 0, last2;
@@ -169,6 +176,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
if (tb[IPSET_ATTR_LINENO])
*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+   hash_netnet4_init(&e);
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS)))
return -IPSET_ERR_PROTOCOL;
@@ -357,6 +365,13 @@ hash_netnet6_data_next(struct hash_netnet4_elem *next,
 #define IP_SET_EMIT_CREATE
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netnet6_init(struct hash_netnet6_elem *e)
+{
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netnet6_kadt(struct ip_set *set, const struct sk_buff *skb,
  const struct xt_action_param *par,
@@ -385,13 +400,14 @@ hash_netnet6_uadt(struct ip_set *set, struct nlattr *tb[],
  enum ipset_adt adt, u32 *lineno, u32 flags, bool retried)
 {
ipset_adtfn adtfn = set->variant->adt[adt];
-   struct hash_netnet6_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+   struct hash_netnet6_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
int ret;
 
if (tb[IPSET_ATTR_LINENO])
*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+   hash_netnet6_init(&e);
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS)))
return -IPSET_ERR_PROTOCOL;
diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c 
b/net/netfilter/ipset/ip_set_hash_netportnet.c
index 0c68734..9a14c23 100644
--- a/net/netfilter/ipset/ip_set_hash_netportnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netportnet.c
@@ -142,6 +142,13 @@ hash_netportnet4_data_next(struct hash_netportnet4_elem 
*next,
 #define HOST_MASK  32
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netportnet4_init(struct hash_netportnet4_elem *e)
+{
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netportnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
  const struct xt_action_param *par,
@@ -175,7 +182,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr 
*tb[],
 {
const struct hash_netportnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
-   struct hash_netportnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+   struct hash_netportnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 ip = 0, ip_to = 0, ip_last, p = 0, port, port_to;
u32 ip2_from = 0, ip2_to = 0, ip2_last, ip2;
@@ -185,6 +192,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr 
*tb[],
if (tb[IPSET_ATTR_LINENO])
*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+   hash_netportnet4_init(&e);
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 !ip_set_attr_netorder(tb, IPSET_ATTR_PORT) ||
 !ip_set_optattr_netorder(tb, IPSET_ATTR_PORT_TO) ||
@@ -412,6 +420,13 @@ hash_netportnet6_data_next(struct hash_netportnet4_elem 
*next,
 #define IP_SET_EMIT_CREATE
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netportnet6_init(struct hash_netportnet6_elem *e)
+{
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netportnet6_kadt(struct ip_set *set, const struct sk_buff

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Corinna Vinschen

Sorry, I forgot to mention that I tested this patch on three different
chip versions, RTL_GIGA_MAC_VER_23, RTL_GIGA_MAC_VER_33 and
RTL_GIGA_MAC_VER_35.  I couldn't test on pre-RTL_GIGA_MAC_VER_19, but
the offset handling without counter reset already worked as expected on
later chip versions, so I'm pretty confident that older chip versions
should work accordingly.
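
The offset handling mentioned here can be pictured with a small userspace sketch. It assumes a simplified counter set in host byte order; the toy_* types and functions are hypothetical and only mirror the idea of recording the tally counters at the first open and subtracting that baseline later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the hardware tally counters (host byte order). */
struct toy_tally {
	uint64_t tx_errors;
	uint32_t rx_multicast;
};

/* Baseline captured once; plays the role of the stored offsets. */
struct toy_offsets {
	bool inited;
	uint64_t tx_errors;
	uint32_t rx_multicast;
};

/* Record the hardware values at the first open only; later opens are no-ops. */
static void toy_init_offsets(struct toy_offsets *off, const struct toy_tally *hw)
{
	assert(off != NULL && hw != NULL);
	if (off->inited)
		return;
	off->tx_errors = hw->tx_errors;
	off->rx_multicast = hw->rx_multicast;
	off->inited = true;
}

/* Report counters relative to the baseline; unsigned math survives wraparound. */
static uint64_t toy_tx_errors(const struct toy_offsets *off, const struct toy_tally *hw)
{
	return hw->tx_errors - off->tx_errors;
}

static uint32_t toy_rx_multicast(const struct toy_offsets *off, const struct toy_tally *hw)
{
	return hw->rx_multicast - off->rx_multicast;
}
```

Because the subtraction is done in unsigned arithmetic of the counter's own width, a hardware counter that wraps past its maximum still yields the right delta in this model.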

On Aug 21 12:09, Corinna Vinschen wrote:
 The r8169 driver collects statistical information returned by
 @get_stats64 by counting them in the driver itself, even though many
 (but not all) of the values are already collected by tally counters
 (TCs) in the NIC.  Some of these TC values are not returned by
 @get_stats64.  Especially the received multicast packages are missing
 from /proc/net/dev.
 
 Rectify this by fetching the TCs and returning them from
 rtl8169_get_stats64.
 
 The counters collected in the driver obviously disappear as soon as the
 driver is unloaded so after a driver is loaded the counters always start
 at 0. The TCs on the other hand are only reset by a power cycle.  Without
 further considerations the values collected by the driver would not match
 up against the TC values.
 
 This patch introduces a new function rtl8169_reset_counters which
 resets the TCs.
 
 Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
 to reset the TCs programmatically.  Therefore introduce an addition to
 the rtl8169_private struct and a function rtl8169_init_counter_offsets
 to store the TCs at first rtl_open.  Use these values as offsets in
 rtl8169_get_stats64.
 
 Signed-off-by: Corinna Vinschen vinsc...@redhat.com
 ---
  drivers/net/ethernet/realtek/r8169.c | 107 
 +++
  1 file changed, 107 insertions(+)
 
 diff --git a/drivers/net/ethernet/realtek/r8169.c 
 b/drivers/net/ethernet/realtek/r8169.c
 index f790f61..f26a48d 100644
 --- a/drivers/net/ethernet/realtek/r8169.c
 +++ b/drivers/net/ethernet/realtek/r8169.c
 @@ -637,6 +637,9 @@ enum rtl_register_content {
   /* _TBICSRBit */
   TBILinkOK   = 0x0200,
  
 + /* ResetCounterCommand */
 + CounterReset= 0x1,
 +
   /* DumpCounterCommand */
   CounterDump = 0x8,
  
 @@ -747,6 +750,14 @@ struct rtl8169_counters {
   __le16  tx_underun;
  };
  
 +struct rtl8169_tc_offsets {
 + boolinited;
 + __le64  tx_errors;
 + __le32  tx_multi_collision;
 + __le32  rx_multicast;
 + __le16  tx_aborted;
 +};
 +
  enum rtl_flag {
   RTL_FLAG_TASK_ENABLED,
   RTL_FLAG_TASK_SLOW_PENDING,
 @@ -824,6 +835,7 @@ struct rtl8169_private {
  
   struct mii_if_info mii;
   struct rtl8169_counters counters;
 + struct rtl8169_tc_offsets tc_offset;
   u32 saved_wolopts;
   u32 opts1_mask;
  
 @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device 
 *dev, int sset)
   }
  }
  
 +DECLARE_RTL_COND(rtl_reset_counters_cond)
 +{
 + void __iomem *ioaddr = tp->mmio_addr;
 +
 + return RTL_R32(CounterAddrLow) & CounterReset;
 +}
 +
 +static void rtl8169_reset_counters(struct net_device *dev)
 +{
 + struct rtl8169_private *tp = netdev_priv(dev);
 + void __iomem *ioaddr = tp->mmio_addr;
 + struct device *d = &tp->pci_dev->dev;
 + struct rtl8169_counters *counters;
 + dma_addr_t paddr;
 + u32 cmd;
 +
 + /*
 +  * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
 +  * tally counters.
 +  */
 + if (tp->mac_version < RTL_GIGA_MAC_VER_19)
 + return;
 +
 + counters = dma_alloc_coherent(d, sizeof(*counters), &paddr, GFP_KERNEL);
 + if (!counters)
 + return;
 +
 + RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
 + cmd = (u64)paddr & DMA_BIT_MASK(32);
 + RTL_W32(CounterAddrLow, cmd);
 + RTL_W32(CounterAddrLow, cmd | CounterReset);
 +
 + if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond, 10, 1000))
 + netif_warn(tp, hw, dev, "counter reset failed\n");
 +
 + RTL_W32(CounterAddrLow, 0);
 + RTL_W32(CounterAddrHigh, 0);
 +
 + dma_free_coherent(d, sizeof(*counters), counters, paddr);
 +}
 +
  DECLARE_RTL_COND(rtl_counters_cond)
  {
   void __iomem *ioaddr = tp->mmio_addr;
 @@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device 
 *dev)
   dma_free_coherent(d, sizeof(*counters), counters, paddr);
  }
  
 +static void rtl8169_init_counter_offsets(struct net_device *dev)
 +{
 + struct rtl8169_private *tp = netdev_priv(dev);
 +
 + /*
 +  * rtl8169_init_counter_offsets is called from rtl_open.  On chip
 +  * versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
 +  * reset by a power cycle, while the counter values collected by the
 +  * driver are reset at every driver unload/load cycle.
 +  *
 +  * To make sure the HW values returned by @get_stats64 match the SW
 +  * values, we collect the initial values at first open(*) and use them
 +  * as offsets to normalize the values

Re: e1000e: possible reggresion?

2015-08-21 Thread Tomas Papan

Hi Eric,

It was probably it. Uptime 5+ hours and no problem.

Thanks for the hint, I was compiling linus tree yesterday around 12:00 UTC,
your change was added later.

Regards
Tomas

On Thu, Aug 20, 2015 at 10:29 PM, Eric Dumazet eric.duma...@gmail.com wrote:
 On Thu, 2015-08-20 at 21:37 +0200, Tomas Papan wrote:
 Hi there,

 I’m observing a freeze with the recent kernel (4.2-rc7). Unfortunately I 
 can’t preserve the full traces.
 There is nothing in the messages after reboot, I was just lucky one time to 
 see it when tail -f /var/log/messages was running.
 This is the only line which I was able to get:
   eth1 (e1000e): transmit queue 0 timed out

 I’ve got this message in the past, but the ethtool -K eth1 tso off solved 
 that. I’m always running
 this command at the boot time since then. There is no issue with 4.2-rc4. It 
 is hard to bisect,
 because this machine is used as headless server and it happens randomly 
 (usually within 2 hours).

 Do you have any idea how to trace it or what can I do?
 Please keep me on CC since I’m not subscribed on this list

 Regards
 Tomas

 I would pull latest tree from Linus and pray the bug was fixed.

 My feeling is that you hit the issue fixed with

 commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af
 Author: Eric Dumazet eduma...@google.com
 Date:   Thu Aug 13 15:44:51 2015 -0700

 inet: fix potential deadlock in reqsk_queue_unlink()

 When replacing del_timer() with del_timer_sync(), I introduced
 a deadlock condition :

 reqsk_queue_unlink() is called from inet_csk_reqsk_queue_drop()

 inet_csk_reqsk_queue_drop() can be called from many contexts,
 one being the timer handler itself (reqsk_timer_handler()).

 In this case, del_timer_sync() loops forever.

 Simple fix is to test if timer is pending.

 Fixes: 2235f2ac75fd (inet: fix races with reqsk timers)
 Signed-off-by: Eric Dumazet eduma...@google.com
 Signed-off-by: David S. Miller da...@davemloft.net
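
The deadlock and its fix, as described in the quoted commit message, can be modelled with a few lines of userspace C. This is a toy state machine, not kernel code: toy_timer, toy_del_timer_sync and toy_unlink are hypothetical names, and the "would deadlock" result stands in for a loop that never returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical timer state: armed or not, and whether its handler is running. */
struct toy_timer {
	bool pending;
	bool handler_running;
};

enum toy_result { TOY_NOT_PENDING, TOY_DELETED, TOY_WOULD_DEADLOCK };

/*
 * Model of a synchronous delete: it waits for a running handler to finish.
 * If the caller is that handler, the wait can never end.
 */
static enum toy_result toy_del_timer_sync(struct toy_timer *t, bool called_from_handler)
{
	assert(t != NULL);
	if (t->handler_running && called_from_handler)
		return TOY_WOULD_DEADLOCK;
	if (!t->pending)
		return TOY_NOT_PENDING;
	t->pending = false;
	return TOY_DELETED;
}

/* The fix in spirit: only attempt the synchronous delete if the timer is armed. */
static enum toy_result toy_unlink(struct toy_timer *t, bool called_from_handler)
{
	assert(t != NULL);
	if (!t->pending)
		return TOY_NOT_PENDING;
	return toy_del_timer_sync(t, called_from_handler);
}
```

The model relies on the assumption that a timer whose handler is currently running, and which has not been re-armed, is no longer pending, so the pending test keeps the handler from waiting on itself.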



[PATCH V4 net-next 0/2] net: implement SMC-R solution

2015-08-21 Thread Ursula Braun

From: Ursula Braun ursula.br...@de.ibm.com

Dave,

this is V4 of my SMC-R patches:
Since you are asking for a solution 100% in our own separate module
with our own can of worms, we have to give up the transparent detection
whether a communication peer can do SMC-R or not (this has been the
purpose of the rejected TCP hooks). Instead, we want just the new
self-contained SMC-R socket family added to the kernel.
By the way, since August 2015 the SMC-R Informational RFC is no longer
a draft, but published as RFC7609.

V4 changes:
1. Remove tcp patches supporting TCP experimental options
2. Remove references to tcp_sock syn_smc flag in smc-code, since TCP
   experimental options are not supported by the Linux-tcp.
3. clc_wait_msg() simplified

V3 changes:
1. Avoid adding of new space for smc-related bits in the tcp structures.
2. Make the smc feature to be nearly zero cost using Static Keys / jump
   labels
3. Increase / decrease smc static key in the smc-code
4. Make sure the next-to-last patch does not break the build
5. Additional pnet table checking

V2 changes:
1. activate tcp changes for CONFIG_AFSMC only (as suggested by Eric Dumazet)
2. add additional hook in net/core/sock.c
3. fix bitfield endianness problem

Thanks,
Ursula

In 2013, IBM introduced an optimized communications solution for the
IBM zEnterprise EC12 and BC12 (s390 in Linux terminology) that is
comprised of the IBM 10GbE RoCE Express feature with Shared Memory
Communications-RDMA (SMC-R) protocol [1].
SMC-R is designed for the enterprise data center environment and is an open
protocol as specified in the informational RFC7609 [2]. It was
published in August 2015. Another implementation of this protocol has been
available since 2013 with IBM z/OS Version 2 Release 1.

SMC-R provides a “sockets over RDMA” solution that leverages industry
standard RDMA over Converged Ethernet (RoCE) technology.

IBM has developed a Linux implementation of the SMC-R standard. A new
socket protocol family AF_SMC is introduced. A preload library can be used
to enable TCP-based applications to use SMC-R without changes. 

Key aspects of SMC-R are: 
1. Provides optimized performance compared to standard TCP/IP over Ethernet
   within the data center for both request/response (latency) and streaming
   workloads (CPU savings) [3]. 
   Initial benchmarks on Linux on x86 processors have shown latency
   reduction of up to 52% with a throughput gain of 111% using SMC-R vs TCP
   for request/response message patterns (10 concurrent TCP connections
   with 16KB messages) and CPU savings of up to 69% for streaming data
   patterns (single TCP connection with 20MB of data in one direction).
   [1] is currently updated to contain more detailed information on Linux
   and performance.
2. In order to preserve the traditional network administrative model the
   SMC-R protocol ties into the existing IP addresses and uses TCP's
   handshake to establish connections. This allows existing management
   tools and security infrastructure to control the creation of SMC
   connections.
3. The SMC-R protocol logically bonds multiple RoCE adapters together
   providing redundancy with transparent fail-over for improved high
   availability, increased bandwidth and load balancing across multiple
   RDMA-capable devices.
Without the rejected TCP Experimental Options the following aspects are
restricted; alternate solutions are in discussion. 
4. Due to its handshake protocol, SMC-R is compatible with (transparent to)
   existing TCP connection load balancers that are commonly used in the
   enterprise data center environment for multi-tier application workloads.
5. SMC-R's handshake protocol allows for transparent fallback to TCP/IP,
   should one of the peers not be capable of the protocol.

Additional SMC-R overview and reference materials are available [1].  

The SMC-R “rendezvous” protocol eliminates the need for RDMA-CM and the
exchange occurs through an initial TCP connection. Building on a TCP
connection to establish an SMC-R connection solves many key requirements.
The rendezvous process occurs now in 1 phase only: 
1. TCP/IP 3-way exchange with TCP experimental options is skipped.
2. SMC-R 3-way exchange:
   It is assumed both partners indicate SMC-R capability. Then at the
   completion of the 3-way TCP handshake the SMC-R layers in each peer take
   control of the TCP connection and exchange their RDMA credentials. If
   this 3-way exchange completes successfully the connection continues using
   SMC-R. If the exchange is not successful the connection falls back to
   standard TCP/IP. 
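
The fallback rule in the rendezvous description can be written down as a tiny decision function. This is only a sketch: it models the 3-way exchange as three steps that each succeed or fail, and the toy_* names and the step count constant are assumptions for illustration, not part of the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_transport { TOY_TCP, TOY_SMCR };

/* The exchange is modelled as three steps over the established TCP connection. */
enum { TOY_CLC_STEPS = 3 };

/*
 * Decide which transport carries the data: SMC-R only if every step of the
 * exchange completed, otherwise the connection stays on plain TCP/IP.
 */
static enum toy_transport toy_rendezvous(const bool step_ok[TOY_CLC_STEPS])
{
	assert(step_ok != NULL);
	for (size_t i = 0; i < TOY_CLC_STEPS; i++) {
		if (!step_ok[i])
			return TOY_TCP;
	}
	return TOY_SMCR;
}
```

The point of the model is that a failed exchange is not an error for the application; the already established TCP connection simply keeps being used.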

References:
[1] SMC-R Overview and Reference Materials:
http://www-01.ibm.com/software/network/commserver/SMCR/ 
[2] SMC-R Informational RFC:
https://tools.ietf.org/rfc/rfc7609
[3] Linux SMC-R Overview and Performance Summary
(archs x86 and s390):
http://www-01.ibm.com/software/network/commserver/SMCR/ 

The patch series is prepared to apply to net-next and

[PATCH V4 net-next 1/2] net: introduce socket family constants

2015-08-21 Thread Ursula Braun

From: Ursula Braun ursula.br...@de.ibm.com

The new socket family is assigned the next available address / protocol
family constant 41.
Implementing SO_KEEPALIVE for SMC-R requires an extra hook in net/ipv4/tcp_timer.c.

Signed-off-by: Ursula Braun ursula.br...@de.ibm.com
---
 include/linux/socket.h |  4 +++-
 include/net/smc.h  | 13 +
 net/ipv4/tcp_timer.c   |  2 +-
 3 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 include/net/smc.h

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 5bf59c8..1adcbcc 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -200,7 +200,8 @@ struct ucred {
 #define AF_ALG 38  /* Algorithm sockets*/
 #define AF_NFC 39  /* NFC sockets  */
 #define AF_VSOCK   40  /* vSockets */
-#define AF_MAX 41  /* For now.. */
+#define AF_SMC 41  /* smc sockets  */
+#define AF_MAX 42  /* For now.. */
 
 /* Protocol families, same as address families. */
 #define PF_UNSPEC  AF_UNSPEC
@@ -246,6 +247,7 @@ struct ucred {
 #define PF_ALG AF_ALG
 #define PF_NFC AF_NFC
 #define PF_VSOCK   AF_VSOCK
+#define PF_SMC AF_SMC
 #define PF_MAX AF_MAX
 
 /* Maximum queue length specifiable by listen.  */
diff --git a/include/net/smc.h b/include/net/smc.h
new file mode 100644
index 000..cd513ee
--- /dev/null
+++ b/include/net/smc.h
@@ -0,0 +1,13 @@
+/*
+ * SMC Definitions for the SMC protocol.
+ *
+ * Author: Ursula Braun ursula.br...@de.ibm.com
+ */
+#ifndef _SMC_H
+#define _SMC_H
+
+/* SMC socket options - disjunct with TCP socket options */
+#define SMC_KEEPALIVE  99  /* start/stop keepalives */
+
+#endif /* _SMC_H */
+
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 7149ebc..070bfc7 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -557,7 +557,7 @@ void tcp_set_keepalive(struct sock *sk, int val)
else if (!val)
inet_csk_delete_keepalive_timer(sk);
 }
-
+EXPORT_SYMBOL(tcp_set_keepalive);
 
 static void tcp_keepalive_timer (unsigned long data)
 {
-- 
2.3.8


[PATCH net-next] route: fix breakage after moving lwtunnel state

2015-08-21 Thread Jiri Benc

__refcnt and related fields need to be in their own cacheline for performance
reasons. Commit 61adedf3e3f1 (route: move lwtunnel state to dst_entry)
broke that on 32bit archs, causing BUILD_BUG_ON in dst_hold to be triggered.

This patch fixes the breakage by moving the lwtunnel state to the end of
dst_entry on 32bit archs. Unfortunately, this makes it share the cacheline
with __refcnt and may affect performance, thus further patches may be
needed.
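
The kind of compile-time check that tripped here can be reproduced in a standalone sketch. The struct below is a hypothetical stand-in, not struct dst_entry. It sizes the padding from sizeof(void *), so the reference counter lands on a 64-byte boundary on both 32-bit and 64-bit builds, and a static assertion plays the role of the BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by the fields in front of the extra pointer (made up). */
#define TOY_HEAD_BYTES 48

struct toy_dst {
	unsigned char read_mostly[TOY_HEAD_BYTES]; /* stands in for earlier fields */
	void *lwtstate;                            /* pointer whose size differs per arch */
	/* Pad up to 64 bytes whatever the pointer size is. */
	unsigned char pad[64 - TOY_HEAD_BYTES - sizeof(void *)];
	int refcnt;                                /* must start a new 64-byte line */
	int use;
	unsigned long lastuse;
};

/* Compile-time guard in the spirit of the failing BUILD_BUG_ON. */
_Static_assert((offsetof(struct toy_dst, refcnt) & 63) == 0,
	       "refcnt must start on a 64-byte boundary");

static size_t toy_refcnt_offset(void)
{
	return offsetof(struct toy_dst, refcnt);
}

/* Runtime variant of the same check, handy in a unit test. */
static void toy_check_layout(void)
{
	assert((toy_refcnt_offset() & 63) == 0);
}
```

Adding a pointer-sized member without adjusting the padding would shift refcnt by 4 or 8 bytes depending on the architecture, which is the sort of change the static assertion catches at build time.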

Reported-by: kbuild test robot fengguang...@intel.com
Fixes: 61adedf3e3f1 (route: move lwtunnel state to dst_entry)
Signed-off-by: Jiri Benc jb...@redhat.com
---
I'm working on this, I'm going to grab performance numbers with this patch
applied and work on follow up patches as necessary. Until then, this patch
at least fixes the 32bit build.

I'm very sorry for the breakage. I tried to build the patchset with various
configs (IPv6 off, lwtunnel off, etc.) but obviously did not test on 32bit.
I have no excuse for this, I should have tested it, the #ifdef was very
obvious.
---
 include/net/dst.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 0a9a723f6c19..ef8f1d43a203 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -44,7 +44,6 @@ struct dst_entry {
 #else
void*__pad1;
 #endif
-   struct lwtunnel_state   *lwtstate;
int (*input)(struct sk_buff *);
int (*output)(struct sock *sk, struct sk_buff *skb);
 
@@ -85,11 +84,12 @@ struct dst_entry {
__u32   __pad2;
 #endif
 
+#ifdef CONFIG_64BIT
+   struct lwtunnel_state   *lwtstate;
/*
 * Align __refcnt to a 64 bytes alignment
 * (L1_CACHE_SIZE would be too much)
 */
-#ifdef CONFIG_64BIT
long__pad_to_align_refcnt[1];
 #endif
/*
@@ -99,6 +99,9 @@ struct dst_entry {
atomic_t__refcnt;   /* client references*/
int __use;
unsigned long   lastuse;
+#ifndef CONFIG_64BIT
+   struct lwtunnel_state   *lwtstate;
+#endif
union {
struct dst_entry*next;
struct rtable __rcu *rt_next;
-- 
1.8.3.1


Re: DEBUG_LOCKS_WARN_ON(in_interrupt()) triggering in socket code

2015-08-21 Thread Michal Kubecek

On Fri, Aug 21, 2015 at 03:42:33PM +0200, Jason A. Donenfeld wrote:
 Ahhh, interesting, so it turns out you can't do a number of things
 with a read_lock_bh held, because it increases the softirq count.
 Mystery solved.

You must not do anything that can sleep (like taking a mutex) while
holding a rwlock (even for reading) as someone else could call
write_lock() on the same rwlock on the same CPU in the meantime and
would end up spinning indefinitely while waiting for you to release it.

Michal Kubecek
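
A toy model may help picture the rule. It assumes a per-CPU counter that is raised while a bottom-half-disabling read lock is held and treats any sleeping operation as forbidden while that counter is nonzero. All toy_* names are hypothetical; nothing here is kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state: nesting depth of bottom-half-disabling locks. */
struct toy_cpu {
	unsigned int softirq_count;
};

static void toy_read_lock_bh(struct toy_cpu *cpu)
{
	cpu->softirq_count++;
}

static void toy_read_unlock_bh(struct toy_cpu *cpu)
{
	assert(cpu->softirq_count > 0);
	cpu->softirq_count--;
}

/* Analogue of an in_interrupt()-style test on the counter. */
static bool toy_in_atomic(const struct toy_cpu *cpu)
{
	return cpu->softirq_count != 0;
}

/* Stand-in for work that may sleep; false means lock debugging would warn. */
static bool toy_sleeping_work(const struct toy_cpu *cpu)
{
	return !toy_in_atomic(cpu);
}

/* Safe pattern: copy under the lock, drop the lock, then do the sleeping work. */
static bool toy_lookup_then_sleep(struct toy_cpu *cpu, const int *shared, int *out)
{
	int snapshot;

	toy_read_lock_bh(cpu);
	snapshot = *shared;
	toy_read_unlock_bh(cpu);

	if (!toy_sleeping_work(cpu))
		return false;
	*out = snapshot;
	return true;
}
```

The design choice is to keep the critical section to the copy alone, so nothing that can block ever runs with the lock held.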

Re: DEBUG_LOCKS_WARN_ON(in_interrupt()) triggering in socket code

2015-08-21 Thread Jason A. Donenfeld

Ahhh, interesting, so it turns out you can't do a number of things
with a read_lock_bh held, because it increases the softirq count.
Mystery solved.

Re: e1000e: possible reggresion?

2015-08-21 Thread Eric Dumazet

On Fri, 2015-08-21 at 12:48 +0200, Tomas Papan wrote:
 Hi Eric,
 
 It was probably it. Uptime 5+ hours and no problem.
 
 Thanks for the hint, I was compiling linus tree yesterday around 12:00 UTC,
 your change was added later.

Sure, let me know if you have any problems.

A timer fix was also queued, not yet in Linus tree.

https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=d0023a1448abdcc892b8bca631e74bb1888efd02



Re: [PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support

2015-08-21 Thread Arnd Bergmann

On Monday 17 August 2015 17:17:50 Kenneth Lee wrote:
 Thanks, Arnd, 
 
 You are right. This is the same IP as hip04_mdio.c. We just misunderstood the
 hardware design. We will merge them and re-submit the patches.

Ok, great!

Arnd

Re: [net-next:master 1179/1189] include/linux/compiler.h:447:38: error: call to '__compiletime_assert_243' declared with attribute error: BUILD_BUG_ON failed: offsetof(struct dst_entry, __refcnt) & 63

2015-08-21 Thread Jiri Benc

On Thu, 20 Aug 2015 23:26:50 -0700 (PDT), David Miller wrote:
 Yeah, I should have predicted this would happen on 32-bit builds when I saw
 the adjustment of __pad_to_align_refcnt[] for 64-bit.
 
 Jiri, you might not have any reasonable options to fix this I'm afraid.

Still working on this, the patch I sent should at least relieve the
pressure (but of course, I'll understand if you revert the whole set).

I'm currently fighting with vxlan triggering null pointer dereference
in include/net/netns/generic.h:41, seems that net->gen is NULL. This is
with commit 938049e18dca, i.e. before my lwtunnel ipv6 patchset.

Pasting the trace below in case anyone has an idea. CONFIG_NET_NS is
enabled. When adding debug printk to vxlan_init_net (before the call to
net_generic), the issue disappears. Smells like a race.

I'm not sure how much time I will have during the weekend.

 Jiri

[   26.102174] BUG: unable to handle kernel NULL pointer dereference at 0010
[   26.109299] IP: [f8501154] vxlan_init_net+0x14/0x50 [vxlan]
[   26.115032] *pdpt = 33b48001 *pde =  
[   26.120770] Oops:  [#1] SMP 
[   26.124000] Modules linked in: vxlan(+) tg3(+) ip6_udp_tunnel snd_pcm 
udp_tunnel snd_timer hp_wmi sparse_keymap snd ptp coretemp rfkill pps_core 
gpio_ich iTCO_wdt mdio dca iTCO_vendor_support ppdev kvm_intel kvm soundcore 
lpc_ich mfd_core pcspkr crc32_pclmul floppy parport_pc i7core_edac parport 
edac_core acpi_cpufreq xfs libcrc32c nouveau video mxm_wmi i2c_algo_bit 
drm_kms_helper ttm drm mptsas scsi_transport_sas firewire_ohci mptscsih 
crc32c_intel serio_raw firewire_core mptbase crc_itu_t wmi
[   26.168070] CPU: 0 PID: 370 Comm: systemd-udevd Not tainted 4.2.0-rc6+ #1
[   26.174829] Hardware name: Hewlett-Packard HP Z800 Workstation/0AECh, BIOS 
786G5 v03.54 11/02/2011
[   26.183750] task: f4ec45c0 ti: f4a82000 task.ti: f4a82000
[   26.189125] EIP: 0060:[f8501154] EFLAGS: 00010282 CPU: 0
[   26.194588] EIP is at vxlan_init_net+0x14/0x50 [vxlan]
[   26.199703] EAX:  EBX: f8509000 ECX: 0002 EDX: 0002
[   26.205942] ESI: f6583000 EDI: c0df7900 EBP: f4a83d74 ESP: f4a83d74
[   26.212182]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   26.217557] CR0: 80050033 CR2: 0010 CR3: 34a3f4a0 CR4: 06f0
[   26.223797] Stack:
[   26.225797]  f4a83d94 c09beab1  f6aed2c0 0002 f8509000 c0df7900 
f8509000
[   26.233576]  f4a83db4 c09bedb4 f4a83d9c f4a83d9c 3cd52c75 f8509000 c0d72780 
f8091000
[   26.241355]  f4a83dc0 c09bee51 f4f08960 f4a83dcc f8091040 f4f08960 f4a83e48 
c040211a
[   26.249133] Call Trace:
[   26.251568]  [c09beab1] ops_init+0x31/0x130
[   26.255905]  [c09bedb4] register_pernet_operations+0xe4/0x160
[   26.261798]  [f8091000] ? 0xf8091000
[   26.265530]  [c09bee51] register_pernet_subsys+0x21/0x40
[   26.270993]  [f8091040] vxlan_init_module+0x40/0x1000 [vxlan]
[   26.276889]  [c040211a] do_one_initcall+0xaa/0x200
[   26.281829]  [f8091000] ? 0xf8091000
[   26.285562]  [c05a45d5] ? kmem_cache_alloc_trace+0x175/0x1f0
[   26.291369]  [c0ac73db] ? do_init_module+0x21/0x1b5
[   26.296398]  [c0ac73db] ? do_init_module+0x21/0x1b5
[   26.301428]  [c0ac740a] do_init_module+0x50/0x1b5
[   26.306285]  [c04e743b] load_module+0x1dbb/0x23c0
[   26.311143]  [c04e4049] ? copy_module_from_fd.isra.48+0xf9/0x190
[   26.317297]  [c04e7c75] SyS_finit_module+0xa5/0xf0
[   26.322240]  [c05724cb] ? vm_mmap_pgoff+0x9b/0xc0
[   26.327097]  [c0acd79f] sysenter_do_call+0x12/0x12
[   26.332038] Code: c0 00 00 00 5d c3 90 55 89 e5 66 66 66 66 90 8b 80 18 05 
00 00 5d c3 55 89 e5 66 66 66 66 90 8b 15 4c 94 50 f8 8b 80 98 0c 00 00 8b 54 
90 08 89 12 89 52 04 8d 42 08 c7 82 08 04 00 00 00 00 00
[   26.351295] EIP: [f8501154] vxlan_init_net+0x14/0x50 [vxlan] SS:ESP 
0068:f4a83d74
[   26.358938] CR2: 0010
[   26.362259] ---[ end trace 3aa9af5192e30e1f ]---

-- 
Jiri Benc

Re: [PATCH net-next] tcp: fix slow start after idle vs TSO/GSO

2015-08-21 Thread Neal Cardwell

On Thu, Aug 20, 2015 at 1:08 PM, Eric Dumazet eric.duma...@gmail.com wrote:
 From: Eric Dumazet eduma...@google.com

 slow start after idle might reduce cwnd, but we perform this
 after first packet was cooked and sent.

 With TSO/GSO, it means that we might send a full TSO packet
 even if cwnd should have been reduced to IW10.

 Moving the SSAI check in skb_entail() makes sense, because
 we slightly reduce number of times this check is done,
 especially for large send() and TCP Small queue callbacks from
 softirq context.

Very nice catch, and this fix seems like a definite improvement.

One potential issue is that the connection can restart from idle not
just because new data has been written (which this patch addresses),
but also because the receive window opens and so now packets can be
sent again. The old version of the code implicitly fired the restart
code path in the receive window opens case as well, since it fired
every time new data was sent. We might want to check if we need to
call tcp_cwnd_restart() in tcp_ack_update_window(), next to the call
for tcp_fast_path_check()?

neal
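
For readers unfamiliar with the restart logic under discussion, here is a simplified userspace model of shrinking cwnd after an idle period: the window is halved once per RTO of idle time and never drops below the restart window. The function name and parameters are hypothetical, and the real kernel routine also handles state that is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of slow start after idle.
 *   cwnd          current congestion window, in packets
 *   restart_cwnd  window to restart from (for example the initial window)
 *   idle          time since the last transmission, same unit as rto
 *   rto           retransmission timeout, must be positive
 * Returns the window to use for the next transmission.
 */
static uint32_t toy_cwnd_restart(uint32_t cwnd, uint32_t restart_cwnd,
				 int32_t idle, int32_t rto)
{
	assert(rto > 0);

	/* Never grow the window as a side effect of restarting. */
	if (restart_cwnd > cwnd)
		restart_cwnd = cwnd;

	/* Halve once for each RTO of idle time beyond the first. */
	while ((idle -= rto) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

Whichever code path notices that the connection was idle would need to apply a reduction like this before the next burst is sized, which is the concern raised above for the case where the receive window reopens.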

Re: Reply: [PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)

2015-08-21 Thread Arnd Bergmann

On Monday 17 August 2015 01:28:07 Liguozhu wrote:
 Thanks, Arnd. 
 
 Regarding the ae-name: it is the name of the Acceleration Engine. It is 
 provided
 by the BIOS according to the position and the feature enabled of the IP.
 So soc0 means it is on SoC No. 0, while n4 means it is running on 
Non-dsaf mode 4. Ideally, we should setup the rule to name it. But as I
 said in the patchset, the IP is original designed for a bare metal solution,
 it is worthless to export all modes and we are planning to add more mode
 for Linux itself in the IP in future version. So I think the better way is
 to leave it as a name but add more meaning in the future.

The name property is a bit awkward. The position is normally implied by
the location of the parent device in the DT, so you should not need that
at all and instead derive it elsewhere. You can also add strings to the
compatible property instead of this, to signify differences in the programming
that are based on how the IP block is used.
 
 Regarding the ae-opts: it is the initial value for the AE's runtime options,
 Currently, we have only port number (there are 6XGE+2GE port for a DSAF AE)
 as option. But for future version, we will add other options such as enable
 Spanning Tree Protocol algorithm) and so on. 

I think these can easily be converted into an index property and boolean
flags (present if true, absent otherwise) for additional features.
 
 Should I add these background to somewhere?

The binding document needs to list all supported configurations, if you
have a string property, describe specifically what strings are allowed
and what they mean, but better try to avoid strings altogether.

Arnd

[PATCH net-next] netfilter: ipset: Fixing unnamed union init

2015-08-21 Thread Elad Raz

Following up on Vinson Lee's proposed post [1], this patch fixes compilation
issues found with gcc 4.4.7. Initializing the .cidr field of unnamed
unions causes a compilation error in gcc 4.4.x.
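
The workaround pattern can be sketched in standalone C as follows. This is a simplified stand-in and not the ipset structure itself: struct toy_elem and the toy_* helpers are hypothetical names, and only the unnamed union with cidr[] mirrors the field discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HOST_MASK 32

/* Simplified element with an unnamed (anonymous) union, as in the patch. */
struct toy_elem {
	uint32_t ip[2];
	union {
		uint8_t cidr[2];
		uint16_t ccmp;   /* both cidr bytes viewed as one value */
	};
};

/* gcc 4.4.x is reported to reject a designated initializer such as
 *     struct toy_elem e = { .cidr = { HOST_MASK, HOST_MASK } };
 * so the members of the unnamed union are assigned explicitly instead.
 */
static void toy_elem_init(struct toy_elem *e)
{
	assert(e != NULL);
	e->cidr[0] = HOST_MASK;
	e->cidr[1] = HOST_MASK;
}

/* Zero the whole element first, then apply the non-zero defaults. */
static struct toy_elem toy_elem_make(void)
{
	struct toy_elem e;

	memset(&e, 0, sizeof(e));
	toy_elem_init(&e);
	return e;
}
```

The cost of this approach is that the defaults move from the declaration into a helper that every user of the structure has to remember to call.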

References

Visible links
[1] https://lkml.org/lkml/2015/7/5/74

Signed-off-by: Elad Raz el...@mellanox.com
---
 net/netfilter/ipset/ip_set_hash_netnet.c | 23 +--
 net/netfilter/ipset/ip_set_hash_netportnet.c | 23 +--
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c b/net/netfilter/ipset/ip_set_hash_netnet.c
index 3c862c0..2bff1f0 100644
--- a/net/netfilter/ipset/ip_set_hash_netnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netnet.c
@@ -131,6 +131,14 @@ hash_netnet4_data_next(struct hash_netnet4_elem *next,
 #define HOST_MASK  32
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netnet4_init(struct hash_netnet4_elem *e)
+{
+   e->ipcmp = 0;
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
  const struct xt_action_param *par,
@@ -160,7 +168,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 {
const struct hash_netnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
-   struct hash_netnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+   struct hash_netnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 ip = 0, ip_to = 0, last;
u32 ip2 = 0, ip2_from = 0, ip2_to = 0, last2;
@@ -169,6 +177,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
if (tb[IPSET_ATTR_LINENO])
*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+   hash_netnet4_init(&e);
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS)))
return -IPSET_ERR_PROTOCOL;
@@ -357,6 +366,15 @@ hash_netnet6_data_next(struct hash_netnet4_elem *next,
 #define IP_SET_EMIT_CREATE
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netnet6_init(struct hash_netnet6_elem *e)
+{
+   ipv6_addr_set(&e->ip[0].in6, 0, 0, 0, 0);
+   ipv6_addr_set(&e->ip[1].ip6, 0, 0, 0, 0);
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netnet6_kadt(struct ip_set *set, const struct sk_buff *skb,
  const struct xt_action_param *par,
@@ -385,13 +403,14 @@ hash_netnet6_uadt(struct ip_set *set, struct nlattr *tb[],
  enum ipset_adt adt, u32 *lineno, u32 flags, bool retried)
 {
ipset_adtfn adtfn = set->variant->adt[adt];
-   struct hash_netnet6_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+   struct hash_netnet6_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
int ret;
 
if (tb[IPSET_ATTR_LINENO])
*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+   hash_netnet6_init(&e);
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS)))
return -IPSET_ERR_PROTOCOL;
diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c
index 0c68734..0695c5c 100644
--- a/net/netfilter/ipset/ip_set_hash_netportnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netportnet.c
@@ -142,6 +142,14 @@ hash_netportnet4_data_next(struct hash_netportnet4_elem *next,
 #define HOST_MASK  32
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netportnet4_init(struct hash_netportnet4_elem *e)
+{
+   e->ipcmp = 0;
+   e->cidr[0] = HOST_MASK;
+   e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netportnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
  const struct xt_action_param *par,
@@ -175,7 +183,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 {
const struct hash_netportnet *h = set->data;
ipset_adtfn adtfn = set->variant->adt[adt];
-   struct hash_netportnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+   struct hash_netportnet4_elem e = { };
struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
u32 ip = 0, ip_to = 0, ip_last, p = 0, port, port_to;
u32 ip2_from = 0, ip2_to = 0, ip2_last, ip2;
@@ -185,6 +193,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
if (tb[IPSET_ATTR_LINENO])
*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+   hash_netportnet4_init(&e);
if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 !ip_set_attr_netorder(tb, IPSET_ATTR_PORT) ||
 !ip_set_optattr_netorder(tb, IPSET_ATTR_PORT_TO) ||
@@ -412,6 +421,15 @@ hash_netportnet6_data_next(struct hash_netportnet4_elem *next,
 #define IP_SET_EMIT_CREATE
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netportnet6_init(struct hash_netportnet6_elem *e)
+{

[net-next PATCH 3/3] net: sched: fall back to noqueue when removing root qdisc

2015-08-21 Thread Phil Sutter

When removing the root qdisc, the interface should fall back to noqueue
as the 'real' minimal qdisc instead of the default one. Therefore
dev_graft_qdisc() has to be adjusted to assign noqueue if NULL was
passed as new qdisc, and qdisc_graft() needs to assign noqueue to
dev->qdisc instead of noop to prevent dev_activate() from attaching
default qdiscs to the interface.

Note that it is also necessary to have dev_graft_qdisc() set
dev_queue->qdisc to the new qdisc instead of (unconditionally) noop. I
don't know why this was there at all (it originates from pre-git times), but
it seems wrong to me. It could be worked around by dropping the extra
check for noqueue in transition_one_qdisc(), possibly with unintended
side effects.
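
The intended behaviour can be modelled outside the kernel with a small sketch. The toy_* types below are hypothetical stand-ins, not the kernel's struct Qdisc or struct netdev_queue; the sketch only shows that passing NULL as the new qdisc leaves the queue on noqueue, and that the active pointer follows the grafted qdisc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a qdisc; only the name matters here. */
struct toy_qdisc {
	const char *name;
};

static struct toy_qdisc toy_noqueue = { "noqueue" };

/* Hypothetical stand-in for a TX queue with its two qdisc pointers. */
struct toy_queue {
	struct toy_qdisc *qdisc;          /* currently active qdisc */
	struct toy_qdisc *qdisc_sleeping; /* qdisc used once the device is up */
};

/* Graft new_q onto the queue and return the previous qdisc.
 * A NULL new_q means "remove the root qdisc" and falls back to noqueue.
 */
static struct toy_qdisc *toy_graft(struct toy_queue *q,
				   struct toy_qdisc *new_q)
{
	struct toy_qdisc *old;

	assert(q != NULL);
	old = q->qdisc_sleeping;
	if (new_q == NULL)
		new_q = &toy_noqueue;
	q->qdisc_sleeping = new_q;
	q->qdisc = new_q;   /* follow the new qdisc instead of forcing noop */
	return old;
}
```

Locking, RCU and reference counting, which the real functions have to handle, are deliberately left out.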

Signed-off-by: Phil Sutter p...@nwl.cc
---
 net/sched/sch_api.c | 2 +-
 net/sched/sch_generic.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 224374c..3b2cf30 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -839,7 +839,7 @@ skip:
   dev->qdisc, new);
if (new && !new->ops->attach)
atomic_inc(&new->refcnt);
-   dev->qdisc = new ? : &noop_qdisc;
+   dev->qdisc = new ? : &noqueue_qdisc;
 
if (new && new->ops->attach)
new->ops->attach(new);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index ab614ee..556de30 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -723,9 +723,9 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 
/* ... and graft new one */
if (qdisc == NULL)
-   qdisc = &noop_qdisc;
+   qdisc = &noqueue_qdisc;
dev_queue->qdisc_sleeping = qdisc;
-   rcu_assign_pointer(dev_queue->qdisc, &noop_qdisc);
+   rcu_assign_pointer(dev_queue->qdisc, qdisc);
 
spin_unlock_bh(root_lock);
 
-- 
2.1.2


[net-next PATCH 1/3] net: sched: make noqueue_qdisc non-static

2015-08-21 Thread Phil Sutter

This needs to be referenced from within net/sched/sch_api.c later.

Signed-off-by: Phil Sutter p...@nwl.cc
---
 include/net/sch_generic.h | 1 +
 net/sched/sch_generic.c   | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 2eab08c..4495193 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -337,6 +337,7 @@ static inline void sch_tree_unlock(const struct Qdisc *q)
 #define tcf_tree_unlock(tp) sch_tree_unlock((tp)->q)
 
 extern struct Qdisc noop_qdisc;
+extern struct Qdisc noqueue_qdisc;
 extern struct Qdisc_ops noop_qdisc_ops;
 extern struct Qdisc_ops pfifo_fast_ops;
 extern struct Qdisc_ops mq_qdisc_ops;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 942fea8..1fb65f9 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -425,13 +425,12 @@ static struct Qdisc_ops noqueue_qdisc_ops __read_mostly = {
.owner  =   THIS_MODULE,
 };
 
-static struct Qdisc noqueue_qdisc;
 static struct netdev_queue noqueue_netdev_queue = {
.qdisc  =   &noqueue_qdisc,
.qdisc_sleeping =   &noqueue_qdisc,
 };
 
-static struct Qdisc noqueue_qdisc = {
+struct Qdisc noqueue_qdisc = {
.enqueue=   NULL,
.dequeue=   noop_dequeue,
.flags  =   TCQ_F_BUILTIN,
-- 
2.1.2


[net-next PATCH 0/3] net: sched: allow switching qdisc to noqueue intuitively

2015-08-21 Thread Phil Sutter

This patch series improves the integration of the noqueue qdisc to become the
fallback queueing if no other is attached to an interface. Before it was rather
an add-on, a simpler alternative to a FIFO if no congestion is expected or
possible. It has become the default qdisc for virtual interfaces, and could be
attached by this mechanism only (through removing the root qdisc after having
set tx_queue_len to zero for interfaces not defaulting to noqueue otherwise).

This series does not change the default qdisc chosen for new interfaces, but
upon removal of the root qdisc from an interface, the kernel won't fall back to
the default but to noqueue instead.

Phil Sutter (3):
  net: sched: make noqueue_qdisc non-static
  net: sched: allocate a handle to default qdiscs
  net: sched: fall back to noqueue when removing root qdisc

 include/net/sch_generic.h |  2 ++
 net/sched/sch_api.c   |  5 +++--
 net/sched/sch_generic.c   | 12 
 3 files changed, 13 insertions(+), 6 deletions(-)

-- 
2.1.2


Re: [PATCH net-next] tcp: fix slow start after idle vs TSO/GSO

2015-08-21 Thread Eric Dumazet

On Fri, 2015-08-21 at 11:10 -0400, Neal Cardwell wrote:

 Very nice catch, and this fix seems like a definite improvement.
 
 One potential issue is that the connection can restart from idle not
 just because new data has been written (which this patch addresses),
 but also because the receive window opens and so now packets can be
 sent again. The old version of the code implicitly fired the restart
 code path in the receive window opens case as well, since it fired
 every time new data was sent. We might want to check if we need to
 call tcp_cwnd_restart() in tcp_ack_update_window(), next to the call
 for tcp_fast_path_check()?

Excellent, I wrote a 2nd packetdrill test to exercise this path, will
submit a v2 soon.

Thanks Neal
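
For readers unfamiliar with the restart-after-idle logic under discussion, the following is a minimal userspace sketch of the decay rule: the congestion window is halved once per elapsed RTO of idle time, but never below the restart window. The function name and its millisecond parameters are assumptions made for illustration; this is not a copy of tcp_cwnd_restart().

```c
#include <assert.h>
#include <stdint.h>

/* Returns the congestion window to use after idle_ms of idle time.
 * cwnd and init_cwnd are in packets; rto_ms is the retransmission timeout.
 */
static uint32_t toy_cwnd_restart(uint32_t cwnd, uint32_t init_cwnd,
				 int32_t idle_ms, int32_t rto_ms)
{
	/* Never restart above the window we already had. */
	uint32_t restart_cwnd = init_cwnd < cwnd ? init_cwnd : cwnd;

	assert(rto_ms > 0);
	/* One halving per full RTO spent idle, floored at restart_cwnd. */
	while ((idle_ms -= rto_ms) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;
	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

Neal's point is about when such a function gets invoked: if it only runs when the application writes new data, a connection that resumes because the receive window reopened would skip the decay.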



Re: [net-next PATCH 2/3] net: sched: allocate a handle to default qdiscs

2015-08-21 Thread Eric Dumazet

On Fri, 2015-08-21 at 17:58 +0200, Phil Sutter wrote:
 Since tc_get_qdisc() does not allow to remove a qdisc with zero handle,
 a handle needs to be allocated to default qdiscs (currently pfifo_fast
 or mq) in order to allow removing them.
 
 Signed-off-by: Phil Sutter p...@nwl.cc
 ---
  include/net/sch_generic.h | 1 +
  net/sched/sch_api.c   | 3 ++-
  net/sched/sch_generic.c   | 5 +
  3 files changed, 8 insertions(+), 1 deletion(-)
 
 diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
 index 4495193..2bfc898 100644
 --- a/include/net/sch_generic.h
 +++ b/include/net/sch_generic.h
 @@ -391,6 +391,7 @@ void dev_deactivate(struct net_device *dev);
  void dev_deactivate_many(struct list_head *head);
  struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 struct Qdisc *qdisc);
 +u32 qdisc_alloc_handle(struct net_device *dev);
  void qdisc_reset(struct Qdisc *qdisc);
  void qdisc_destroy(struct Qdisc *qdisc);
  void qdisc_tree_decrease_qlen(struct Qdisc *qdisc, unsigned int n);
 diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
 index f06aa01..224374c 100644
 --- a/net/sched/sch_api.c
 +++ b/net/sched/sch_api.c
 @@ -723,7 +723,7 @@ EXPORT_SYMBOL(qdisc_class_hash_remove);
  /* Allocate an unique handle from space managed by kernel
   * Possible range is [8000-FFFF]:0000 (0x8000 values)
   */
 -static u32 qdisc_alloc_handle(struct net_device *dev)
 +u32 qdisc_alloc_handle(struct net_device *dev)
  {
   int i = 0x8000;
   static u32 autohandle = TC_H_MAKE(0x80000000U, 0);
 @@ -739,6 +739,7 @@ static u32 qdisc_alloc_handle(struct net_device *dev)
  
   return 0;
  }
 +EXPORT_SYMBOL(qdisc_alloc_handle);
  
  void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
  {
 diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
 index 1fb65f9..ab614ee 100644
 --- a/net/sched/sch_generic.c
 +++ b/net/sched/sch_generic.c
 @@ -634,6 +634,11 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
   if (IS_ERR(sch))
   goto errout;
   sch->parent = parentid;
 +#ifdef CONFIG_NET_SCHED
 + sch->handle = qdisc_alloc_handle(dev_queue->dev);
 + if (!sch->handle)
 + goto errout;
 +#endif
  
   if (!ops->init || ops->init(sch, NULL) == 0)
   return sch;

This might break HTB setups with more than 32768 classes?

The pfifo qdisc that gets attached had no handle.

qdisc_alloc_handle() has a limited range.
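
The exhaustion concern can be made concrete with a toy allocator. This is a simplified model assuming 0x8000 usable majors, as the comment in the quoted code states; the toy_* names are hypothetical and a plain array stands in for the real per-device lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_FIRST_MAJOR 0x8000u
#define TOY_NUM_MAJORS  0x8000u  /* majors 0x8000..0xFFFF */

static bool toy_in_use[TOY_NUM_MAJORS];
static uint32_t toy_next;        /* index of the next candidate major */

/* Returns a handle of the form major:0000, or 0 once every major is taken. */
static uint32_t toy_alloc_handle(void)
{
	for (uint32_t tries = 0; tries < TOY_NUM_MAJORS; tries++) {
		uint32_t idx = toy_next;

		toy_next = (toy_next + 1) % TOY_NUM_MAJORS;
		if (!toy_in_use[idx]) {
			toy_in_use[idx] = true;
			return (TOY_FIRST_MAJOR + idx) << 16;
		}
	}
	return 0;
}
```

If every default pfifo attached to an HTB class consumed one such handle, the allocator would start returning 0 after 32768 classes, which in the quoted patch turns into a failure in qdisc_create_dflt().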



[PATCH 3/4 nf-next] ipvs: add sync_maxlen parameter for the sync daemon

2015-08-21 Thread Simon Horman

From: Julian Anastasov j...@ssi.bg

Allow setups with large MTU to send large sync packets by
adding sync_maxlen parameter. The default value is now based
on MTU but no more than 1500 for compatibility reasons.

To avoid problems if the MTU changes, allow fragmentation by
sending packets with DF=0. Problem reported by Dan Carpenter.
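
A rough sketch of how such a default could be derived is shown below. It assumes that sync_maxlen counts UDP payload bytes and that an IPv4 header without options plus a UDP header are subtracted from the capped MTU; both are assumptions for illustration, and the helper name is hypothetical rather than taken from the patch.

```c
#include <assert.h>

#define TOY_COMPAT_MTU   1500u /* compatibility cap from the commit message */
#define TOY_IPV4_HDR_LEN   20u /* IPv4 header without options */
#define TOY_UDP_HDR_LEN     8u

/* Default payload size when no sync_maxlen was configured:
 * based on the device MTU, but never more than the 1500-byte cap.
 */
static unsigned int toy_default_sync_maxlen(unsigned int mtu)
{
	unsigned int hlen = TOY_IPV4_HDR_LEN + TOY_UDP_HDR_LEN;
	unsigned int capped = mtu < TOY_COMPAT_MTU ? mtu : TOY_COMPAT_MTU;

	assert(capped > hlen);
	return capped - hlen;
}
```

With these assumptions a jumbo-frame interface still defaults to the same payload size as a standard Ethernet one, and only an explicitly configured sync_maxlen would make use of the larger MTU.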

Reported-by: Dan Carpenter dan.carpen...@oracle.com
Signed-off-by: Julian Anastasov j...@ssi.bg
Signed-off-by: Simon Horman ho...@verge.net.au
---
 include/net/ip_vs.h |  19 +++---
 include/uapi/linux/ip_vs.h  |   1 +
 net/netfilter/ipvs/ip_vs_ctl.c  |  53 ++--
 net/netfilter/ipvs/ip_vs_sync.c | 137 ++--
 4 files changed, 108 insertions(+), 102 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 4e3731ee4eac..2fdc13caf712 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -846,6 +846,13 @@ struct ipvs_master_sync_state {
 /* How much time to keep dests in trash */
 #define IP_VS_DEST_TRASH_PERIOD (120 * HZ)
 
+struct ipvs_sync_daemon_cfg {
+   int syncid;
+   u16 sync_maxlen;
+   /* multicast interface name */
+   char mcast_ifn[IP_VS_IFNAME_MAXLEN];
+};
+
 /* IPVS in network namespace */
 struct netns_ipvs {
int gen; /* Generation */
@@ -961,15 +968,10 @@ struct netns_ipvs {
spinlock_t  sync_buff_lock;
struct task_struct  **backup_threads;
int threads_mask;
-   int send_mesg_maxlen;
-   int recv_mesg_maxlen;
volatile int sync_state;
-   volatile int master_syncid;
-   volatile int backup_syncid;
struct mutex sync_mutex;
-   /* multicast interface name */
-   char master_mcast_ifn[IP_VS_IFNAME_MAXLEN];
-   char backup_mcast_ifn[IP_VS_IFNAME_MAXLEN];
+   struct ipvs_sync_daemon_cfg mcfg;   /* Master Configuration */
+   struct ipvs_sync_daemon_cfg bcfg;   /* Backup Configuration */
/* net name space ptr */
struct net  *net; /* Needed by timer routines */
/* Number of heterogeneous destinations, needed becaus heterogeneous
@@ -1408,7 +1410,8 @@ static inline void ip_vs_dest_put_and_free(struct ip_vs_dest *dest)
 /* IPVS sync daemon data and function prototypes
  * (from ip_vs_sync.c)
  */
-int start_sync_thread(struct net *net, int state, char *mcast_ifn, __u8 syncid);
+int start_sync_thread(struct net *net, struct ipvs_sync_daemon_cfg *cfg,
+ int state);
 int stop_sync_thread(struct net *net, int state);
 void ip_vs_sync_conn(struct net *net, struct ip_vs_conn *cp, int pkts);
 
diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h
index 3199243f2028..68377d8c8870 100644
--- a/include/uapi/linux/ip_vs.h
+++ b/include/uapi/linux/ip_vs.h
@@ -406,6 +406,7 @@ enum {
IPVS_DAEMON_ATTR_STATE, /* sync daemon state (master/backup) */
IPVS_DAEMON_ATTR_MCAST_IFN, /* multicast interface name */
IPVS_DAEMON_ATTR_SYNC_ID,   /* SyncID we belong to */
+   IPVS_DAEMON_ATTR_SYNC_MAXLEN,   /* UDP Payload Size */
__IPVS_DAEMON_ATTR_MAX,
 };
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index af0b69e411b7..96f7bbfd5e1d 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2336,10 +2336,15 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
struct ip_vs_daemon_user *dm = (struct ip_vs_daemon_user *)arg;
 
if (cmd == IP_VS_SO_SET_STARTDAEMON) {
+   struct ipvs_sync_daemon_cfg cfg;
+
+   memset(&cfg, 0, sizeof(cfg));
+   strlcpy(cfg.mcast_ifn, dm->mcast_ifn,
+   sizeof(cfg.mcast_ifn));
+   cfg.syncid = dm->syncid;
rtnl_lock();
mutex_lock(&ipvs->sync_mutex);
-   ret = start_sync_thread(net, dm->state, dm->mcast_ifn,
-   dm->syncid);
+   ret = start_sync_thread(net, &cfg, dm->state);
mutex_unlock(&ipvs->sync_mutex);
rtnl_unlock();
} else {
@@ -2650,15 +2655,15 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len)
mutex_lock(&ipvs->sync_mutex);
if (ipvs->sync_state & IP_VS_STATE_MASTER) {
d[0].state = IP_VS_STATE_MASTER;
-   strlcpy(d[0].mcast_ifn, ipvs->master_mcast_ifn,
+   strlcpy(d[0].mcast_ifn, ipvs->mcfg.mcast_ifn,
sizeof(d[0].mcast_ifn));
-

[GIT PULL nf-next] Second Round of IPVS Updates for v4.3

2015-08-21 Thread Simon Horman

Hi Pablo,

please consider these IPVS Updates for v4.3.

I realise these are a little late in the cycle, so if you would prefer
me to repost them for v4.4 then just let me know.

The updates include:
* A new scheduler from Raducu Deaconu
* Enhanced configurability of the sync daemon from Julian Anastasov

The following changes since commit 81bf1c64e7fe08f956c74fe2b0f1fa6eb163bd91:

  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next (2015-08-21 06:09:05 +0200)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next.git tags/ipvs2-for-v4.3

for you to fetch changes up to d33288172e72c4729e8b9f2243fb40601afabc8f:

  ipvs: add more mcast parameters for the sync daemon (2015-08-21 09:10:11 -0700)


Julian Anastasov (3):
  ipvs: call rtnl_lock early
  ipvs: add sync_maxlen parameter for the sync daemon
  ipvs: add more mcast parameters for the sync daemon

Raducu Deaconu (1):
  ipvs: Add ovf scheduler

 include/net/ip_vs.h |  23 ++--
 include/uapi/linux/ip_vs.h  |   5 +
 net/netfilter/ipvs/Kconfig  |  11 ++
 net/netfilter/ipvs/Makefile |   1 +
 net/netfilter/ipvs/ip_vs_ctl.c  | 143 -
 net/netfilter/ipvs/ip_vs_ovf.c  |  86 +
 net/netfilter/ipvs/ip_vs_sync.c | 269 ++--
 7 files changed, 402 insertions(+), 136 deletions(-)
 create mode 100644 net/netfilter/ipvs/ip_vs_ovf.c

[PATCH 4/4 nf-next] ipvs: add more mcast parameters for the sync daemon

2015-08-21 Thread Simon Horman

From: Julian Anastasov j...@ssi.bg

- mcast_group: configure the multicast address, now IPv6
is supported too

- mcast_port: configure the multicast port

- mcast_ttl: configure the multicast TTL/HOP_LIMIT
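
Since mcast_group must name a multicast address in either family, the kind of check involved can be sketched in plain C. The toy_* helpers below are hypothetical userspace stand-ins; the IPv4 test takes the address in host byte order, and the IPv6 test only looks at the ff00::/8 prefix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 multicast is 224.0.0.0/4: the top four address bits are 1110. */
static bool toy_ipv4_is_multicast(uint32_t addr_host_order)
{
	return (addr_host_order & 0xf0000000u) == 0xe0000000u;
}

/* IPv6 multicast is ff00::/8: the first address byte is 0xff. */
static bool toy_ipv6_is_multicast(const uint8_t addr[16])
{
	assert(addr != NULL);
	return addr[0] == 0xff;
}
```

A configuration path would reject the request when the supplied group fails the check for its address family.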

Signed-off-by: Julian Anastasov j...@ssi.bg
Signed-off-by: Simon Horman ho...@verge.net.au
---
 include/net/ip_vs.h |   4 ++
 include/uapi/linux/ip_vs.h  |   4 ++
 net/netfilter/ipvs/ip_vs_ctl.c  |  50 ++-
 net/netfilter/ipvs/ip_vs_sync.c | 138 +---
 4 files changed, 172 insertions(+), 24 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 2fdc13caf712..9b9ca87a4210 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -847,8 +847,12 @@ struct ipvs_master_sync_state {
 #define IP_VS_DEST_TRASH_PERIOD(120 * HZ)
 
 struct ipvs_sync_daemon_cfg {
+   union nf_inet_addr  mcast_group;
int syncid;
u16 sync_maxlen;
+   u16 mcast_port;
+   u8  mcast_af;
+   u8  mcast_ttl;
/* multicast interface name */
char mcast_ifn[IP_VS_IFNAME_MAXLEN];
 };
diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h
index 68377d8c8870..391395c06c7e 100644
--- a/include/uapi/linux/ip_vs.h
+++ b/include/uapi/linux/ip_vs.h
@@ -407,6 +407,10 @@ enum {
IPVS_DAEMON_ATTR_MCAST_IFN, /* multicast interface name */
IPVS_DAEMON_ATTR_SYNC_ID,   /* SyncID we belong to */
IPVS_DAEMON_ATTR_SYNC_MAXLEN,   /* UDP Payload Size */
+   IPVS_DAEMON_ATTR_MCAST_GROUP,   /* IPv4 Multicast Address */
+   IPVS_DAEMON_ATTR_MCAST_GROUP6,  /* IPv6 Multicast Address */
+   IPVS_DAEMON_ATTR_MCAST_PORT,    /* Multicast Port (base) */
+   IPVS_DAEMON_ATTR_MCAST_TTL, /* Multicast TTL */
__IPVS_DAEMON_ATTR_MAX,
 };
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 96f7bbfd5e1d..1a23e91d50d8 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2819,6 +2819,10 @@ static const struct nla_policy ip_vs_daemon_policy[IPVS_DAEMON_ATTR_MAX + 1] = {
.len = IP_VS_IFNAME_MAXLEN },
[IPVS_DAEMON_ATTR_SYNC_ID]  = { .type = NLA_U32 },
[IPVS_DAEMON_ATTR_SYNC_MAXLEN]  = { .type = NLA_U16 },
+   [IPVS_DAEMON_ATTR_MCAST_GROUP]  = { .type = NLA_U32 },
+   [IPVS_DAEMON_ATTR_MCAST_GROUP6] = { .len = sizeof(struct in6_addr) },
+   [IPVS_DAEMON_ATTR_MCAST_PORT]   = { .type = NLA_U16 },
+   [IPVS_DAEMON_ATTR_MCAST_TTL]    = { .type = NLA_U8 },
 };
 
 /* Policy used for attributes in nested attribute IPVS_CMD_ATTR_SERVICE */
@@ -3288,8 +3292,21 @@ static int ip_vs_genl_fill_daemon(struct sk_buff *skb, __u32 state,
if (nla_put_u32(skb, IPVS_DAEMON_ATTR_STATE, state) ||
nla_put_string(skb, IPVS_DAEMON_ATTR_MCAST_IFN, c->mcast_ifn) ||
nla_put_u32(skb, IPVS_DAEMON_ATTR_SYNC_ID, c->syncid) ||
-   nla_put_u16(skb, IPVS_DAEMON_ATTR_SYNC_MAXLEN, c->sync_maxlen))
+   nla_put_u16(skb, IPVS_DAEMON_ATTR_SYNC_MAXLEN, c->sync_maxlen) ||
+   nla_put_u16(skb, IPVS_DAEMON_ATTR_MCAST_PORT, c->mcast_port) ||
+   nla_put_u8(skb, IPVS_DAEMON_ATTR_MCAST_TTL, c->mcast_ttl))
goto nla_put_failure;
+#ifdef CONFIG_IP_VS_IPV6
+   if (c->mcast_af == AF_INET6) {
+   if (nla_put_in6_addr(skb, IPVS_DAEMON_ATTR_MCAST_GROUP6,
+&c->mcast_group.in6))
+   goto nla_put_failure;
+   } else
+#endif
+   if (c->mcast_af == AF_INET &&
+   nla_put_in_addr(skb, IPVS_DAEMON_ATTR_MCAST_GROUP,
+   c->mcast_group.ip))
+   goto nla_put_failure;
nla_nest_end(skb, nl_daemon);
 
return 0;
@@ -3370,6 +3387,37 @@ static int ip_vs_genl_new_daemon(struct net *net, struct nlattr **attrs)
if (a)
c.sync_maxlen = nla_get_u16(a);
 
+   a = attrs[IPVS_DAEMON_ATTR_MCAST_GROUP];
+   if (a) {
+   c.mcast_af = AF_INET;
+   c.mcast_group.ip = nla_get_in_addr(a);
+   if (!ipv4_is_multicast(c.mcast_group.ip))
+   return -EINVAL;
+   } else {
+   a = attrs[IPVS_DAEMON_ATTR_MCAST_GROUP6];
+   if (a) {
+#ifdef CONFIG_IP_VS_IPV6
+   int addr_type;
+
+   c.mcast_af = AF_INET6;
+   c.mcast_group.in6 = nla_get_in6_addr(a);
+   addr_type = ipv6_addr_type(&c.mcast_group.in6);
+   if (!(addr_type & IPV6_ADDR_MULTICAST))
+   return -EINVAL;
+#else
+   return -EAFNOSUPPORT;
+#endif
+   }
+   }
+
+   a = attrs[IPVS_DAEMON_ATTR_MCAST_PORT];
+   if (a)
+

[PATCH 2/4 nf-next] ipvs: call rtnl_lock early

2015-08-21 Thread Simon Horman

From: Julian Anastasov j...@ssi.bg

When the sync daemon is started we need to hold the rtnl
lock while calling ip_mc_join_group. Currently, we have
a wrong locking order because the correct one is
rtnl_lock -> __ip_vs_mutex. This is implied by the usage
of __ip_vs_mutex in ip_vs_dst_event(), which is called
under the rtnl lock during NETDEV_* notifications.

Fix the problem by calling rtnl_lock early only for the
start_sync_thread call. As a bonus, this fixes the usage of
__dev_get_by_name, which was not called under the rtnl lock.

This patch actually extends and depends on commit 54ff9ef36bdf
("ipv4, ipv6: kill ip_mc_{join, leave}_group and
ipv6_sock_mc_{join, drop}").
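
The ordering rule can be illustrated with a tiny rank checker. This is a single-threaded userspace sketch, not kernel code: the toy_* names are hypothetical, and the two ranks merely model "rtnl first, then the IPVS mutex".

```c
#include <assert.h>
#include <stdbool.h>

/* Locks listed in the order in which they must be taken. */
enum toy_lock { TOY_RTNL = 0, TOY_IPVS_MUTEX = 1, TOY_NUM_LOCKS };

static bool toy_held[TOY_NUM_LOCKS];

/* Returns false if taking l would invert the order (a later-ranked lock
 * is already held) or if l is already held; otherwise marks l as held.
 */
static bool toy_lock(enum toy_lock l)
{
	for (int i = l + 1; i < TOY_NUM_LOCKS; i++)
		if (toy_held[i])
			return false;
	if (toy_held[l])
		return false;
	toy_held[l] = true;
	return true;
}

static void toy_unlock(enum toy_lock l)
{
	assert(toy_held[l]);
	toy_held[l] = false;
}
```

Taking TOY_RTNL and then TOY_IPVS_MUTEX succeeds, while trying to take TOY_RTNL with TOY_IPVS_MUTEX already held is flagged. That second case is the shape of the inversion the patch avoids by calling rtnl_lock() before the IPVS mutex is taken.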

Signed-off-by: Julian Anastasov j...@ssi.bg
Signed-off-by: Simon Horman ho...@verge.net.au
---
 net/netfilter/ipvs/ip_vs_ctl.c  | 50 +++--
 net/netfilter/ipvs/ip_vs_sync.c |  2 --
 2 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 24c554201a76..af0b69e411b7 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2335,13 +2335,18 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
cmd == IP_VS_SO_SET_STOPDAEMON) {
struct ip_vs_daemon_user *dm = (struct ip_vs_daemon_user *)arg;
 
-   mutex_lock(&ipvs->sync_mutex);
-   if (cmd == IP_VS_SO_SET_STARTDAEMON)
+   if (cmd == IP_VS_SO_SET_STARTDAEMON) {
+   rtnl_lock();
+   mutex_lock(&ipvs->sync_mutex);
ret = start_sync_thread(net, dm->state, dm->mcast_ifn,
dm->syncid);
-   else
+   mutex_unlock(&ipvs->sync_mutex);
+   rtnl_unlock();
+   } else {
+   mutex_lock(&ipvs->sync_mutex);
ret = stop_sync_thread(net, dm->state);
-   mutex_unlock(&ipvs->sync_mutex);
+   mutex_unlock(&ipvs->sync_mutex);
+   }
goto out_dec;
}
 
@@ -3342,6 +3347,9 @@ nla_put_failure:
 
 static int ip_vs_genl_new_daemon(struct net *net, struct nlattr **attrs)
 {
+   struct netns_ipvs *ipvs = net_ipvs(net);
+   int ret;
+
if (!(attrs[IPVS_DAEMON_ATTR_STATE] &&
  attrs[IPVS_DAEMON_ATTR_MCAST_IFN] &&
  attrs[IPVS_DAEMON_ATTR_SYNC_ID]))
@@ -3353,19 +3361,30 @@ static int ip_vs_genl_new_daemon(struct net *net, struct nlattr **attrs)
if (net_ipvs(net)->mixed_address_family_dests > 0)
return -EINVAL;
 
-   return start_sync_thread(net,
-nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]),
-nla_data(attrs[IPVS_DAEMON_ATTR_MCAST_IFN]),
-nla_get_u32(attrs[IPVS_DAEMON_ATTR_SYNC_ID]));
+   rtnl_lock();
+   mutex_lock(&ipvs->sync_mutex);
+   ret = start_sync_thread(net,
+   nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]),
+   nla_data(attrs[IPVS_DAEMON_ATTR_MCAST_IFN]),
+   nla_get_u32(attrs[IPVS_DAEMON_ATTR_SYNC_ID]));
+   mutex_unlock(&ipvs->sync_mutex);
+   rtnl_unlock();
+   return ret;
 }
 
 static int ip_vs_genl_del_daemon(struct net *net, struct nlattr **attrs)
 {
+   struct netns_ipvs *ipvs = net_ipvs(net);
+   int ret;
+
if (!attrs[IPVS_DAEMON_ATTR_STATE])
return -EINVAL;
 
-   return stop_sync_thread(net,
-   nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]));
+   mutex_lock(&ipvs->sync_mutex);
+   ret = stop_sync_thread(net,
+  nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]));
+   mutex_unlock(&ipvs->sync_mutex);
+   return ret;
 }
 
 static int ip_vs_genl_set_config(struct net *net, struct nlattr **attrs)
@@ -3389,7 +3408,7 @@ static int ip_vs_genl_set_config(struct net *net, struct nlattr **attrs)
 
 static int ip_vs_genl_set_daemon(struct sk_buff *skb, struct genl_info *info)
 {
-   int ret = 0, cmd;
+   int ret = -EINVAL, cmd;
struct net *net;
struct netns_ipvs *ipvs;
 
@@ -3400,22 +3419,19 @@ static int ip_vs_genl_set_daemon(struct sk_buff *skb, struct genl_info *info)
if (cmd == IPVS_CMD_NEW_DAEMON || cmd == IPVS_CMD_DEL_DAEMON) {
struct nlattr *daemon_attrs[IPVS_DAEMON_ATTR_MAX + 1];
 
-   mutex_lock(&ipvs->sync_mutex);
if (!info->attrs[IPVS_CMD_ATTR_DAEMON] ||
nla_parse_nested(daemon_attrs, IPVS_DAEMON_ATTR_MAX,
 info->attrs[IPVS_CMD_ATTR_DAEMON],
-ip_vs_daemon_policy)) {
-   ret = -EINVAL;
+ip_vs_daemon_policy))
goto out;
-   }

[PATCH 1/4 nf-next] ipvs: Add ovf scheduler

2015-08-21 Thread Simon Horman

From: Raducu Deaconu rhadoo.i...@gmail.com

The weighted overflow scheduling algorithm directs network connections
to the server with the highest weight that is currently available
and overflows to the next when active connections exceed the node's weight.
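
The selection rule described above can be tried out in userspace with a small sketch. The struct and function names are hypothetical simplifications; plain ints replace the kernel's atomics and a flat array replaces the RCU-protected destination list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one real server. */
struct toy_dest {
	int weight;
	int activeconns;
	bool overloaded;
};

/* Pick the highest-weight destination that is not overloaded, has a
 * non-zero weight, and whose active connections do not exceed its weight.
 * Returns NULL when no destination qualifies.
 */
static const struct toy_dest *toy_ovf_pick(const struct toy_dest *d, size_t n)
{
	const struct toy_dest *best = NULL;

	assert(d != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		int w = d[i].weight;

		if (d[i].overloaded || d[i].activeconns > w || w == 0)
			continue;
		if (!best || w > best->weight)
			best = &d[i];
	}
	return best;
}
```

With weights 100 and 50, all traffic goes to the first server until its active connections exceed 100, at which point new connections overflow to the second one.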

Signed-off-by: Raducu Deaconu rhadoo.i...@gmail.com
Acked-by: Julian Anastasov j...@ssi.bg
Signed-off-by: Simon Horman ho...@verge.net.au
---
 net/netfilter/ipvs/Kconfig | 11 ++
 net/netfilter/ipvs/Makefile|  1 +
 net/netfilter/ipvs/ip_vs_ovf.c | 86 ++
 3 files changed, 98 insertions(+)
 create mode 100644 net/netfilter/ipvs/ip_vs_ovf.c

diff --git a/net/netfilter/ipvs/Kconfig b/net/netfilter/ipvs/Kconfig
index 3b6929dec748..b32fb0dbe237 100644
--- a/net/netfilter/ipvs/Kconfig
+++ b/net/netfilter/ipvs/Kconfig
@@ -162,6 +162,17 @@ config  IP_VS_FO
  If you want to compile it in kernel, say Y. To compile it as a
  module, choose M here. If unsure, say N.
 
+config  IP_VS_OVF
+   tristate "weighted overflow scheduling"
+   ---help---
+ The weighted overflow scheduling algorithm directs network
+ connections to the server with the highest weight that is
+ currently available and overflows to the next when active
+ connections exceed the node's weight.
+
+ If you want to compile it in kernel, say Y. To compile it as a
+ module, choose M here. If unsure, say N.
+
 config IP_VS_LBLC
tristate "locality-based least-connection scheduling"
---help---
diff --git a/net/netfilter/ipvs/Makefile b/net/netfilter/ipvs/Makefile
index 38b2723b2e3d..67f3f4389602 100644
--- a/net/netfilter/ipvs/Makefile
+++ b/net/netfilter/ipvs/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_IP_VS_WRR) += ip_vs_wrr.o
 obj-$(CONFIG_IP_VS_LC) += ip_vs_lc.o
 obj-$(CONFIG_IP_VS_WLC) += ip_vs_wlc.o
 obj-$(CONFIG_IP_VS_FO) += ip_vs_fo.o
+obj-$(CONFIG_IP_VS_OVF) += ip_vs_ovf.o
 obj-$(CONFIG_IP_VS_LBLC) += ip_vs_lblc.o
 obj-$(CONFIG_IP_VS_LBLCR) += ip_vs_lblcr.o
 obj-$(CONFIG_IP_VS_DH) += ip_vs_dh.o
diff --git a/net/netfilter/ipvs/ip_vs_ovf.c b/net/netfilter/ipvs/ip_vs_ovf.c
new file mode 100644
index 000000000000..f7d62c3b7329
--- /dev/null
+++ b/net/netfilter/ipvs/ip_vs_ovf.c
@@ -0,0 +1,86 @@
+/*
+ * IPVS:Overflow-Connection Scheduling module
+ *
+ * Authors: Raducu Deaconu rhadoo...@yahoo.com
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ *
+ * Scheduler implements overflow loadbalancing according to number of active
+ * connections , will keep all conections to the node with the highest weight
+ * and overflow to the next node if the number of connections exceeds the node's
+ * weight.
+ * Note that this scheduler might not be suitable for UDP because it only uses
+ * active connections
+ *
+ */
+
+#define KMSG_COMPONENT "IPVS"
+#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+
+#include <net/ip_vs.h>
+
+/* OVF Connection scheduling  */
+static struct ip_vs_dest *
+ip_vs_ovf_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
+  struct ip_vs_iphdr *iph)
+{
+   struct ip_vs_dest *dest, *h = NULL;
+   int hw = 0, w;
+
+   IP_VS_DBG(6, "ip_vs_ovf_schedule(): Scheduling...\n");
+   /* select the node with highest weight, go to next in line if active
+   * connections exceed weight
+   */
+   list_for_each_entry_rcu(dest, &svc->destinations, n_list) {
+   w = atomic_read(&dest->weight);
+   if ((dest->flags & IP_VS_DEST_F_OVERLOAD) ||
+   atomic_read(&dest->activeconns) > w ||
+   w == 0)
+   continue;
+   if (!h || w > hw) {
+   h = dest;
+   hw = w;
+   }
+   }
+
+   if (h) {
+   IP_VS_DBG_BUF(6, "OVF: server %s:%u active %d w %d\n",
+ IP_VS_DBG_ADDR(h->af, &h->addr),
+ ntohs(h->port),
+ atomic_read(&h->activeconns),
+ atomic_read(&h->weight));
+   return h;
+   }
+
+   ip_vs_scheduler_err(svc, "no destination available");
+   return NULL;
+}
+
+static struct ip_vs_scheduler ip_vs_ovf_scheduler = {
+   .name = "ovf",
+   .refcnt =   ATOMIC_INIT(0),
+   .module =   THIS_MODULE,
+   .n_list =   LIST_HEAD_INIT(ip_vs_ovf_scheduler.n_list),
+   .schedule = ip_vs_ovf_schedule,
+};
+
+static int __init ip_vs_ovf_init(void)
+{
+   return register_ip_vs_scheduler(&ip_vs_ovf_scheduler);
+}
+
+static void __exit ip_vs_ovf_cleanup(void)
+{
+

[PATCH] /net/ethernet/3com/3c59x.c:Fixed coding style errors and warnings.

2015-08-21 Thread Ravinder Atla

Checks are also cleared
Signed-off-by: Ravinder Atla rednivara...@gmail.com
---
 drivers/net/ethernet/3com/3c59x.c | 39 +--
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index 2d1ce3c..18d242c 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -208,15 +208,18 @@ limit of 4K.
of the drivers, and will likely be provided by some future kernel.
 */
 enum pci_flags_bit {
-   PCI_USES_MASTER=4,
+   PCI_USES_MASTER = 4,
 };
 
-enum { IS_VORTEX=1, IS_BOOMERANG=2, IS_CYCLONE=4, IS_TORNADO=8,
-   EEPROM_8BIT=0x10,   /* AKPM: Uses 0x230 as the base bitmaps for 
EEPROM reads */
-   HAS_PWR_CTRL=0x20, HAS_MII=0x40, HAS_NWAY=0x80, HAS_CB_FNS=0x100,
-   INVERT_MII_PWR=0x200, INVERT_LED_PWR=0x400, MAX_COLLISION_RESET=0x800,
-   EEPROM_OFFSET=0x1000, HAS_HWCKSM=0x2000, WNO_XCVR_PWR=0x4000,
-   EXTRA_PREAMBLE=0x8000, EEPROM_RESET=0x1, };
+enum { IS_VORTEX=1, IS_BOOMERANG = 2, IS_CYCLONE = 4, IS_TORNADO = 8,
+   EEPROM_8BIT = 0x10,
+   /* AKPM: Uses 0x230 as the base bitmaps for EEPROM reads */
+   HAS_PWR_CTRL = 0x20, HAS_MII = 0x40, HAS_NWAY = 0x80,
+   HAS_CB_FNS = 0x100,
+   INVERT_MII_PWR = 0x200, INVERT_LED_PWR = 0x400,
+   MAX_COLLISION_RESET = 0x800,
+   EEPROM_OFFSET = 0x1000, HAS_HWCKSM = 0x2000, WNO_XCVR_PWR = 0x4000,
+   EXTRA_PREAMBLE = 0x8000, EEPROM_RESET = 0x1, };
 
 enum vortex_chips {
CH_3C590 = 0,
@@ -267,7 +270,6 @@ enum vortex_chips {
CH_920B_EMB_WNM,
 };
 
-
 /* note: this array directly indexed by above enums, and MUST
  * be kept in sync with both the enums above, and the PCI device
  * table below
@@ -280,9 +282,9 @@ static struct vortex_chip_info {
 } vortex_info_tbl[] = {
{3c590 Vortex 10Mbps,
 PCI_USES_MASTER, IS_VORTEX, 32, },
-   {3c592 EISA 10Mbps Demon/Vortex,  
/* AKPM: from Don's 3c59x_cb.c 0.49H */
+   {3c592 EISA 10Mbps Demon/Vortex,  /* AKPM: from Don's 3c59x_cb.c 
0.49H */
 PCI_USES_MASTER, IS_VORTEX, 32, },
-   {3c597 EISA Fast Demon/Vortex,
/* AKPM: from Don's 3c59x_cb.c 0.49H */
+   {3c597 EISA Fast Demon/Vortex,/* AKPM: from Don's 3c59x_cb.c 
0.49H */
 PCI_USES_MASTER, IS_VORTEX, 32, },
{3c595 Vortex 100baseTx,
 PCI_USES_MASTER, IS_VORTEX, 32, },
@@ -292,15 +294,15 @@ static struct vortex_chip_info {
{3c595 Vortex 100base-MII,
 PCI_USES_MASTER, IS_VORTEX, 32, },
{3c900 Boomerang 10baseT,
-PCI_USES_MASTER, IS_BOOMERANG|EEPROM_RESET, 64, },
+PCI_USES_MASTER, IS_BOOMERANG | EEPROM_RESET, 64, },
{3c900 Boomerang 10Mbps Combo,
-PCI_USES_MASTER, IS_BOOMERANG|EEPROM_RESET, 64, },
-   {3c900 Cyclone 10Mbps TPO,
/* AKPM: from Don's 0.99M */
+PCI_USES_MASTER, IS_BOOMERANG | EEPROM_RESET, 64, },
+   {3c900 Cyclone 10Mbps TPO,/* AKPM: from Don's 0.99M */
 PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, },
{3c900 Cyclone 10Mbps Combo,
 PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, },
 
-   {3c900 Cyclone 10Mbps TPC,
/* AKPM: from Don's 0.99M */
+   {3c900 Cyclone 10Mbps TPC,/* AKPM: from Don's 0.99M */
 PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, },
{3c900B-FL Cyclone 10base-FL,
 PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, },
@@ -331,8 +333,8 @@ static struct vortex_chip_info {
{3c555 Laptop Hurricane,
 PCI_USES_MASTER, IS_CYCLONE|EEPROM_8BIT|HAS_HWCKSM, 128, },
{3c556 Laptop Tornado,
-PCI_USES_MASTER, 
IS_TORNADO|HAS_NWAY|EEPROM_8BIT|HAS_CB_FNS|INVERT_MII_PWR|
-   
HAS_HWCKSM, 128, },
+PCI_USES_MASTER, IS_TORNADO | HAS_NWAY | EEPROM_8BIT | HAS_CB_FNS | 
+   INVERT_MII_PWR | HAS_HWCKSM, 128, },
{3c556B Laptop Hurricane,
 PCI_USES_MASTER, 
IS_TORNADO|HAS_NWAY|EEPROM_OFFSET|HAS_CB_FNS|INVERT_MII_PWR|
WNO_XCVR_PWR|HAS_HWCKSM, 128, },
@@ -474,13 +476,14 @@ enum vortex_status {
On the Vortex this window is always mapped at offsets 0x10-0x1f. */
 enum Window1 {
TX_FIFO = 0x10,  RX_FIFO = 0x10,  RxErrors = 0x14,
-   RxStatus = 0x18,  Timer=0x1A, TxStatus = 0x1B,
+   RxStatus = 0x18,  Timer = 0x1A, TxStatus = 0x1B,
TxFree = 0x1C, /* Remaining free bytes in Tx buffer. */
 };
+
 enum Window0 {
Wn0EepromCmd = 10,  /* Window 0: EEPROM command register. */
Wn0EepromData = 12, /* Window 0: EEPROM results register. */
-   IntrStatus=0x0E,/* Valid in all windows. */
+   IntrStatus = 0x0E,  /*

ipg and dl2k mess

2015-08-21 Thread Ondrej Zary

Hello,
I've got an Asus NX1101 card with ICPlus IP1000A chip:
02:01.0 Ethernet controller [0200]: Sundance Technology Inc / IC Plus Corp 
IP1000 Family Gigabit Ethernet [13f0:1023] (rev 41)
Subsystem: ASUSTeK Computer Inc. NX1101 [1043:8180]
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 19
I/O ports at a000 [size=256]
Memory at f500 (32-bit, non-prefetchable) [size=256]
[virtual] Expansion ROM at 3000 [disabled] [size=64K]
Capabilities: [50] Power Management version 2

It does not work properly because the ipg driver is broken - it loses packets
(easily reproduced by ping -f) and stops working under load with no messages
(copying a 200MB file using scp at 100mbit is enough to reproduce it).

The dl2k (for TC902x chips, DL2000 is probably a rebranded TC902x) driver is
very similar to ipg (for IP1000A). According to datasheets, IP1000A chip looks
like a TC9021 with integrated PHY.

The patch below is enough to make my IP1000A card work with dl2k driver - no
more lost packets and hangs. Haven't tested gigabit speed yet - the PHY will
probably need some tweaking but that should be easy.

So maybe we should add IP1000A support to dl2k and remove the broken ipg
driver. Does anyone have HW to test?

diff --git a/drivers/net/ethernet/dlink/dl2k.c 
b/drivers/net/ethernet/dlink/dl2k.c
index cf0a5fc..d5a60fe 100644
--- a/drivers/net/ethernet/dlink/dl2k.c
+++ b/drivers/net/ethernet/dlink/dl2k.c
@@ -433,9 +455,9 @@ rio_open (struct net_device *dev)
 
 	alloc_list (dev);
 
-	/* Get station address */
-	for (i = 0; i < 6; i++)
-		dw8(StationAddr0 + i, dev->dev_addr[i]);
+	/* Set station address */
+	for (i = 0; i < 3; i++)
+		dw16(StationAddr0 + 2 * i, cpu_to_le16(((u16 *)dev->dev_addr)[i]));
 
 	set_multicast (dev);
 	if (np->coalesce) {
diff --git a/drivers/net/ethernet/dlink/dl2k.h 
b/drivers/net/ethernet/dlink/dl2k.h
index 23c07b0..da35e66 100644
--- a/drivers/net/ethernet/dlink/dl2k.h
+++ b/drivers/net/ethernet/dlink/dl2k.h
@@ -411,6 +411,7 @@ struct netdev_private {
 static const struct pci_device_id rio_pci_tbl[] = {
{0x1186, 0x4000, PCI_ANY_ID, PCI_ANY_ID, },
{0x13f0, 0x1021, PCI_ANY_ID, PCI_ANY_ID, },
+   {0x13f0, 0x1023, PCI_ANY_ID, PCI_ANY_ID, },
{ }
 };
 MODULE_DEVICE_TABLE (pci, rio_pci_tbl);


-- 
Ondrej Zary
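
The patch above replaces six 8-bit station address writes with three 16-bit writes. A small user-space sketch of which value each 16-bit write carries, assuming the register block is little-endian so that the low byte of each word lands at the lower offset (mac_to_words() is a hypothetical helper, not part of dl2k):

```c
#include <stdint.h>

/*
 * Pack a 6-byte MAC address into three 16-bit values so that, when each
 * value is stored little-endian, the bytes appear in their original order.
 */
void mac_to_words(const uint8_t mac[6], uint16_t words[3])
{
	for (int i = 0; i < 3; i++)
		words[i] = (uint16_t)(mac[2 * i] | (mac[2 * i + 1] << 8));
}
```

For the address 00:11:22:33:44:55 this yields the words 0x1100, 0x3322 and 0x5544.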

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Francois Romieu

Corinna Vinschen vinsc...@redhat.com :
[...]
> diff --git a/drivers/net/ethernet/realtek/r8169.c 
> b/drivers/net/ethernet/realtek/r8169.c
> index f790f61..f26a48d 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
[...]
> @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset)
>  	}
>  }
>  
> +DECLARE_RTL_COND(rtl_reset_counters_cond)
> +{
> +	void __iomem *ioaddr = tp->mmio_addr;
> +
> +	return RTL_R32(CounterAddrLow) & CounterReset;
> +}
> +
> +static void rtl8169_reset_counters(struct net_device *dev)
> +{

rtl8169_reset_counters duplicates most of rtl8169_update_counters. Please
factor out the dma_alloc + parametrized CounterAddrLow write + cleanup.

[...]
> @@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device *dev)
>  	dma_free_coherent(d, sizeof(*counters), counters, paddr);
>  }
>  
> +static void rtl8169_init_counter_offsets(struct net_device *dev)
> +{
> +	struct rtl8169_private *tp = netdev_priv(dev);
> +
> +	/*
> +	 * rtl8169_init_counter_offsets is called from rtl_open.  On chip
> +	 * versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
> +	 * reset by a power cycle, while the counter values collected by the
> +	 * driver are reset at every driver unload/load cycle.
> +	 *
> +	 * To make sure the HW values returned by @get_stats64 match the SW
> +	 * values, we collect the initial values at first open(*) and use them
> +	 * as offsets to normalize the values returned by @get_stats64.
> +	 *
> +	 * (*) We can't call rtl8169_init_counter_offsets from rtl_init_one
> +	 * for the reason stated in rtl8169_update_counters; CmdRxEnb is only
> +	 * set at open time by rtl_hw_start.
> +	 */
> +
> +	if (tp->tc_offset.inited)
> +		return;
> +
> +	rtl8169_reset_counters(dev);
> +
> +	rtl8169_update_counters(dev);


The code should propagate failure when both rtl8169_reset_counters and
rtl8169_update_counters fail.

--
Ueimor
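
The offset scheme described in the quoted comment, together with the request to propagate failure, can be sketched in user space. This is only an illustration under stated assumptions: struct hw_counters, struct tally_state, the dump callbacks and the return convention are all hypothetical and are not taken from r8169.

```c
#include <stdbool.h>
#include <stdint.h>

struct hw_counters {
	uint64_t tx_packets;
	uint64_t rx_packets;
};

struct tally_state {
	struct hw_counters offset;	/* values seen at first open */
	bool inited;
};

/* Hypothetical stand-in for a hardware counter dump; false means timeout. */
typedef bool (*dump_fn)(struct hw_counters *out);

bool dump_ok(struct hw_counters *out)
{
	out->tx_packets = 1000;
	out->rx_packets = 500;
	return true;
}

bool dump_fail(struct hw_counters *out)
{
	(void)out;
	return false;
}

/* Record the initial counters once; report failure instead of hiding it. */
int init_counter_offsets(struct tally_state *st, dump_fn dump)
{
	struct hw_counters now;

	if (st->inited)
		return 0;
	if (!dump(&now))
		return -1;	/* offsets stay unset, caller may retry */
	st->offset = now;
	st->inited = true;
	return 0;
}

/* Value reported to the stack: raw hardware value minus the offset. */
uint64_t normalized_tx(const struct tally_state *st,
		       const struct hw_counters *now)
{
	return now->tx_packets - st->offset.tx_packets;
}
```

Leaving inited false on failure means a later open can try again rather than normalizing against zeroed offsets.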

Re: [PATCH ipsec-next v2] xfrm: Use VRF master index if output device is enslaved

2015-08-21 Thread Steffen Klassert

On Fri, Aug 21, 2015 at 02:11:21AM +0300, Nikolay Aleksandrov wrote:
 
  On Aug 21, 2015, at 1:06 AM, David Ahern d...@cumulusnetworks.com wrote:
  
  Directs route lookups to VRF table. Compiles out if NET_VRF is not
  enabled. With this patch able to successfully bring up ipsec tunnels
  in VRFs, even with duplicate network configuration.
  
  Signed-off-by: David Ahern d...@cumulusnetworks.com
  ---
  v2
  - use vrf_master_ifindex rather than vrf_master_ifindex_rcu
  
  net/ipv4/xfrm4_policy.c | 7 +--
  net/ipv6/xfrm6_policy.c | 7 +--
  2 files changed, 10 insertions(+), 4 deletions(-)
  
 
 Looks good to me,
 
 Acked-by: Nikolay Aleksandrov niko...@cumulusnetworks.com

David, can you please take this directly into net-next?

Acked-by: Steffen Klassert steffen.klass...@secunet.com

[PATCH] rsi: Fix possible leak when loading firmware

2015-08-21 Thread Christian Engelmayer

Commit 5d5cd85ff441 (rsi: Fix failure to load firmware after memory
leak fix and fix the leak) also added a check on the allocation of
DMA-accessible memory that may directly return. In that case the
already allocated firmware data is leaked. Make sure the data is
always freed correctly. Detected by Coverity CID 1316519.

Signed-off-by: Christian Engelmayer cenge...@gmx.at
---
Compile tested only.
---
 drivers/net/wireless/rsi/rsi_91x_sdio_ops.c | 8 ++--
 drivers/net/wireless/rsi/rsi_91x_usb_ops.c  | 8 ++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c 
b/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c
index 1c6788aecc62..40d72312f3df 100644
--- a/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c
+++ b/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c
@@ -203,8 +203,10 @@ static int rsi_load_ta_instructions(struct rsi_common 
*common)
 
/* Copy firmware into DMA-accessible memory */
fw = kmemdup(fw_entry->data, fw_entry->size, GFP_KERNEL);
-   if (!fw)
-   return -ENOMEM;
+   if (!fw) {
+   status = -ENOMEM;
+   goto out;
+   }
len = fw_entry-size;
 
if (len % 4)
@@ -217,6 +219,8 @@ static int rsi_load_ta_instructions(struct rsi_common 
*common)
 
status = rsi_copy_to_card(common, fw, len, num_blocks);
kfree(fw);
+
+out:
release_firmware(fw_entry);
return status;
 }
diff --git a/drivers/net/wireless/rsi/rsi_91x_usb_ops.c 
b/drivers/net/wireless/rsi/rsi_91x_usb_ops.c
index 30c2cf7fa93b..de4900862836 100644
--- a/drivers/net/wireless/rsi/rsi_91x_usb_ops.c
+++ b/drivers/net/wireless/rsi/rsi_91x_usb_ops.c
@@ -148,8 +148,10 @@ static int rsi_load_ta_instructions(struct rsi_common 
*common)
 
/* Copy firmware into DMA-accessible memory */
fw = kmemdup(fw_entry->data, fw_entry->size, GFP_KERNEL);
-   if (!fw)
-   return -ENOMEM;
+   if (!fw) {
+   status = -ENOMEM;
+   goto out;
+   }
len = fw_entry-size;
 
if (len % 4)
@@ -162,6 +164,8 @@ static int rsi_load_ta_instructions(struct rsi_common 
*common)
 
status = rsi_copy_to_card(common, fw, len, num_blocks);
kfree(fw);
+
+out:
release_firmware(fw_entry);
return status;
 }
-- 
1.9.1
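
The fix follows the usual single-exit cleanup pattern. A user-space sketch of the same shape, with hypothetical stand-ins (sketch_request_fw(), sketch_release_fw(), load_instructions()) in place of the firmware API and a counter that makes a leak observable:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int fw_refs;	/* number of firmware images currently held */

static const unsigned char fw_image[] = { 0xde, 0xad, 0xbe, 0xef };

/* Hypothetical stand-ins for requesting and releasing a firmware image. */
static void sketch_request_fw(void) { fw_refs++; }
static void sketch_release_fw(void) { fw_refs--; }

int load_instructions(int simulate_oom)
{
	unsigned char *copy;
	int status = 0;

	sketch_request_fw();

	copy = simulate_oom ? NULL : malloc(sizeof(fw_image));
	if (!copy) {
		status = -ENOMEM;
		goto out;	/* a plain return here would leak the image */
	}
	memcpy(copy, fw_image, sizeof(fw_image));
	/* ... hand the copy to the device ... */
	free(copy);
out:
	sketch_release_fw();
	return status;
}

int firmware_refs(void)
{
	return fw_refs;
}
```

Whichever path is taken, firmware_refs() is back to zero when load_instructions() returns.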


[PATCH v2 net-next] tcp: fix slow start after idle vs TSO/GSO

2015-08-21 Thread Eric Dumazet

From: Eric Dumazet eduma...@google.com

slow start after idle might reduce cwnd, but we perform this
after first packet was cooked and sent.

With TSO/GSO, it means that we might send a full TSO packet
even if cwnd should have been reduced to IW10.

Moving the SSAI check in skb_entail() makes sense, because
we slightly reduce number of times this check is done,
especially for large send() and TCP Small queue callbacks from
softirq context.

As Neal pointed out, we also need to perform the check
if/when receive window opens.

Tested:

Following packetdrill test demonstrates the problem
// Test of slow start after idle

`sysctl -q net.ipv4.tcp_slow_start_after_idle=1`

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0    setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0    bind(3, ..., ...) = 0
+0    listen(3, 1) = 0

+0    < S 0:0(0) win 65535 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.100 < . 1:1(0) ack 1 win 511
+0    accept(3, ..., ...) = 4
+0    setsockopt(4, SOL_SOCKET, SO_SNDBUF, [20], 4) = 0

+0    write(4, ..., 26000) = 26000
+0    > . 1:5001(5000) ack 1
+0    > . 5001:10001(5000) ack 1
+0    %{ assert tcpi_snd_cwnd == 10 }%

+.100 < . 1:1(0) ack 10001 win 511
+0    %{ assert tcpi_snd_cwnd == 20, tcpi_snd_cwnd }%
+0    > . 10001:20001(10000) ack 1
+0    > P. 20001:26001(6000) ack 1

+.100 < . 1:1(0) ack 26001 win 511
+0    %{ assert tcpi_snd_cwnd == 36, tcpi_snd_cwnd }%

+4    write(4, ..., 2) = 2
// If slow start after idle works properly, we should send 5 MSS here (cwnd/2)
+0    > . 26001:31001(5000) ack 1
+0    %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }%
+0    > . 31001:36001(5000) ack 1

Signed-off-by: Eric Dumazet eduma...@google.com
Cc: Neal Cardwell ncardw...@google.com
Cc: Yuchung Cheng ych...@google.com
---
 include/net/tcp.h |   13 +
 net/ipv4/tcp.c|2 ++
 net/ipv4/tcp_input.c  |3 +++
 net/ipv4/tcp_output.c |   12 
 4 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 364426a..309801f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1165,6 +1165,19 @@ static inline void tcp_sack_reset(struct 
tcp_options_received *rx_opt)
 }
 
 u32 tcp_default_init_rwnd(u32 mss);
+void tcp_cwnd_restart(struct sock *sk, s32 delta);
+
+static inline void tcp_slow_start_after_idle_check(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	s32 delta;
+
+	if (!sysctl_tcp_slow_start_after_idle || tp->packets_out)
+		return;
+	delta = tcp_time_stamp - tp->lsndtime;
+	if (delta > inet_csk(sk)->icsk_rto)
+		tcp_cwnd_restart(sk, delta);
+}
 
 /* Determine a window scaling and initial window to offer. */
 void tcp_select_initial_window(int __space, __u32 mss, __u32 *rcv_wnd,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 45534a5..b8b8fa1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -627,6 +627,8 @@ static void skb_entail(struct sock *sk, struct sk_buff *skb)
 	sk_mem_charge(sk, skb->truesize);
 	if (tp->nonagle & TCP_NAGLE_PUSH)
 		tp->nonagle &= ~TCP_NAGLE_PUSH;
+
+	tcp_slow_start_after_idle_check(sk);
 }
 
 static inline void tcp_mark_urg(struct tcp_sock *tp, int flags)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4e4d6bc..0abca28 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3332,6 +3332,9 @@ static int tcp_ack_update_window(struct sock *sk, const 
struct sk_buff *skb, u32
 			tp->pred_flags = 0;
 			tcp_fast_path_check(sk);
 
+			if (tcp_send_head(sk))
+				tcp_slow_start_after_idle_check(sk);
+
 			if (nwin > tp->max_window) {
 				tp->max_window = nwin;
 				tcp_sync_mss(sk, inet_csk(sk)->icsk_pmtu_cookie);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 444ab5b..1188e4f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -137,12 +137,12 @@ static __u16 tcp_advertise_mss(struct sock *sk)
 }
 
 /* RFC2861. Reset CWND after idle period longer RTO to restart window.
- * This is the first part of cwnd validation mechanism. */
-static void tcp_cwnd_restart(struct sock *sk, const struct dst_entry *dst)
+ * This is the first part of cwnd validation mechanism.
+ */
+void tcp_cwnd_restart(struct sock *sk, s32 delta)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	s32 delta = tcp_time_stamp - tp->lsndtime;
-	u32 restart_cwnd = tcp_init_cwnd(tp, dst);
+	u32 restart_cwnd = tcp_init_cwnd(tp, __sk_dst_get(sk));
 	u32 cwnd = tp->snd_cwnd;
 
tcp_ca_event(sk, CA_EVENT_CWND_RESTART);
@@ -164,10 +164,6 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
struct inet_connection_sock *icsk = inet_csk(sk);
const u32 now = tcp_time_stamp;
 
-   if (sysctl_tcp_slow_start_after_idle 
-
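
The idle check and the window restart can be modelled in user space. This is a simplified sketch under assumptions: struct sketch_tcp and both function names are hypothetical, time is in arbitrary ticks, and the halving loop is modelled on a reading of tcp_cwnd_restart() that leaves out ssthresh and the cwnd timestamp bookkeeping.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_tcp {
	uint32_t snd_cwnd;	/* congestion window, in packets */
	uint32_t packets_out;	/* packets currently in flight */
	uint32_t init_cwnd;	/* initial window, e.g. 10 for IW10 */
	int32_t rto;		/* retransmission timeout, in ticks */
	uint32_t lsndtime;	/* time of the last send, in ticks */
};

/* Halve cwnd once per RTO of idle time, but not below the restart window. */
static void cwnd_restart(struct sketch_tcp *tp, int32_t delta)
{
	uint32_t cwnd = tp->snd_cwnd;
	uint32_t restart = tp->init_cwnd < cwnd ? tp->init_cwnd : cwnd;

	while ((delta -= tp->rto) > 0 && cwnd > restart)
		cwnd >>= 1;
	tp->snd_cwnd = cwnd > restart ? cwnd : restart;
}

/* Run before queueing new data: only an idle flow is restarted. */
void slow_start_after_idle_check(struct sketch_tcp *tp, uint32_t now,
				 bool enabled)
{
	int32_t delta;

	if (!enabled || tp->packets_out)
		return;
	delta = (int32_t)(now - tp->lsndtime);
	if (delta > tp->rto)
		cwnd_restart(tp, delta);
}
```

With cwnd 36, an initial window of 10 and an idle period of many RTOs, the window ends at 10, which is the value the packetdrill script asserts after the 4 second pause.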

Re: [PATCH net-next v2] tcp: reduce cpu usage under tcp memory pressure when SO_SNDBUF is set

2015-08-21 Thread Jason Baron



On 08/11/2015 01:59 PM, Jason Baron wrote:
 
 
 On 08/11/2015 12:12 PM, Eric Dumazet wrote:
 On Tue, 2015-08-11 at 11:03 -0400, Jason Baron wrote:


 Yes, so the test case I'm using to test against is somewhat contrived.
 In that I am simply allocating around 40,000 sockets that are idle to
 create a 'permanent' memory pressure in the background. Then, I have
 just 1 flow that sets SO_SNDBUF, which results in the: poll(), write() loop.

 That said, we encountered this issue initially where we had 10,000+
 flows and whenever the system would get into memory pressure, we would
 see all the cpus spin at 100%.

 So the testcase I wrote, was just a simplistic version for testing. But
 I am going to try and test against the more realistic workload where
 this issue was initially observed.


 Note that I am still trying to understand why we need to increase socket
 structure, for something which is inherently a problem of sharing memory
 with an unknown (potentially big) number of sockets.

 
 I was trying to mirror the wakeups when SO_SNDBUF is not set, where we
 continue to trigger on 1/3 of the buffer being available, as the
 sk-sndbuf is shrunk. And I saw this value as dynamic depending on
 number of sockets and read/write buffer usage. So that's where I was
 coming from with it.
 
 Also, at least with the .config I have the tcp_sock structure didn't
 increase in size (although struct sock did go up by 8 and not 4).
 
 I suggested to use a flag (one bit).

 If set, then we should fallback to tcp_wmem[0] (each socket has 4096
 bytes, so that we can avoid starvation)



 
 Ok, I will test this approach.

Hi Eric,

So I created a test here with 20,000 streams, and if I set SO_SNDBUF
high enough on the server side, I can create tcp memory pressure above
tcp_mem[2]. In this case, with the 'one bit' approach using tcp_wmem[0]
as the wakeup threshold, I can still observe the 100% cpu spinning issue,
since we don't guarantee tcp_wmem[0] above tcp_mem[2]. With this v2
patch, cpu usage is minimal (1-2%). So using the 'one bit' approach
definitely alleviates the spinning between tcp_mem[1] and tcp_mem[2],
but not above tcp_mem[2] in my testing.

Maybe nobody cares about this case (you are getting what you ask for by
using SO_SNDBUF), but it seems to me that it would be nice to avoid this
sort of behavior. I also like the fact that with the
sk_effective_sndbuf, we keep doing wakeups on 1/3 of the write buffer
emptying, which keeps the wakeup behavior consistent. In theory this
would matter for high latency and bandwidth link, but in the testing I
did, I didn't observe any throughput differences between this v2 patch,
and the 'one bit' approach.

As I mentioned with this v2, the 'struct sock' grows by 4 bytes, but
struct tcp_sock does not increase. So since this is tcp specific, we
could add the sk_effective_sndbuf only to the struct tcp_sock.

So the 'one bit' approach definitely seems to me to be an improvement,
but I wanted to get feedback on this testing, before deciding how to
proceed.

Thanks,

-Jason
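
The "wake up when 1/3 of the buffer is available" behaviour mentioned in this thread can be derived from a simple test: a stream socket counts as writeable once its free space is at least half of what is queued. The sketch below assumes that rule; the function names are hypothetical and the byte counts are plain ints.

```c
#include <stdbool.h>

/* Free space left in the send buffer. */
static int stream_wspace(int sndbuf, int queued)
{
	return sndbuf - queued;
}

/* Minimum free space required before a writer is woken up. */
static int stream_min_wspace(int queued)
{
	return queued >> 1;
}

/*
 * Writeable when free >= queued / 2. With free + queued == sndbuf this
 * means queued <= 2/3 of sndbuf, i.e. at least 1/3 of the buffer is free.
 */
bool stream_is_writeable(int sndbuf, int queued)
{
	return stream_wspace(sndbuf, queued) >= stream_min_wspace(queued);
}
```

Because the threshold scales with the buffer size, shrinking the effective send buffer under memory pressure keeps the same 1/3 proportion, which is the consistency argument made above.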

[PATCH] iproute2: provide common json output formatter

2015-08-21 Thread Stephen Hemminger

Formatting JSON is moderately painful.
Provide a simple API to do the syntax formatting.
Use it to replace existing json in *stat commands.

---
 include/json_writer.h |  61 ++
 lib/Makefile  |   3 +-
 lib/json_writer.c | 312 ++
 misc/Makefile |   2 +-
 misc/ifstat.c | 103 +
 misc/lnstat.c |  22 ++--
 misc/nstat.c  |  59 ++
 7 files changed, 477 insertions(+), 85 deletions(-)
 create mode 100644 include/json_writer.h
 create mode 100644 lib/json_writer.c

diff --git a/include/json_writer.h b/include/json_writer.h
new file mode 100644
index 000..ab9a008
--- /dev/null
+++ b/include/json_writer.h
@@ -0,0 +1,61 @@
+/*
+ * Simple streaming JSON writer
+ *
+ * This takes care of the annoying bits of JSON syntax like the commas
+ * after elements
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors:Stephen Hemminger step...@networkplumber.org
+ */
+
+#ifndef _JSON_WRITER_H_
+#define _JSON_WRITER_H_
+
+#include <stdbool.h>
+#include <stdint.h>
+
+/* Opaque class structure */
+typedef struct json_writer json_writer_t;
+
+/* Create a new JSON stream */
+json_writer_t *jsonw_new(FILE *f);
+/* End output to JSON stream */
+void jsonw_destroy(json_writer_t **self_p);
+
+/* Cause output to have pretty whitespace */
+void jsonw_pretty(json_writer_t *self, bool on);
+
+/* Add property name */
+void jsonw_name(json_writer_t *self, const char *name);
+
+/* Add value  */
+void jsonw_string(json_writer_t *self, const char *value);
+void jsonw_bool(json_writer_t *self, bool value);
+void jsonw_float(json_writer_t *self, double number);
+void jsonw_uint(json_writer_t *self, uint64_t number);
+void jsonw_int(json_writer_t *self, int64_t number);
+void jsonw_null(json_writer_t *self);
+
+/* Useful Combinations of name and value */
+void jsonw_string_field(json_writer_t *self, const char *prop, const char *val);
+void jsonw_bool_field(json_writer_t *self, const char *prop, bool value);
+void jsonw_float_field(json_writer_t *self, const char *prop, double num);
+void jsonw_uint_field(json_writer_t *self, const char *prop, uint64_t num);
+void jsonw_int_field(json_writer_t *self, const char *prop, int64_t num);
+void jsonw_null_field(json_writer_t *self, const char *prop);
+
+/* Collections */
+void jsonw_start_object(json_writer_t *self);
+void jsonw_end_object(json_writer_t *self);
+
+void jsonw_start_array(json_writer_t *self);
+void jsonw_end_array(json_writer_t *self);
+
+/* Override default exception handling */
+typedef void (jsonw_err_handler_fn)(const char *);
+
+#endif /* _JSON_WRITER_H_ */
diff --git a/lib/Makefile b/lib/Makefile
index 1d4045f..9d1307d 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -6,7 +6,8 @@ endif
 
 CFLAGS += -fPIC
 
-UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o namespace.o \
+UTILOBJ = utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o \
+   inet_proto.o namespace.o json_writer.o \
names.o color.o
 
 NLOBJ=libgenl.o ll_map.o libnetlink.o
diff --git a/lib/json_writer.c b/lib/json_writer.c
new file mode 100644
index 000..2af16e1
--- /dev/null
+++ b/lib/json_writer.c
@@ -0,0 +1,312 @@
+/*
+ * Simple streaming JSON writer
+ *
+ * This takes care of the annoying bits of JSON syntax like the commas
+ * after elements
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * Authors:Stephen Hemminger step...@networkplumber.org
+ */
+
+#include <stdio.h>
+#include <stdbool.h>
+#include <stdarg.h>
+#include <assert.h>
+#include <malloc.h>
+#include <inttypes.h>
+#include <stdint.h>
+
+#include "json_writer.h"
+
+struct json_writer {
+	FILE		*out;	/* output file */
+	unsigned	depth;  /* nesting */
+	bool		pretty; /* optional whitepace */
+	char		sep;	/* either nul or comma */
+};
+
+/* indentation for pretty print */
+static void jsonw_indent(json_writer_t *self)
+{
+	unsigned i;
+	for (i = 0; i <= self->depth; ++i)
+		fputs("    ", self->out);
+}
+
+/* end current line and indent if pretty printing */
+static void jsonw_eol(json_writer_t *self)
+{
+	if (!self->pretty)
+		return;
+
+	putc('\n', self->out);
+	jsonw_indent(self);
+}
+
+/* If current object is not empty print a comma */
+static void jsonw_eor(json_writer_t *self)
+{
+	if (self->sep != '\0')
+		putc(self->sep, self->out);
+	self->sep = ',';
+}
+
+
+/* Output JSON encoded string */
+/* Handles C escapes, does not do Unicode */

Re: [net-next PATCH 2/3] net: sched: allocate a handle to default qdiscs

2015-08-21 Thread Phil Sutter

On Fri, Aug 21, 2015 at 09:14:58AM -0700, Eric Dumazet wrote:
[...]
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 1fb65f9..ab614ee 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -634,6 +634,11 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
> >  	if (IS_ERR(sch))
> >  		goto errout;
> >  	sch->parent = parentid;
> > +#ifdef CONFIG_NET_SCHED
> > +	sch->handle = qdisc_alloc_handle(dev_queue->dev);
> > +	if (!sch->handle)
> > +		goto errout;
> > +#endif
> >  
> >  	if (!ops->init || ops->init(sch, NULL) == 0)
> >  		return sch;
>
> This might break HTB setups with more than 32768 classes ?

Urgh. Thanks for noticing this!

> The pfifo qdisc that gets attached had no handle.

Yes, looks like I need to leave qdisc_create_dflt() alone. It is
possible, by doing the above twice in sch_generic.c (once in
attach_one_default_qdisc(), and in attach_default_qdiscs() as well).

> qdisc_alloc_handle() has a limited range.

Yes, I noticed this. Handles aren't reused in a running system either,
which might contribute to this problem in other situations.

V2 will follow, thanks again.

Cheers, Phil
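
The 32768 figure can be related to how a 32-bit handle is split into a 16-bit major and a 16-bit minor. The sketch below assumes, and this is only an assumption about the allocator, that automatically assigned handles use minor 0 and majors from 0x8000 upwards. The SK_* macros mirror the usual major/minor masks under hypothetical names, and the result is an upper bound, since some values may be reserved.

```c
#include <stdint.h>

#define SK_H_MAJ_MASK	0xFFFF0000U
#define SK_H_MIN_MASK	0x0000FFFFU
#define SK_H_MAKE(maj, min) \
	(((maj) & SK_H_MAJ_MASK) | ((min) & SK_H_MIN_MASK))

/* Assumption: first handle of the automatically allocated range. */
#define SK_AUTO_FIRST	0x80000000U

uint32_t handle_major(uint32_t handle)
{
	return handle >> 16;
}

/* Upper bound on distinct automatic handles: majors 0x8000..0xFFFF. */
uint32_t auto_handle_capacity(void)
{
	return (SK_H_MAJ_MASK - SK_AUTO_FIRST) / 0x10000U + 1;
}
```

Under that assumption auto_handle_capacity() returns 32768, so giving every default leaf qdisc its own handle would exhaust the range on a large class hierarchy.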

Re: [PATCHv4 net-next 10/10] openvswitch: Allow attaching helpers to ct action

2015-08-21 Thread Pravin Shelar

On Thu, Aug 20, 2015 at 5:47 PM, Joe Stringer joestrin...@nicira.com wrote:
 On 19 August 2015 at 15:57, Pravin Shelar pshe...@nicira.com wrote:
 On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer joestrin...@nicira.com wrote:
 Add support for using conntrack helpers to assist protocol detection.
 The new OVS_CT_ATTR_HELPER attribute of the ct action specifies a helper
 to be used for this connection.

 Example ODP flows allowing FTP connections from ports 1-2:
 in_port=1,tcp,action=ct(helper=ftp,commit),2
 in_port=2,tcp,ct_state=-trk,action=ct(),recirc(1)
 recirc_id=1,in_port=2,tcp,ct_state=+trk-new+est,action=1
 recirc_id=1,in_port=2,tcp,ct_state=+trk+rel,action=1

 Signed-off-by: Joe Stringer joestrin...@nicira.com
 ---
 v2-v3: No change.
 v4: Change error code for unknown helper ENOENT-EINVAL.

 I got following compilation warning :

 net/openvswitch/conntrack.c:352:42: error: incompatible types in
 comparison expression (different address spaces)

 Is this made available via another sparse flag? It looks like it's
 related to the __rcu as you've mentioned below, but I'm not seeing
 this (latest sparse, gcc-4.9.2)

You need to enable RCU space checker in kernel config.

Re: [PATCH 0/4 net-next] enic: add devcmd2

2015-08-21 Thread David Miller

From: Brandeburg, Jesse jesse.brandeb...@intel.com
Date: Fri, 21 Aug 2015 17:19:03 +

 This series introduces a compile error
 drivers/net/ethernet/cisco/enic/enic_main.c: In function 'enic_probe':
 drivers/net/ethernet/cisco/enic/enic_main.c:2490:3: error: label 
 'err_out_vnic_unregister' used but not defined
goto err_out_vnic_unregister;
^

Thanks I'm about to push the following to fix this:


[PATCH] enic: Fix build failure with SRIOV disabled.

err_out_vnic_unregister is used regardless of whether
SRIOV is enabled or not.

Reported-by: Jesse Brandeburg jesse.brangeb...@intel.com
Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/net/ethernet/cisco/enic/enic_main.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c 
b/drivers/net/ethernet/cisco/enic/enic_main.c
index cb1fdc3..3352d02 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2663,8 +2663,8 @@ err_out_disable_sriov_pp:
 		pci_disable_sriov(pdev);
 		enic->priv_flags &= ~ENIC_SRIOV_ENABLED;
 	}
-err_out_vnic_unregister:
 #endif
+err_out_vnic_unregister:
 	vnic_dev_unregister(enic->vdev);
 err_out_iounmap:
 	enic_iounmap(enic);
-- 
1.7.10.4
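
The underlying rule is that a label reached from unconditional code must itself sit outside any #ifdef block. A compilable sketch of the corrected layout, using hypothetical names (SKETCH_OPTIONAL_FEATURE, probe_sketch(), unwind_log) rather than the enic ones:

```c
int unwind_log;	/* bit 0: common unwind ran, bit 1: feature unwind ran */

int probe_sketch(int fail_stage)
{
	int err = 0;

	if (fail_stage == 1) {
		err = -1;
		goto err_out_unregister;	/* used in every configuration */
	}
#ifdef SKETCH_OPTIONAL_FEATURE
	if (fail_stage == 2) {
		err = -2;
		goto err_out_disable_feature;
	}
#endif
	return 0;

#ifdef SKETCH_OPTIONAL_FEATURE
err_out_disable_feature:
	unwind_log |= 2;
#endif
err_out_unregister:	/* outside the #ifdef, so it is always defined */
	unwind_log |= 1;
	return err;
}
```

Had err_out_unregister been placed above the #endif, a build without SKETCH_OPTIONAL_FEATURE would fail with the same "label used but not defined" error reported in this thread.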


[PATCH net-next 1/2] lan78xx: change phy id and fix phy reset issue

2015-08-21 Thread Woojung.Huh

Patch to change internal PHYID to 1 and fix PHY reset issue.

Signed-off-by: Woojung Huh woojung@microchip.com
---
 drivers/net/usb/lan78xx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 39364a4..4bcbf28 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -36,7 +36,7 @@
 #define DRIVER_AUTHOR  WOOJUNG HUH woojung@microchip.com
 #define DRIVER_DESCLAN78XX USB 3.0 Gigabit Ethernet Devices
 #define DRIVER_NAMElan78xx
-#define DRIVER_VERSION 1.0.0
+#define DRIVER_VERSION 1.0.1
 
 #define TX_TIMEOUT_JIFFIES (5 * HZ)
 #define THROTTLE_JIFFIES   (HZ / 8)
@@ -57,7 +57,7 @@
 #define DEFAULT_RX_CSUM_ENABLE (true)
 #define DEFAULT_TSO_CSUM_ENABLE(true)
 #define DEFAULT_VLAN_FILTER_ENABLE (true)
-#define INTERNAL_PHY_ID(2) /* 2: GMII */
+#define INTERNAL_PHY_ID(1)
 #define TX_OVERHEAD(8)
 #define RXW_PADDING2
 
@@ -2003,7 +2003,7 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 			netdev_warn(dev->net, "timeout waiting for PHY Reset");
 			return -EIO;
 		}
-	} while (buf & PMT_CTL_PHY_RST_);
+	} while ((buf & PMT_CTL_PHY_RST_) || !(buf & PMT_CTL_READY_));
 
lan78xx_mii_init(dev);
 
-- 
2.1.4
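
The changed loop condition waits for two things: the reset bit must clear and the ready bit must be set. A user-space sketch of such a bounded poll, with a scripted register in place of the USB register read; the bit values, struct scripted_reg and wait_phy_ready() are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#define SK_PHY_RST	0x10u	/* hypothetical bit: reset still in progress */
#define SK_READY	0x80u	/* hypothetical bit: device reports ready */

/* A register whose successive reads return scripted values. */
struct scripted_reg {
	const uint32_t *vals;
	int n;		/* number of scripted values, last one repeats */
	int reads;	/* how many reads have been performed */
};

static uint32_t read_reg(struct scripted_reg *r)
{
	int idx = r->reads < r->n ? r->reads : r->n - 1;

	r->reads++;
	return r->vals[idx];
}

/* Return 0 once reset has cleared AND ready is set, -1 on timeout. */
int wait_phy_ready(struct scripted_reg *r, int max_polls)
{
	for (int polls = 0; polls < max_polls; polls++) {
		uint32_t buf = read_reg(r);

		if (!(buf & SK_PHY_RST) && (buf & SK_READY))
			return 0;
	}
	return -1;
}
```

A check on the reset bit alone would stop at the first read where the bit is clear, even if the device has not yet reported ready.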

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Corinna Vinschen

On Aug 21 21:39, Francois Romieu wrote:
> Corinna Vinschen vinsc...@redhat.com :
> [...]
> > diff --git a/drivers/net/ethernet/realtek/r8169.c 
> > b/drivers/net/ethernet/realtek/r8169.c
> > index f790f61..f26a48d 100644
> > --- a/drivers/net/ethernet/realtek/r8169.c
> > +++ b/drivers/net/ethernet/realtek/r8169.c
> [...]
> > @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset)
> >  	}
> >  }
> >  
> > +DECLARE_RTL_COND(rtl_reset_counters_cond)
> > +{
> > +	void __iomem *ioaddr = tp->mmio_addr;
> > +
> > +	return RTL_R32(CounterAddrLow) & CounterReset;
> > +}
> > +
> > +static void rtl8169_reset_counters(struct net_device *dev)
> > +{
> 
> rtl8169_reset_counters duplicates most of rtl8169_update_counters. Please
> factor out the dma_alloc + parametrized CounterAddrLow write + cleanup.

Ok, will do.

> > +	rtl8169_reset_counters(dev);
> > +
> > +	rtl8169_update_counters(dev);
> 
> 
> The code should propagate failure when both rtl8169_reset_counters and
> rtl8169_update_counters fail.

This one I don't understand.  Neither failing to reset the counters nor
failing to update the counters is fatal for the driver.  So far the
(unchanged) rtl8169_update_counters doesn't even print a log message,
while a failing reset in rtl8169_reset_counters now does.

Why is that not sufficient?


Thanks,
Corinna



Re: [RFC PATCH v5 net-next 4/4] tcp: add NV congestion control

2015-08-21 Thread Lawrence Brakmo

Kenneth, thank you for your comments, I've implemented most of the
improvements you've mentioned.

I'm finishing the new patch and the updated results, they should
be done by Monday (including cdg).

On 8/5/15, 5:51 PM, knn...@gmail.com on behalf of Kenneth Klette
Jonassen knn...@gmail.com on behalf of kenne...@ifi.uio.no wrote:

> On Wed, Aug 5, 2015 at 3:39 AM, Lawrence Brakmo bra...@fb.com wrote:
>> This is a request for comments.
>
> Nice to see more development on delay-based congestion control.

Thank you.

>
> It would be good to see how NV stacks up against CDG. Any chance of
> adding cdg as a congestion control parameter to your experiments?
> Experiments on NV without its temporary cwnd reductions would also be
> of interest -- to get a reference of how effective this mechanism is.

I'm finishing with cdg experiments, they will be up on Monday together
with an update to the NV patch.
I will also have some experiments with variations in the temporary cwnd
reduction. This mechanism is meant to reduce min_rtt creep, but it is
not always successful. Its drawback is that it can increase high
percentile latency.



>> +#define NV_INIT_RTT  0x
>
> Maybe use U32_MAX?

Done

>
>
>> +static void tcpnv_init(struct sock *sk)
>> +{
>> +	struct tcpnv *ca = inet_csk_ca(sk);
>> +
>> +	tcpnv_reset(ca, sk);
>> +
>> +	ca->nv_min_rtt_reset_jiffies = jiffies + 2*HZ;
>> +	ca->nv_min_rtt = NV_INIT_RTT;
>> +	ca->nv_min_rtt_new = NV_INIT_RTT;
>> +	ca->nv_enable = nv_enable;
>
> Can this assignment be ca->nv_enable = 1? That would match the
> TCP_CA_Open case in tcpnv_state().

Done

>
>
>> +	if (nv_dec_eval_min_calls > 255)
>> +		nv_dec_eval_min_calls = 255;
>> +	if (nv_rtt_min_cnt > 63)
>> +		nv_rtt_min_cnt = 63;
>
> nv_dec_eval_min_calls can be clamped to 0-255 by changing its type to u8.
>
> nv_rtt_min_cnt can also be u8? In struct tcpnv, perhaps move
> nv_rtt_cnt to the available byte.

Done

>
>
>> +static void tcpnv_cong_avoid(struct sock *sk, u32 ack, u32 acked)
>> +{
>> +	struct tcp_sock *tp = tcp_sk(sk);
>> +	struct tcpnv *ca = inet_csk_ca(sk);
>> +
>> +	if (!tcp_is_cwnd_limited(sk))
>> +		return;
>> +
>> +	/* Only grow cwnd if NV has not detected congestion */
>> +	if (nv_enable && ca->nv_enable && !ca->nv_allow_cwnd_growth)
>> +		return;
>
> The check for ca->nv_enable might be overly harsh on some unfortunate
> sockets in TCP_CA_Disorder. Is it needed here?

TCP_CA_Disorder should not affect ca->nv_enable in the new patch.



>> +static void tcpnv_acked(struct sock *sk, struct ack_sample *sample)
>
> Maybe move some of this function to tcpnv_cong_avoid()?

It needs to be here since we need the information provided in the
sample argument.

>
>
>> +{
>> +	const struct inet_connection_sock *icsk = inet_csk(sk);
>> +	struct tcp_sock *tp = tcp_sk(sk);
>> +	struct tcpnv *ca = inet_csk_ca(sk);
>> +	unsigned long now = jiffies;
>> +	s64 rate64 = 0;
>> +	u32 rate, max_win, cwnd_by_slope;
>> +	u32 avg_rtt;
>> +	u32 bytes_acked = 0;
>> +
>> +	/* Some calls are for duplicates without timetamps */
>> +	if (sample->rtt_us < 0)
>> +		return;
>> +
>> +	/* If not in TCP_CA_Open state, skip. */
>> +	if (icsk->icsk_ca_state != TCP_CA_Open)
>> +		return;
>
> Consider using samples in other states too, especially
> TCP_CA_Disorder. Linux 4.2 enhances RTT sampling from SACKs, so any
> non-negative RTT sample should be fully usable.

Done




[PATCH net-next 2/2] lan78xx: update eee code

2015-08-21 Thread Woojung.Huh

Patch to update EEE code.

Signed-off-by: Woojung Huh woojung@microchip.com
---
 drivers/net/usb/lan78xx.c | 44 
 drivers/net/usb/lan78xx.h | 22 +++---
 2 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 4bcbf28..af102b0 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1296,38 +1296,37 @@ static int lan78xx_get_eee(struct net_device *net, 
struct ethtool_eee *edata)
 	if (ret < 0)
 		return ret;
 
+	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
+			       PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT);
+	adv = mmd_eee_adv_to_ethtool_adv_t(buf);
+	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
+			       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
+	lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);
+
 	ret = lan78xx_read_reg(dev, MAC_CR, &buf);
 	if (buf & MAC_CR_EEE_EN_) {
-		buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
-				       PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT);
-		adv = mmd_eee_adv_to_ethtool_adv_t(buf);
-		buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
-				       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
-		lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);
-
 		edata->eee_enabled = true;
-		edata->supported = true;
 		edata->eee_active = !!(adv & lpadv);
-		edata->advertised = adv;
-		edata->lp_advertised = lpadv;
 		edata->tx_lpi_enabled = true;
 		/* EEE_TX_LPI_REQ_DLY & tx_lpi_timer are same uSec unit */
 		ret = lan78xx_read_reg(dev, EEE_TX_LPI_REQ_DLY, &buf);
 		edata->tx_lpi_timer = buf;
 	} else {
-		buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
-				       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
-		lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);
 
 		edata->eee_enabled = false;
 		edata->eee_active = false;
-		edata->supported = false;
-		edata->advertised = 0;
-		edata->lp_advertised = mmd_eee_adv_to_ethtool_adv_t(lpadv);
 		edata->tx_lpi_enabled = false;
 		edata->tx_lpi_timer = 0;
 	}
 
+	edata->supported = ADVERTISED_100baseT_Full |
+			   ADVERTISED_1000baseT_Full;
+
+	edata->advertised = ADVERTISED_100baseT_Full |
+			    ADVERTISED_1000baseT_Full;
+
+	edata->lp_advertised = lpadv;
+
 	usb_autopm_put_interface(dev->intf);
 
 	return 0;
@@ -1351,6 +1350,9 @@ static int lan78xx_set_eee(struct net_device *net, struct 
ethtool_eee *edata)
 		buf = ethtool_adv_to_mmd_eee_adv_t(edata->advertised);
 		lan78xx_mmd_write(dev->net, dev->mii.phy_id,
 				  PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT, buf);
+
+		buf = (u32)edata->tx_lpi_timer;
+		ret = lan78xx_write_reg(dev, EEE_TX_LPI_REQ_DLY, buf);
 	} else {
 		ret = lan78xx_read_reg(dev, MAC_CR, &buf);
 		buf &= ~MAC_CR_EEE_EN_;
@@ -1641,6 +1643,12 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
 	mii->mdio_write(mii->dev, mii->phy_id, MII_CTRL1000,
 			temp & ~ADVERTISE_1000HALF);
 
+	/* Set EEE advertise */
+	lan78xx_mmd_write(dev->net, dev->mii.phy_id,
+			  PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT,
+			  PHY_EEE_ADVERTISEMENT_1000BT_EEE_ |
+			  PHY_EEE_ADVERTISEMENT_100BT_EEE_);
+
 	/* clear interrupt */
 	mii->mdio_read(mii->dev, mii->phy_id, PHY_VTSE_INT_STS);
 	mii->mdio_write(mii->dev, mii->phy_id, PHY_VTSE_INT_MASK,
@@ -2016,10 +2024,6 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 
 	ret = lan78xx_write_reg(dev, MAC_CR, buf);
 
-	/* enable on PHY */
-	if (buf & MAC_CR_EEE_EN_)
-		lan78xx_mmd_write(dev->net, dev->mii.phy_id, 0x07, 0x3C, 0x06);
-
 	/* enable PHY interrupts */
 	ret = lan78xx_read_reg(dev, INT_EP_CTL, &buf);
 	buf |= INT_ENP_PHY_INT;
diff --git a/drivers/net/usb/lan78xx.h b/drivers/net/usb/lan78xx.h
index ae7562e..95e721b 100644
--- a/drivers/net/usb/lan78xx.h
+++ b/drivers/net/usb/lan78xx.h
@@ -1047,23 +1047,23 @@
 #define PHY_MMD_DEV_3  3
 
 #define PHY_EEE_PCS_STATUS (0x1)
-#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_	((WORD)0x0800)
-#define PHY_EEE_PCS_STATUS_RX_LPI_RCVD_	((WORD)0x0400)
-#define PHY_EEE_PCS_STATUS_TX_LPI_IND_	((WORD)0x0200)
-#define PHY_EEE_PCS_STATUS_RX_LPI_IND_	((WORD)0x0100)
-#define PHY_EEE_PCS_STATUS_PCS_RCV_LNK_STS_	((WORD)0x0004)
+#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_	(0x0800)

Re: [PATCH net-next 2/2] lan78xx: update eee code

2015-08-21 Thread Florian Fainelli

On 21/08/15 14:41, woojung@microchip.com wrote:
> Patch to update EEE code.

This really deserves a better explanation of what is it that you are
fixing here.

 
> Signed-off-by: Woojung Huh woojung@microchip.com
> ---
>  drivers/net/usb/lan78xx.c | 44 
>  drivers/net/usb/lan78xx.h | 22 +++---
>  2 files changed, 35 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
> index 4bcbf28..af102b0 100644
> --- a/drivers/net/usb/lan78xx.c
> +++ b/drivers/net/usb/lan78xx.c
> @@ -1296,38 +1296,37 @@ static int lan78xx_get_eee(struct net_device *net, 
> struct ethtool_eee *edata)
>  	if (ret < 0)
>  		return ret;
>  
> +	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
> +			       PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT);
> +	adv = mmd_eee_adv_to_ethtool_adv_t(buf);
> +	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
> +			       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
> +	lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);

Considering your function signatures, it sounds like you should
implement a libphy driver and you could get things like phy_init_eee()
for free.

[snip]

>  	/* enable PHY interrupts */
>  	ret = lan78xx_read_reg(dev, INT_EP_CTL, &buf);
>  	buf |= INT_ENP_PHY_INT;
> diff --git a/drivers/net/usb/lan78xx.h b/drivers/net/usb/lan78xx.h
> index ae7562e..95e721b 100644
> --- a/drivers/net/usb/lan78xx.h
> +++ b/drivers/net/usb/lan78xx.h
> @@ -1047,23 +1047,23 @@
>  #define PHY_MMD_DEV_3				3
>  
>  #define PHY_EEE_PCS_STATUS			(0x1)
> -#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_	((WORD)0x0800)
> -#define PHY_EEE_PCS_STATUS_RX_LPI_RCVD_	((WORD)0x0400)
> -#define PHY_EEE_PCS_STATUS_TX_LPI_IND_		((WORD)0x0200)
> -#define PHY_EEE_PCS_STATUS_RX_LPI_IND_		((WORD)0x0100)
> -#define PHY_EEE_PCS_STATUS_PCS_RCV_LNK_STS_	((WORD)0x0004)
> +#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_	(0x0800)
> +#define PHY_EEE_PCS_STATUS_RX_LPI_RCVD_	(0x0400)
> +#define PHY_EEE_PCS_STATUS_TX_LPI_IND_		(0x0200)
> +#define PHY_EEE_PCS_STATUS_RX_LPI_IND_		(0x0100)
> +#define PHY_EEE_PCS_STATUS_PCS_RCV_LNK_STS_	(0x0004)

Can you look at updating include/uapi/linux/mdio.h with the missing
registers for your use case instead of replicating this in a driver?
-- 
Florian
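
For readers following the get_eee() change under review, the reporting rule in the patch boils down to: EEE is active only when the MAC has it enabled and both sides advertise a common mode, while the link partner's advertisement is reported either way. A rough standalone sketch, assuming made-up mode bits and a hypothetical eee_resolve() helper (neither exists in the driver or in ethtool):

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative link-mode bits; ethtool defines its own ADVERTISED_* values. */
#define MODE_100BASET_FULL   (1u << 0)
#define MODE_1000BASET_FULL  (1u << 1)

struct eee_state {
	bool     enabled;  /* MAC has EEE switched on */
	bool     active;   /* both link partners agree on at least one mode */
	uint32_t lp_adv;   /* what the link partner advertises */
};

/* Mirrors the reporting logic of the patch: lp_adv is always reported,
 * active requires both the MAC enable bit and a common advertised mode. */
static struct eee_state eee_resolve(bool mac_eee_en, uint32_t adv,
				    uint32_t lp_adv)
{
	struct eee_state st = {
		.enabled = mac_eee_en,
		.active  = mac_eee_en && (adv & lp_adv) != 0,
		.lp_adv  = lp_adv,
	};
	return st;
}
```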

[PATCH] phylib: Make PHYs children of their MDIO bus, not the bus' parent.

2015-08-21 Thread David Daney

From: David Daney david.da...@cavium.com

commit 18ee49ddb0d2 ("phylib: rename mii_bus::dev to mii_bus::parent")
changed the parent of PHY devices from the bus to the bus parent.

Then, commit 4dea547fef1b ("phylib: rework to prepare for OF
registration of PHYs") moved the code into phy_device.c.

At this point, it is somewhat unclear why the change was seen as
necessary.  But, when we look at the device model tree in
/sys/devices, it is clearly incorrect.  The PHYs should be children of
their MDIO bus.

Change the PHY's parent device to be the MDIO bus device.

Cc: Lennert Buytenhek buyt...@wantstofly.org
Cc: Grant Likely grant.lik...@secretlab.ca
Signed-off-by: David Daney david.da...@cavium.com
---
 drivers/net/phy/phy_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 0302483..55f0178 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -176,7 +176,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, 
int addr, int phy_id,
 	if (c45_ids)
 		dev->c45_ids = *c45_ids;
 	dev->bus = bus;
-	dev->dev.parent = bus->parent;
+	dev->dev.parent = &bus->dev;
 	dev->dev.bus = &mdio_bus_type;
 	dev->irq = bus->irq != NULL ? bus->irq[addr] : PHY_POLL;
 	dev_set_name(&dev->dev, PHY_ID_FMT, bus->id, addr);
-- 
1.7.11.7


[net-next PATCH v2 0/3] net: sched: allow switching qdisc to noqueue intuitively

2015-08-21 Thread Phil Sutter

This patch series improves the integration of the noqueue qdisc to become the
fallback queueing if no other is attached to an interface. Before it was rather
an add-on, a simpler alternative to a FIFO if no congestion is expected or
possible. It has become the default qdisc for virtual interfaces, and could be
attached by this mechanism only (through removing the root qdisc after having
set tx_queue_len to zero for interfaces not defaulting to noqueue otherwise).

This series does not change the default qdisc chosen for new interfaces, but
upon removal of the root qdisc from an interface, the kernel won't fall back to
the default but to noqueue instead.

Changes since v1:
- Leave qdisc_create_dflt() alone as it is used in sch_htb.c as well. Instead
  allocate the handle in attach_default_qdiscs() and
  attach_one_default_qdisc().

Phil Sutter (3):
  net: sched: make noqueue_qdisc non-static
  net: sched: allocate a handle to default qdiscs
  net: sched: fall back to noqueue when removing root qdisc

 include/net/sch_generic.h |  2 ++
 net/sched/sch_api.c   |  5 +++--
 net/sched/sch_generic.c   | 18 ++
 3 files changed, 19 insertions(+), 6 deletions(-)

-- 
2.1.2


RE: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb board

2015-08-21 Thread Duan Andy

From: Florian Fainelli f.faine...@gmail.com
> Sent: Friday, August 21, 2015 5:14 AM
> To: David Miller; Duan Fugang-B38611
> Cc: netdev@vger.kernel.org; shawn...@kernel.org; linux-arm-
> ker...@lists.infradead.org
> Subject: Re: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb
> board
>
> On 20/08/15 14:05, David Miller wrote:
> > From: Fugang Duan b38...@freescale.com
> > Date: Wed, 19 Aug 2015 13:33:58 +0800
> >
> >> The patch series is to add fec support for i.MX7d sdb board. Since
> >> i.MX7d fec ip is the same as i.MX6SX, so there have no change for
> >> driver itself.
> >>
> >> Patch#1: add bcm54220 PHY ID entry into brcmphy.h file.
> >
> > This is completely, and utterly, pointless.
> >
> > The only reason a PHY ID should be defined in brcmphy.h is so that it
> > can be used in the broadcom.c PHY driver or similar.
> >
> > If there is no user in the tree, there is no reason to add it to the
> > header file.
>
> There is a valid reason for which you may have a PHY id, which is
> defining a PHY fixup in your platform code like Andy is doing, however,
> this should not be used in conjunction with the Generic PHY driver,
> because this driver has absolutely no clue about your PHY fixup, and this
> could create at best inconsistencies in how the fixup is managed later on.
>
> At the very least, I would like to see a change to
> drivers/net/phy/broadcom.c which identifies this PHY id, and eventually
> just invokes the genphy_* functions where relevant.
> --
> Florian

I will try to add the phy support in Broadcom phy driver. Thanks for your 
comments.

Regards,
Andy

Re: [PATCH net-next] tcp: refine pacing rate determination

2015-08-21 Thread Neal Cardwell

On Fri, Aug 21, 2015 at 8:38 PM, Eric Dumazet eric.duma...@gmail.com wrote:
> From: Eric Dumazet eduma...@google.com
>
> When TCP pacing was added back in linux-3.12, we chose
> to apply a fixed ratio of 200 % against current rate,
> to allow probing for optimal throughput even during
> slow start phase, where cwnd can be doubled every other gRTT.
>
> At Google, we found it was better applying a different ratio
> while in Congestion Avoidance phase.
> This ratio was set to 120 %.
>
> We've used the normal tcp_in_slow_start() helper for a while,
> then tuned the condition to select the conservative ratio
> as soon as cwnd >= ssthresh/2 :
>
> - After cwnd reduction, it is safer to ramp up more slowly,
>   as we approach optimal cwnd.
> - Initial ramp up (ssthresh == INFINITY) still allows doubling
>   cwnd every other RTT.
>
> Signed-off-by: Eric Dumazet eduma...@google.com
> Cc: Neal Cardwell ncardw...@google.com
> Cc: Yuchung Cheng ych...@google.com

Acked-by: Neal Cardwell ncardw...@google.com

Looks great to me. Thanks, Eric!

neal
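
As a reading aid for the rule acked above, here is a small userspace sketch of the ratio selection: 200 % while cwnd is below ssthresh/2, 120 % otherwise. It assumes srtt is given in plain microseconds and ignores the scaled fixed-point arithmetic the kernel uses; pacing_rate() and the two variables are illustrative names, not the kernel's:

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Defaults named in the patch description, in percent. */
static int pacing_ss_ratio = 200; /* slow start */
static int pacing_ca_ratio = 120; /* congestion avoidance */

/* Returns a pacing rate in bytes per second.
 * srtt_us is the smoothed RTT in microseconds and must be non-zero. */
static uint64_t pacing_rate(uint32_t mss, uint32_t cwnd, uint32_t ssthresh,
			    uint32_t srtt_us)
{
	/* current_rate = cwnd * mss / srtt */
	uint64_t rate = (uint64_t)mss * cwnd * USEC_PER_SEC / srtt_us;
	int ratio = (cwnd < ssthresh / 2) ? pacing_ss_ratio : pacing_ca_ratio;

	return rate * ratio / 100;
}
```

With mss = 1000, cwnd = 10 and srtt = 100 ms the base rate is 100000 bytes/s; an initial ramp-up with ssthresh at its maximum is paced at twice that, whereas a flow with ssthresh = 16 is already past ssthresh/2 and gets the conservative 120 %.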

[PATCH v3 3/4] Add support for driver cross-timestamp to PTP_SYS_OFFSET ioctl

2015-08-21 Thread Christopher S. Hall

From: Christopher Hall christopher.s.h...@intel.com

This patch allows cross-timestamping of system and device time to be
performed by the driver. Currently, the cross-timestamping is performed
in the PTP_SYS_OFFSET ioctl: the PTP clock driver reads gettimeofday()
and the gettime64() callback provided by the driver. The cross-timestamp
is best effort; the latency between the capture of system time
(getnstimeofday()) and the device time (driver callback) may be
significant.

This patch adds an additional callback, getsynctime64(), which will be
called when the driver is able to perform a more accurate, implementation
specific cross-timestamping.  For example, future network devices that
implement PCIE PTM will be able to precisely correlate the device clock
with the system clock with virtually zero latency between captures.
This added callback can be used by the driver to expose this functionality.

The callback, getsynctime64(), will only be called when defined and
n_samples == 1 because the driver returns only 1 cross-timestamp where
multiple samples cannot be chained together.

This patch also adds to the capabilities ioctl (PTP_CLOCK_GETCAPS),
allowing applications to query whether or not drivers implement the
getsynctime callback, providing more precise cross timestamping.

Commit Details:

Added additional callback to ptp_clock_info:

* getsynctime64()

This takes 2 arguments referring to system and device time

With this callback drivers may provide both system time and device time
to ensure precise correlation

Modified PTP_SYS_OFFSET ioctl in PTP clock driver to use the above
callback if it's available

Added capability (PTP_CLOCK_GETCAPS) for checking whether driver supports
cross timestamping

Added check for cross timestamping flag to testptp.c
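
The dispatch described above (use getsynctime64() only when it is defined and exactly one sample is requested, otherwise fall back to interleaved reads) can be sketched outside the kernel roughly as follows. The struct layouts, sys_offset() and the stub clocks are assumptions made for illustration and do not match the kernel's types:

```c
#include <stddef.h>
#include <stdint.h>

struct ts64 { int64_t sec; int64_t nsec; };

/* Hypothetical stand-in for the driver callbacks in ptp_clock_info. */
struct clock_ops {
	int (*gettime64)(struct ts64 *dev);
	/* Optional: device and system time from one correlated capture. */
	int (*getsynctime64)(struct ts64 *dev, struct ts64 *sys);
};

/* Returns 1 when the precise path was used, 0 for the sampled fallback.
 * out must hold 2 * n_samples + 1 entries: sys, dev, sys, dev, ..., sys. */
static int sys_offset(const struct clock_ops *ops, unsigned int n_samples,
		      int (*read_sys)(struct ts64 *), struct ts64 *out)
{
	struct ts64 dev, sys;
	unsigned int i;

	if (ops->getsynctime64 && n_samples == 1 &&
	    ops->getsynctime64(&dev, &sys) == 0) {
		out[0] = sys;
		out[1] = dev;
		out[2] = sys;
		return 1;
	}

	for (i = 0; i < n_samples; i++) {
		read_sys(out++);
		ops->gettime64(out++);
	}
	read_sys(out);
	return 0;
}

/* Stub clocks for demonstration only. */
static int stub_sys(struct ts64 *t) { t->sec = 100; t->nsec = 0; return 0; }
static int stub_dev(struct ts64 *t) { t->sec = 50; t->nsec = 0; return 0; }
static int stub_sync(struct ts64 *d, struct ts64 *s)
{
	d->sec = 50;  d->nsec = 7;
	s->sec = 100; s->nsec = 7;
	return 0;
}
```

In the precise path the same system timestamp is written on both sides of the device timestamp, so userspace that computes the midpoint of the two system readings sees zero capture latency.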

Signed-off-by: Christopher S. Hall christopher.s.h...@intel.com
---
 Documentation/ptp/testptp.c  |  6 --
 drivers/ptp/ptp_chardev.c| 29 +
 include/linux/ptp_clock_kernel.h |  7 +++
 include/uapi/linux/ptp_clock.h   |  4 +++-
 4 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
index 2bc8abc..8004efd 100644
--- a/Documentation/ptp/testptp.c
+++ b/Documentation/ptp/testptp.c
@@ -276,13 +276,15 @@ int main(int argc, char *argv[])
 		       "  %d external time stamp channels\n"
 		       "  %d programmable periodic signals\n"
 		       "  %d pulse per second\n"
-		       "  %d programmable pins\n",
+		       "  %d programmable pins\n"
+		       "  %d cross timestamping\n",
 		       caps.max_adj,
 		       caps.n_alarm,
 		       caps.n_ext_ts,
 		       caps.n_per_out,
 		       caps.pps,
-		       caps.n_pins);
+		       caps.n_pins,
+		       caps.cross_timestamping);
}
}
 
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index da7bae9..392ccfa 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -124,7 +124,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, 
unsigned long arg)
 	struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
 	struct ptp_clock_info *ops = ptp->info;
 	struct ptp_clock_time *pct;
-	struct timespec64 ts;
+	struct timespec64 ts, systs;
 	int enable, err = 0;
 	unsigned int i, pin_index;
 
@@ -138,6 +138,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, 
unsigned long arg)
 		caps.n_per_out = ptp->info->n_per_out;
 		caps.pps = ptp->info->pps;
 		caps.n_pins = ptp->info->n_pins;
+		caps.cross_timestamping = ptp->info->getsynctime64 != NULL;
 		if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
 			err = -EFAULT;
 		break;
@@ -196,19 +197,31 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, 
unsigned long arg)
 			break;
 		}
 		pct = &sysoff->ts[0];
-		for (i = 0; i < sysoff->n_samples; i++) {
-			getnstimeofday64(&ts);
+		if (ptp->info->getsynctime64 && sysoff->n_samples == 1 &&
+		    ptp->info->getsynctime64(ptp->info, &ts, &systs) == 0) {
+			pct->sec = systs.tv_sec;
+			pct->nsec = systs.tv_nsec;
+			pct++;
 			pct->sec = ts.tv_sec;
 			pct->nsec = ts.tv_nsec;
 			pct++;
-			ptp->info->gettime64(ptp->info, &ts);
+			pct->sec = systs.tv_sec;
+			pct->nsec = systs.tv_nsec;
+		} else {
+			for (i = 0; i

RE: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb board

2015-08-21 Thread Duan Andy

From: David Miller da...@davemloft.net Sent: Friday, August 21, 2015 5:06 AM
> To: Duan Fugang-B38611
> Cc: shawn...@kernel.org; linux-arm-ker...@lists.infradead.org;
> netdev@vger.kernel.org
> Subject: Re: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb
> board
>
> From: Fugang Duan b38...@freescale.com
> Date: Wed, 19 Aug 2015 13:33:58 +0800
>
> > The patch series is to add fec support for i.MX7d sdb board. Since
> > i.MX7d fec ip is the same as i.MX6SX, so there have no change for
> > driver itself.
> >
> > Patch#1: add bcm54220 PHY ID entry into brcmphy.h file.
>
> This is completely, and utterly, pointless.
>
> The only reason a PHY ID should be defined in brcmphy.h is so that it can
> be used in the broadcom.c PHY driver or similar.
>
> If there is no user in the tree, there is no reason to add it to the
> header file.

Ok, I will try to add the phy support in Broadcom phy driver. Thanks for your 
comment.

Regards,
Andy

Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Francois Romieu

Corinna Vinschen vinsc...@redhat.com :
> On Aug 21 21:39, Francois Romieu wrote:
[...]
> > The code should propagate failure when both rtl8169_reset_counters and
> > rtl8169_update_counters fail.
> 
> This one I don't understand.  Neither failing to reset the counters nor
> failing to update the counters is fatal for the driver.  So far the
> (unchanged) rtl8169_update_counters doesn't even print a log message,

I wouldn't overestimate the value of log messages vs real status return.
Users can be quite unhappy with default settings that spam their logs
(it isn't a problem in open(), it's marginally murphy plausible from
a periodic get_stats context).

The driver can't propagate errors from the current get_stats context
where rtl8169_update_counters is used. However it can be done in
open().

> while a failing reset in rtl8169_reset_counters now does.
> 
> Why is that not sufficient?

Because of the same reason(s) why this patch wants to improve things.

It isn't a showstopper.

-- 
Ueimor
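
To make the request concrete, this is one possible shape for error propagation from an open()-style path: each helper returns a status, a single failure is tolerated, and an error reaches the caller only when both the reset and the update fail. Everything here (struct fake_nic, the helper names, the choice of -EIO) is a hypothetical illustration rather than the r8169 code:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state; the real driver talks to hardware. */
struct fake_nic {
	bool reset_ok;
	bool update_ok;
};

static int counters_reset(struct fake_nic *nic)
{
	return nic->reset_ok ? 0 : -EIO;
}

static int counters_update(struct fake_nic *nic)
{
	return nic->update_ok ? 0 : -EIO;
}

/* open()-style path: one failing step is not fatal, but if both the
 * reset and the update fail the error is handed back to the caller. */
static int nic_init_counters(struct fake_nic *nic)
{
	int reset_err = counters_reset(nic);
	int update_err = counters_update(nic);

	if (reset_err && update_err)
		return update_err;
	return 0;
}
```

A periodic get_stats() caller would keep ignoring the return value, since it has nowhere to report it.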

[net-next PATCH v2 3/3] net: sched: fall back to noqueue when removing root qdisc

2015-08-21 Thread Phil Sutter

When removing the root qdisc, the interface should fall back to noqueue
as the 'real' minimal qdisc instead of the default one. Therefore
dev_graft_qdisc() has to be adjusted to assign noqueue if NULL was
passed as new qdisc, and qdisc_graft() needs to assign noqueue to
dev->qdisc instead of noop to prevent dev_activate() from attaching
default qdiscs to the interface.

Note that it is also necessary to have dev_graft_qdisc() set
dev_queue->qdisc to the new qdisc instead of (unconditionally) noop. I
don't know why this was there at all (originates from pre-git time), but
it seems wrong to me.  It could be worked around by dropping the extra
check for noqueue in transition_one_qdisc(), maybe with unintended
side-effects.

Signed-off-by: Phil Sutter p...@nwl.cc
---
 net/sched/sch_api.c | 2 +-
 net/sched/sch_generic.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 224374c..3b2cf30 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -839,7 +839,7 @@ skip:
 					   dev->qdisc, new);
 			if (new && !new->ops->attach)
 				atomic_inc(&new->refcnt);
-			dev->qdisc = new ? : &noop_qdisc;
+			dev->qdisc = new ? : &noqueue_qdisc;
 
 			if (new && new->ops->attach)
 				new->ops->attach(new);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 68df721..ecc369b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -718,9 +718,9 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue 
*dev_queue,
 
 	/* ... and graft new one */
 	if (qdisc == NULL)
-		qdisc = &noop_qdisc;
+		qdisc = &noqueue_qdisc;
 	dev_queue->qdisc_sleeping = qdisc;
-	rcu_assign_pointer(dev_queue->qdisc, &noop_qdisc);
+	rcu_assign_pointer(dev_queue->qdisc, qdisc);
 
 	spin_unlock_bh(root_lock);
 
spin_unlock_bh(root_lock);
 
-- 
2.1.2


[net-next PATCH v2 2/3] net: sched: allocate a handle to default qdiscs

2015-08-21 Thread Phil Sutter

Since tc_get_qdisc() does not allow removing a qdisc with a zero handle,
a handle needs to be allocated to default qdiscs (currently pfifo_fast
or mq) in order to allow removing them.

Signed-off-by: Phil Sutter p...@nwl.cc
---
 include/net/sch_generic.h |  1 +
 net/sched/sch_api.c   |  3 ++-
 net/sched/sch_generic.c   | 11 +++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 4495193..2bfc898 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -391,6 +391,7 @@ void dev_deactivate(struct net_device *dev);
 void dev_deactivate_many(struct list_head *head);
 struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
  struct Qdisc *qdisc);
+u32 qdisc_alloc_handle(struct net_device *dev);
 void qdisc_reset(struct Qdisc *qdisc);
 void qdisc_destroy(struct Qdisc *qdisc);
 void qdisc_tree_decrease_qlen(struct Qdisc *qdisc, unsigned int n);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f06aa01..224374c 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -723,7 +723,7 @@ EXPORT_SYMBOL(qdisc_class_hash_remove);
 /* Allocate an unique handle from space managed by kernel
  * Possible range is [8000-]: (0x8000 values)
  */
-static u32 qdisc_alloc_handle(struct net_device *dev)
+u32 qdisc_alloc_handle(struct net_device *dev)
 {
int i = 0x8000;
static u32 autohandle = TC_H_MAKE(0x8000U, 0);
@@ -739,6 +739,7 @@ static u32 qdisc_alloc_handle(struct net_device *dev)
 
return 0;
 }
+EXPORT_SYMBOL(qdisc_alloc_handle);
 
 void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
 {
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 1fb65f9..68df721 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -741,6 +741,11 @@ static void attach_one_default_qdisc(struct net_device 
*dev,
 			netdev_info(dev, "activation failed\n");
 			return;
 		}
+#ifdef CONFIG_NET_SCHED
+		qdisc->handle = qdisc_alloc_handle(dev);
+		if (!qdisc->handle)
+			netdev_info(dev, "qdisc handle allocation failed\n");
+#endif
 		if (!netif_is_multiqueue(dev))
 			qdisc->flags |= TCQ_F_ONETXQUEUE;
 	}
@@ -763,6 +768,12 @@ static void attach_default_qdiscs(struct net_device *dev)
 	} else {
 		qdisc = qdisc_create_dflt(txq, &mq_qdisc_ops, TC_H_ROOT);
 		if (qdisc) {
+#ifdef CONFIG_NET_SCHED
+			qdisc->handle = qdisc_alloc_handle(dev);
+			if (!qdisc->handle)
+				netdev_info(dev,
+					    "qdisc handle allocation failed\n");
+#endif
 			dev->qdisc = qdisc;
 			qdisc->ops->attach(qdisc);
 		}
-- 
2.1.2


[PATCH net-next] tcp: refine pacing rate determination

2015-08-21 Thread Eric Dumazet

From: Eric Dumazet eduma...@google.com

When TCP pacing was added back in linux-3.12, we chose
to apply a fixed ratio of 200 % against current rate,
to allow probing for optimal throughput even during
slow start phase, where cwnd can be doubled every other gRTT.

At Google, we found it was better applying a different ratio
while in Congestion Avoidance phase.
This ratio was set to 120 %.

We've used the normal tcp_in_slow_start() helper for a while,
then tuned the condition to select the conservative ratio
as soon as cwnd >= ssthresh/2 :

- After cwnd reduction, it is safer to ramp up more slowly,
  as we approach optimal cwnd.
- Initial ramp up (ssthresh == INFINITY) still allows doubling
  cwnd every other RTT.

Signed-off-by: Eric Dumazet eduma...@google.com
Cc: Neal Cardwell ncardw...@google.com
Cc: Yuchung Cheng ych...@google.com
---
 Documentation/networking/ip-sysctl.txt |   15 +++
 include/net/tcp.h  |2 ++
 net/ipv4/sysctl_net_ipv4.c |   19 +++
 net/ipv4/tcp_input.c   |   18 +-
 4 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt 
b/Documentation/networking/ip-sysctl.txt
index 46e88ed7f41d..ac77a13d2ea2 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -586,6 +586,21 @@ tcp_min_tso_segs - INTEGER
if available window is too small.
Default: 2
 
+tcp_pacing_ss_ratio - INTEGER
+	sk->sk_pacing_rate is set by TCP stack using a ratio applied
+	to current rate. (current_rate = cwnd * mss / srtt)
+	If TCP is in slow start, tcp_pacing_ss_ratio is applied
+	to let TCP probe for bigger speeds, assuming cwnd can be
+	doubled every other RTT.
+	Default: 200
+
+tcp_pacing_ca_ratio - INTEGER
+	sk->sk_pacing_rate is set by TCP stack using a ratio applied
+	to current rate. (current_rate = cwnd * mss / srtt)
+	If TCP is in congestion avoidance phase, tcp_pacing_ca_ratio
+	is applied to conservatively probe for bigger throughput.
+	Default: 120
+
 tcp_tso_win_divisor - INTEGER
This allows control over what percentage of the congestion window
can be consumed by a single TSO frame.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 364426a2be5a..3e2b3ba43ae5 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -281,6 +281,8 @@ extern unsigned int sysctl_tcp_notsent_lowat;
 extern int sysctl_tcp_min_tso_segs;
 extern int sysctl_tcp_autocorking;
 extern int sysctl_tcp_invalid_ratelimit;
+extern int sysctl_tcp_pacing_ss_ratio;
+extern int sysctl_tcp_pacing_ca_ratio;
 
 extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0330ab2e2b63..879bdc5c95b1 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -29,6 +29,7 @@
 static int zero;
 static int one = 1;
 static int four = 4;
+static int thousand = 1000;
 static int gso_max_segs = GSO_MAX_SEGS;
 static int tcp_retr1_max = 255;
 static int ip_local_port_range_min[] = { 1, 1 };
@@ -712,6 +713,24 @@ static struct ctl_table ipv4_table[] = {
 		.extra2		= &gso_max_segs,
 	},
 	{
+		.procname	= "tcp_pacing_ss_ratio",
+		.data		= &sysctl_tcp_pacing_ss_ratio,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &thousand,
+	},
+	{
+		.procname	= "tcp_pacing_ca_ratio",
+		.data		= &sysctl_tcp_pacing_ca_ratio,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &thousand,
+	},
+	{
 		.procname	= "tcp_autocorking",
 		.data		= &sysctl_tcp_autocorking,
 		.maxlen		= sizeof(int),
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4e4d6bcd0ca9..7e1623775744 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -753,13 +753,29 @@ static void tcp_rtt_estimator(struct sock *sk, long 
mrtt_us)
  * TCP pacing, to smooth the burst on large writes when packets
  * in flight is significantly lower than cwnd (or rwin)
  */
+int sysctl_tcp_pacing_ss_ratio __read_mostly = 200;
+int sysctl_tcp_pacing_ca_ratio __read_mostly = 120;
+
 static void tcp_update_pacing_rate(struct sock *sk)
 {
const struct tcp_sock *tp = tcp_sk(sk);
u64 rate;
 
/* set sk_pacing_rate to 200 % of current rate (mss * cwnd / srtt) */
-   rate = (u64)tp->mss_cache * 2 * (USEC_PER_SEC << 3);
+   rate = (u64)tp->mss_cache * ((USEC_PER_SEC
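
The comment in the hunk above describes the pacing rate as a percentage of the current rate (mss * cwnd / srtt), and the patch makes that percentage tunable (default 200 in slow start, 120 in congestion avoidance). A minimal userspace sketch of that arithmetic follows. It assumes a plain RTT in microseconds; `pacing_rate` and its parameters are illustrative names rather than kernel symbols, and the kernel's fixed-point scaling of srtt is deliberately not reproduced.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/*
 * Hypothetical helper: pacing rate in bytes per second, computed as
 * ratio_pct percent of (mss * cwnd / srtt).
 * Returns 0 when no RTT sample is available yet.
 */
static uint64_t pacing_rate(uint32_t mss, uint32_t cwnd,
			    uint32_t srtt_us, uint32_t ratio_pct)
{
	uint64_t rate;

	if (srtt_us == 0)
		return 0;

	/* current rate: bytes in one window divided by the RTT */
	rate = (uint64_t)mss * cwnd * USEC_PER_SEC / srtt_us;

	/* scale by the ratio, e.g. 200 for slow start, 120 otherwise */
	return rate * ratio_pct / 100;
}
```

With mss = 1000, cwnd = 10 and a 100 ms RTT, the current rate is 100000 bytes/s, so a ratio of 200 paces at 200000 bytes/s and a ratio of 120 at 120000 bytes/s.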

[PATCH v3 2/4] Added ART correlated clocksource and ART CPU feature

2015-08-21 Thread Christopher S. Hall

Add a detect_art() call to early TSC initialization. It reads the ART-TSC
numerator/denominator and sets the CPU feature flag if ART is present.

Add a convert_art_to_tsc() function that converts ART to TSC.

Add art_timestamper, which references convert_art_to_tsc() and
clocksource_tsc, enabling drivers to convert ART to TSC.

Signed-off-by: Christopher S. Hall christopher.s.h...@intel.com
---
 arch/x86/include/asm/cpufeature.h |  3 ++-
 arch/x86/include/asm/tsc.h|  2 ++
 arch/x86/kernel/tsc.c | 54 +++
 3 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index 3d6606f..a9322e5 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
 #define X86_FEATURE_P4 ( 3*32+ 7) /*  P4 */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
 #define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) *  FXSAVE leaks 
FOP/FIP/FOP */
+#define X86_FEATURE_ART		( 3*32+10) /* Platform has always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS	( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS		( 3*32+13) /* Branch Trace Store */
@@ -352,6 +352,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_de		boot_cpu_has(X86_FEATURE_DE)
 #define cpu_has_pse		boot_cpu_has(X86_FEATURE_PSE)
 #define cpu_has_tsc		boot_cpu_has(X86_FEATURE_TSC)
+#define cpu_has_art		boot_cpu_has(X86_FEATURE_ART)
 #define cpu_has_pge		boot_cpu_has(X86_FEATURE_PGE)
 #define cpu_has_apic		boot_cpu_has(X86_FEATURE_APIC)
 #define cpu_has_sep		boot_cpu_has(X86_FEATURE_SEP)
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 94605c0..8d52d91 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -45,6 +45,8 @@ static __always_inline cycles_t vget_cycles(void)
return (cycles_t)__native_read_tsc();
 }
 
+extern struct correlated_cs art_timestamper;
+
 extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 7437b41..13f12e0 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -939,10 +939,36 @@ static struct notifier_block time_cpufreq_notifier_block 
= {
.notifier_call  = time_cpufreq_notifier
 };
 
+#define ART_CPUID_LEAF (0x15)
+#define ART_MIN_DENOMINATOR (2)
+
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+void detect_art(void)
+{
+	unsigned int unused[2];
+
+	if (boot_cpu_data.cpuid_level >= ART_CPUID_LEAF) {
+		cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+		      &art_to_tsc_numerator, unused, unused+1);
+
+		if (art_to_tsc_denominator >= ART_MIN_DENOMINATOR) {
+			set_cpu_cap(&boot_cpu_data, X86_FEATURE_ART);
+		}
+	}
+}
+
 static int __init cpufreq_tsc(void)
 {
 	if (!cpu_has_tsc)
 		return 0;
+
+	detect_art();
+
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
 		return 0;
 	cpufreq_register_notifier(&time_cpufreq_notifier_block,
@@ -1059,6 +1085,32 @@ int unsynchronized_tsc(void)
return 0;
 }
 
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+static u64 convert_art_to_tsc(struct correlated_cs *cs, u64 cycles)
+{
+   u64 tmp, res;
+
+	switch (art_to_tsc_denominator) {
+	default:
+		res = (cycles / art_to_tsc_denominator) * art_to_tsc_numerator;
+		tmp = (cycles % art_to_tsc_denominator) * art_to_tsc_numerator;
+		res += tmp / art_to_tsc_denominator;
+		break;
+	case 2:
+		res = (cycles >> 1) * art_to_tsc_numerator;
+		tmp = (cycles & 0x1) * art_to_tsc_numerator;
+		res += tmp >> 1;
+		break;
+	}
+	return res;
+}
+
+struct correlated_cs art_timestamper = {
+   .convert= convert_art_to_tsc,
+};
+EXPORT_SYMBOL(art_timestamper);
 
 static void tsc_refine_calibration_work(struct work_struct *work);
 static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1130,6 +1182,8 @@ static void tsc_refine_calibration_work(struct 
work_struct *work)
(unsigned long)tsc_khz % 1000);
 
 out:
+	if (cpu_has_art)
+		art_timestamper.related_cs = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
 }
 
-- 
2.1.4
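
The conversion in convert_art_to_tsc() multiplies by numerator/denominator without forming the full product cycles * numerator, which could overflow 64 bits. It scales the quotient and the remainder separately. A minimal userspace sketch of the same arithmetic, assuming standard C integer types; the function name `art_to_tsc` and the zero-denominator guard are illustrative additions, not part of the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale an ART cycle count by num/den.
 * Splitting into quotient and remainder keeps the intermediate
 * values small: rem < den, so rem * num fits easily in 64 bits.
 */
static uint64_t art_to_tsc(uint64_t art, uint32_t num, uint32_t den)
{
	uint64_t quot, rem;

	if (den == 0)	/* assumption: treat a missing ratio as "no conversion" */
		return 0;

	quot = art / den;
	rem = art % den;

	return quot * num + (rem * num) / den;
}
```

For example, art = 11 with a ratio of 3/2 gives 5 * 3 + (1 * 3) / 2 = 16, the floor of 16.5.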


[PATCH v3 4/4] Enabling hardware supported PTP system/device crosstimestamping

2015-08-21 Thread Christopher S. Hall

From: Christopher Hall christopher.s.h...@intel.com

Add a getsynctime() PTP device callback that cross-timestamps the system
and device clocks using ART translation. This depends on the platform
being >= SPT and having ART.

getsynctime() reads the ART (TSC-derived)/device cross timestamp,
converts it to realtime/device time, and reports the cross timestamp to
the PTP driver.

Signed-off-by: Christopher S. Hall christopher.s.h...@intel.com
---
 drivers/net/ethernet/intel/e1000e/defines.h |  5 ++
 drivers/net/ethernet/intel/e1000e/ptp.c | 88 +
 drivers/net/ethernet/intel/e1000e/regs.h|  4 ++
 3 files changed, 97 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h 
b/drivers/net/ethernet/intel/e1000e/defines.h
index 133d407..13cff75 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -527,6 +527,11 @@
 #define E1000_RXCW_C  0x2000/* Receive config */
 #define E1000_RXCW_SYNCH  0x4000/* Receive config synch */
 
+/* HH Time Sync */
+#define E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK  0xF000 /* max delay */
+#define E1000_TSYNCTXCTL_SYNC_COMP 0x4000 /* sync complete */
+#define E1000_TSYNCTXCTL_START_SYNC	0x8000 /* initiate sync */
+
 #define E1000_TSYNCTXCTL_VALID 0x0001 /* Tx timestamp valid */
 #define E1000_TSYNCTXCTL_ENABLED   0x0010 /* enable Tx timestamping */
 
diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c 
b/drivers/net/ethernet/intel/e1000e/ptp.c
index 25a0ad5..228f3f3 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -25,6 +25,8 @@
  */
 
 #include "e1000.h"
+#include <asm/tsc.h>
+#include <linux/timekeeping.h>
 
 /**
  * e1000e_phc_adjfreq - adjust the frequency of the hardware clock
@@ -98,6 +100,87 @@ static int e1000e_phc_adjtime(struct ptp_clock_info *ptp, 
s64 delta)
return 0;
 }
 
+#define MAX_HW_WAIT_COUNT (3)
+
+static int e1000e_phc_get_ts(struct correlated_ts *cts)
+{
+	struct e1000_adapter *adapter = (struct e1000_adapter *)cts->private;
+	struct e1000_hw *hw = &adapter->hw;
+	int i;
+	u32 tsync_ctrl;
+	int ret;
+
+	tsync_ctrl = er32(TSYNCTXCTL);
+	tsync_ctrl |= E1000_TSYNCTXCTL_START_SYNC |
+		E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK;
+	ew32(TSYNCTXCTL, tsync_ctrl);
+	for (i = 0; i < MAX_HW_WAIT_COUNT; ++i) {
+		udelay(1);
+		tsync_ctrl = er32(TSYNCTXCTL);
+		if (tsync_ctrl & E1000_TSYNCTXCTL_SYNC_COMP)
+			break;
+	}
+
+	if (i == MAX_HW_WAIT_COUNT) {
+		ret = -ETIMEDOUT;
+	} else {
+		ret = 0;
+		cts->system_ts = er32(PLTSTMPH);
+		cts->system_ts <<= 32;
+		cts->system_ts |= er32(PLTSTMPL);
+		cts->device_ts = er32(SYSSTMPH);
+		cts->device_ts <<= 32;
+		cts->device_ts |= er32(SYSSTMPL);
+	}
+
+   return ret;
+}
+
+/**
+ * e1000e_phc_getsynctime - Reads the current time from the hardware clock and
+ * correlated system time
+ * @ptp: ptp clock structure
+ * @devts: timespec structure to hold the current device time value
+ * @systs: timespec structure to hold the current system time value
+ *
+ * Read device and system (ART) clock simultaneously and return the correct
+ * clock values in ns after converting into a struct timespec.
+ **/
+static int e1000e_phc_getsynctime(struct ptp_clock_info *ptp,
+				  struct timespec64 *devts,
+				  struct timespec64 *systs)
+{
+	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
+						     ptp_clock_info);
+	unsigned long flags;
+	u32 remainder;
+	struct correlated_ts art_correlated_ts;
+	u64 device_time;
+	int ret;
+
+	art_correlated_ts.get_ts = e1000e_phc_get_ts;
+	art_correlated_ts.private = adapter;
+	ret = get_correlated_timestamp(&art_correlated_ts,
+				       &art_timestamper);
+	if (ret != 0)
+		goto bail;
+
+	systs->tv_sec =
+		div_u64_rem(art_correlated_ts.system_real.tv64,
+			    NSEC_PER_SEC, &remainder);
+	systs->tv_nsec = remainder;
+	spin_lock_irqsave(&adapter->systim_lock, flags);
+	device_time = timecounter_cyc2time(&adapter->tc,
+					   art_correlated_ts.device_ts);
+	spin_unlock_irqrestore(&adapter->systim_lock, flags);
+	devts->tv_sec =
+		div_u64_rem(device_time, NSEC_PER_SEC, &remainder);
+	devts->tv_nsec = remainder;
+
+bail:
+	return ret;
+}
+
 /**
  * e1000e_phc_gettime - Reads the current time from the hardware clock
  * @ptp: ptp clock structure
@@ -190,6 +273,7 @@ static const struct ptp_clock_info
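
Two small pieces of arithmetic carry the patch above: each 64-bit timestamp is assembled from a high and a low 32-bit register read, and a nanosecond count is split into seconds plus a nanosecond remainder (the role div_u64_rem() plays in the kernel). A minimal userspace sketch, assuming standard C types; `struct ts64`, `combine_hi_lo` and `ns_to_ts64` are illustrative names, not kernel APIs:

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long tv_nsec;
};

/* Build a 64-bit value from two 32-bit register reads (high word first). */
static uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Split a nanosecond count into whole seconds and leftover nanoseconds. */
static struct ts64 ns_to_ts64(uint64_t ns)
{
	struct ts64 ts;

	ts.tv_sec = (int64_t)(ns / NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
	return ts;
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit value left by 32 is undefined in C.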

[net-next:master 790/1189] xt_TEE.c:undefined reference to `nf_dup_ipv6'

2015-08-21 Thread kbuild test robot

tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   a9e01ed986aa80d3092134428f453072752da223
commit: bbde9fc1824aab58bc78c084163007dd6c03fe5b [790/1189] netfilter: factor 
out packet duplication for IPv4/IPv6
config: x86_64-nfsroot (attached as .config)
reproduce:
  git checkout bbde9fc1824aab58bc78c084163007dd6c03fe5b
  # save the attached .config to linux build tree
  make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   net/built-in.o: In function `tee_tg6':
>> xt_TEE.c:(.text+0x6cd8c): undefined reference to `nf_dup_ipv6'

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.2.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT=elf64-x86-64
CONFIG_ARCH_DEFCONFIG=arch/x86/configs/x86_64_defconfig
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS=-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
-fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
-fcall-saved-r11
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME=(none)
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
# CONFIG_FHANDLE is not set
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TASKS_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_TREE_RCU_TRACE=y
# CONFIG_RCU_NOCB_CPU is not set
# CONFIG_RCU_EXPEDITE_BOOT is not set
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_ARCH_SUPPORTS_INT128=y
# CONFIG_NUMA_BALANCING is not set
CONFIG_CGROUPS=y
CONFIG_CGROUP_DEBUG=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_MEMCG is not set
# CONFIG_CGROUP_HUGETLB is not set
# CONFIG_CGROUP_PERF is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
CONFIG_BLK_CGROUP=y
CONFIG_DEBUG_BLK_CGROUP=y
# CONFIG_CHECKPOINT_RESTORE

[net-next:master 791/1189] net/ipv4/netfilter/nft_dup_ipv4.c:29:37: sparse: incorrect type in initializer (different base types)

2015-08-21 Thread kbuild test robot

tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   a9e01ed986aa80d3092134428f453072752da223
commit: d877f07112f1e5a247c6b585c971a93895c9f738 [791/1189] netfilter: 
nf_tables: add nft_dup expression
reproduce:
  # apt-get install sparse
  git checkout d877f07112f1e5a247c6b585c971a93895c9f738
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> net/ipv4/netfilter/nft_dup_ipv4.c:29:37: sparse: incorrect type in initializer (different base types)
   net/ipv4/netfilter/nft_dup_ipv4.c:29:37:    expected restricted __be32 [usertype] s_addr
   net/ipv4/netfilter/nft_dup_ipv4.c:29:37:    got unsigned int [unsigned] <noident>

vim +29 net/ipv4/netfilter/nft_dup_ipv4.c

    13  #include <linux/netfilter.h>
    14  #include <linux/netfilter/nf_tables.h>
    15  #include <net/netfilter/nf_tables.h>
    16  #include <net/netfilter/ipv4/nf_dup_ipv4.h>
    17  
    18  struct nft_dup_ipv4 {
    19          enum nft_registers      sreg_addr:8;
    20          enum nft_registers      sreg_dev:8;
    21  };
    22  
    23  static void nft_dup_ipv4_eval(const struct nft_expr *expr,
    24                                struct nft_regs *regs,
    25                                const struct nft_pktinfo *pkt)
    26  {
    27          struct nft_dup_ipv4 *priv = nft_expr_priv(expr);
    28          struct in_addr gw = {
  > 29                  .s_addr = regs->data[priv->sreg_addr],
    30          };
    31          int oif = regs->data[priv->sreg_dev];
    32  
    33          nf_dup_ipv4(pkt->skb, pkt->ops->hooknum, &gw, oif);
    34  }
    35  
    36  static int nft_dup_ipv4_init(const struct nft_ctx *ctx,
    37                               const struct nft_expr *expr,

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
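
The sparse warning above is about byte-order typing: `s_addr` is declared as a big-endian (`__be32`) field, and a plain unsigned integer is assigned to it without any conversion or annotation. The sketch below is a userspace illustration of the distinction sparse enforces, not the fix applied to the kernel code; `make_gw` is a hypothetical helper, and it assumes the input is in host byte order.

```c
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: build a struct in_addr from a host-order IPv4
 * address. htonl() makes the host-to-network conversion explicit, so
 * the value stored in s_addr is big-endian on every architecture.
 */
static struct in_addr make_gw(uint32_t host_order_addr)
{
	struct in_addr gw = {
		.s_addr = htonl(host_order_addr),
	};

	return gw;
}
```

If the source value is already in network byte order, no conversion is needed, and in kernel code the usual remedy is a type annotation rather than a byte swap.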

Re: Bug in tc of iproute2 ? Deleting single filter, deletes all the filters (apart from hashtable 800::) ...

2015-08-21 Thread Akshat Kakkar

I actually posted this on lartc first. But then it was suggested to
post it over here as you guys might be able to guide better.

Please help ...

On Fri, Aug 21, 2015 at 10:38 AM, Akshat Kakkar akshat.1...@gmail.com wrote:
> When I try to delete a single tc filter, it deletes all the
> filters with the same priority/preference, i.e. it ignores the
> handle specified.
>
> But when I delete from hashtable 800:, it deletes only the
> specified filter.
>
> For example, the following set of commands creates a hashtable 15: and adds
> 2 filters to it.
>
> tc filter add dev eth0 parent 1:0 prio 5 handle 15: protocol ip u32 divisor 256
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:2 u32
> ht 15:2: match ip src 10.0.0.2 flowid 1:10
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:3 u32
> ht 15:2: match ip src 10.0.0.3 flowid 1:10
>
> Now the following command DELETES ALL THE FILTERS, though it should only
> delete FILTER 15:2:3!
> tc filter del dev eth0 protocol ip parent 1: prio 5 handle 15:2:3 u32
>
> The output of tc filter show eth0 in this case is blank, as all filters are deleted.
>
>
> However, similar commands executed for hashtable 800: delete
> only the specified filter:
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 800:0:2 u32
> ht 800:0: match ip src 10.0.0.2 flowid 1:10
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 800:0:3 u32
> ht 800:0: match ip src 10.0.0.3 flowid 1:10
>
> tc filter del dev eth0 protocol ip parent 1: prio 5 handle 800:0:2 u32
>
> The above command deletes only a single filter.
> The output of tc filter show eth0 in the 2nd case is:
>
> filter parent 1: protocol ip pref 5 u32
> filter parent 1: protocol ip pref 5 u32 fh 800: ht divisor 1
> filter parent 1: protocol ip pref 5 u32 fh 800::3 order 3 key ht 800
> bkt 0 flowid 1:10
>   match 0a03/ at 12
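
The handles in the commands above use the u32 classifier's three-part hexadecimal form, hashtable:bucket:node. As far as I know, the TC_U32_* macros in linux/pkt_cls.h split the 32-bit handle into 12 bits of hashtable ID, 8 bits of bucket and 12 bits of node. A minimal sketch under that assumption, with illustrative function names, shows that 15:2:2 and 15:2:3 are distinct 32-bit values, so a delete by handle has enough information to single out one filter:

```c
#include <stdint.h>

/*
 * Hypothetical helpers mirroring the assumed 12/8/12 bit layout of a
 * u32 filter handle: htid in the top 12 bits, hash bucket in the next
 * 8 bits, node in the low 12 bits. Out-of-range inputs are masked.
 */
static uint32_t u32_handle(uint32_t htid, uint32_t bucket, uint32_t node)
{
	return ((htid & 0xFFFu) << 20) | ((bucket & 0xFFu) << 12) |
	       (node & 0xFFFu);
}

static uint32_t u32_handle_htid(uint32_t handle)
{
	return handle >> 20;
}

static uint32_t u32_handle_bucket(uint32_t handle)
{
	return (handle >> 12) & 0xFFu;
}

static uint32_t u32_handle_node(uint32_t handle)
{
	return handle & 0xFFFu;
}
```

Under this layout, handle 15:2:3 packs to 0x01502003 and 800:0:2 packs to 0x80000002.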

Re: [lkp] [rhashtable] 9d901bc0515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:63 __ioremap_check_ram+0x6a/0x99()

2015-08-21 Thread Herbert Xu

On Fri, Aug 21, 2015 at 02:05:19PM +0800, kernel test robot wrote:
> FYI, we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit 9d901bc05153bbf33b5da2cd6266865e531f0545 (rhashtable: Free bucket
> tables asynchronously after rehash)
>
> With the commit, the possibility of OOM is increased under our boot testing.

Can you gather some stats on how much memory rhashtable is actually
using? With that kernel you've probably got only one rhashtable user
which is netlink.

Bear in mind that this is a fairly low-memory machine (< 300M) so
it's not clear to me that this patch is the root cause of your OOM
problem.

Thanks,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

RE: [PATCH v2.2 01/22] fjes: Introduce FUJITSU Extended Socket Network Device driver

2015-08-21 Thread Izumi, Taku

Dear David,

 Thank you for reviewing.
 I'll update patchset according to your comment.

 Sincerely,
 Taku Izumi

> -----Original Message-----
> From: David Miller [mailto:da...@davemloft.net]
> Sent: Friday, August 21, 2015 7:49 AM
> To: Izumi, Taku/泉 拓
> Cc: netdev@vger.kernel.org; platform-driver-...@vger.kernel.org;
> dvh...@infradead.org; rk...@redhat.com;
> alexander.h.du...@redhat.com; linux-a...@vger.kernel.org; j...@perches.com;
> sergei.shtyl...@cogentembedded.com;
> step...@networkplumber.org; yasu.isim...@gmail.com
> Subject: Re: [PATCH v2.2 01/22] fjes: Introduce FUJITSU Extended Socket
> Network Device driver
>
> From: Taku Izumi izumi.t...@jp.fujitsu.com
> Date: Thu, 20 Aug 2015 17:46:05 +0900
>
> > +obj-$(CONFIG_FUJITSU_ES) += fjes.o
> > +
> > +fjes-objs := fjes_main.o
> > +
>
> Please do not have trailing empty lines in any files you add
> or edit, 'git' warns about this even when applying patches.
>
> > +static int fjes_acpi_add(struct acpi_device *device)
> > +{
> > +	acpi_status status;
> > +	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL};
> > +	union acpi_object *str;
> > +	char str_buf[sizeof(FJES_ACPI_SYMBOL) + 1];
> > +	int result;
> > +	struct platform_device *plat_dev;
>
> Please order your local variables in reverse christmas tree order, which
> means longer lines come before shorter ones.
>
> Please correct this problem in your entire submission, as I am not going
> to point out each and every other place where this problem exists.
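
The ordering requested in the review above can be illustrated with a small standalone function. This is a userspace sketch with an invented helper, `count_nonempty`; it only demonstrates the declaration layout and is not taken from the fjes driver:

```c
#include <stddef.h>
#include <string.h>

/* Count the non-empty strings in an array of n names. */
static size_t count_nonempty(const char *const *names, size_t n)
{
	/* Reverse christmas tree: longest declaration first, shortest last. */
	const char *const *end = names + n;
	const char *const *cur;
	size_t count = 0;

	for (cur = names; cur != end; cur++) {
		if (*cur != NULL && strlen(*cur) > 0)
			count++;
	}

	return count;
}
```

Only the declaration block is affected by the convention; the body of the function is written as usual.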

[net-next:master 790/1189] net/ipv6/netfilter/nf_dup_ipv6.c:48:23: sparse: incorrect type in assignment (different base types)

2015-08-21 Thread kbuild test robot

tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   a9e01ed986aa80d3092134428f453072752da223
commit: bbde9fc1824aab58bc78c084163007dd6c03fe5b [790/1189] netfilter: factor 
out packet duplication for IPv4/IPv6
reproduce:
  # apt-get install sparse
  git checkout bbde9fc1824aab58bc78c084163007dd6c03fe5b
  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> net/ipv6/netfilter/nf_dup_ipv6.c:48:23: sparse: incorrect type in assignment (different base types)
   net/ipv6/netfilter/nf_dup_ipv6.c:48:23:    expected restricted __be32 [addressable] [assigned] [usertype] flowlabel
   net/ipv6/netfilter/nf_dup_ipv6.c:48:23:    got int

vim +48 net/ipv6/netfilter/nf_dup_ipv6.c

    32          return &init_net;
    33  }
    34  
    35  static bool nf_dup_ipv6_route(struct sk_buff *skb, const struct in6_addr *gw,
    36                                int oif)
    37  {
    38          const struct ipv6hdr *iph = ipv6_hdr(skb);
    39          struct net *net = pick_net(skb);
    40          struct dst_entry *dst;
    41          struct flowi6 fl6;
    42  
    43          memset(&fl6, 0, sizeof(fl6));
    44          if (oif != -1)
    45                  fl6.flowi6_oif = oif;
    46  
    47          fl6.daddr = *gw;
  > 48          fl6.flowlabel = ((iph->flow_lbl[0] & 0xF) << 16) |
    49                           (iph->flow_lbl[1] << 8) | iph->flow_lbl[2];
    50          dst = ip6_route_output(net, NULL, &fl6);
    51          if (dst->error) {
    52                  dst_release(dst);
    53                  return false;
    54          }
    55          skb_dst_drop(skb);
    56          skb_dst_set(skb, dst);

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
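
Line 48 of the listing above assembles the 20-bit IPv6 flow label from three header bytes: the low nibble of the first byte supplies the top 4 bits, and the next two bytes supply the remaining 16. The result of that expression is a host-order int, which is why sparse objects to storing it in a `__be32` field. A minimal userspace sketch of the bit assembly alone, with the illustrative name `flow_label_from_bytes`:

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine the three flow-label bytes of an IPv6
 * header into one 20-bit host-order value.
 */
static uint32_t flow_label_from_bytes(const uint8_t lbl[3])
{
	return ((uint32_t)(lbl[0] & 0xF) << 16) |
	       ((uint32_t)lbl[1] << 8) |
	       (uint32_t)lbl[2];
}
```

For the bytes 0xAB, 0xCD, 0xEF this yields 0xBCDEF; the high nibble of the first byte is masked out.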

Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support

2015-08-21 Thread Kenneth Lee

Thanks, Klimov,

You are right. I will fix it in next patches.

On Tue, Aug 18, 2015 at 03:12:02AM +0300, Alexey Klimov wrote:
> Date: Tue, 18 Aug 2015 03:12:02 +0300
> From: Alexey Klimov klimov.li...@gmail.com
> To: Kenneth Lee liguo...@hisilicon.com
> CC: robh...@kernel.org, pawel.m...@arm.com, Mark Rutland
>  mark.rutl...@arm.com, ijc+devicet...@hellion.org.uk, Kumar Gala
>  ga...@codeaurora.org, Catalin Marinas catalin.mari...@arm.com, Will
>  Deacon will.dea...@arm.com, yisen.zhu...@huawei.com, David S. Miller
>  da...@davemloft.net, paul.gortma...@windriver.com,
>  dingtianh...@huawei.com, zhangfei@linaro.org,
>  devicet...@vger.kernel.org, Linux Kernel Mailing List
>  linux-ker...@vger.kernel.org, linux-arm-ker...@lists.infradead.org,
>  netdev@vger.kernel.org, linux...@huawei.com, salil.me...@huawei.com,
>  huangda...@hisilicon.com, Kenneth Lee liguo...@huawei.com, Yury Norov
>  yury.no...@gmail.com
> Subject: Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae
>  framework support
> Message-ID:
>  CALW4P+J8LkLshu5TuRT+8c__KRwJ8XAdMV4yA0KEnrfUg=m...@mail.gmail.com
>
> Hi Kenneth,
>
> just small minor question.
>
> On Fri, Aug 14, 2015 at 1:30 PM, Kenneth Lee liguo...@hisilicon.com wrote:
> > HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
> > unified ring buffer interface for Hisilicon Network Acceleration Engines.
> >
> > With the interface, upper layer can work as ethernet driver, ODP driver or
> > other service driver on purpose.
> >
> > Signed-off-by: Kenneth Lee liguo...@huawei.com
> > Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com
> > ---
> >  drivers/net/ethernet/hisilicon/Kconfig      |  33 +-
> >  drivers/net/ethernet/hisilicon/Makefile     |   1 +
> >  drivers/net/ethernet/hisilicon/hns/Makefile |  15 +
> >  drivers/net/ethernet/hisilicon/hns/hnae.c   | 494 +++
> >  drivers/net/ethernet/hisilicon/hns/hnae.h   | 582
> >  5 files changed, 1124 insertions(+), 1 deletion(-)
> >  create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile
> >  create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c
> >  create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h
> >
> > diff --git a/drivers/net/ethernet/hisilicon/Kconfig b/drivers/net/ethernet/hisilicon/Kconfig
> > index dead17b..1e4f5a7 100644
> > --- a/drivers/net/ethernet/hisilicon/Kconfig
> > +++ b/drivers/net/ethernet/hisilicon/Kconfig
> > @@ -5,7 +5,7 @@
> >  config NET_VENDOR_HISILICON
> >  	bool "Hisilicon devices"
> >  	default y
> > -	depends on ARM
> > +	depends on ARM || ARM64
> >  	---help---
> >  	  If you have a network (Ethernet) card belonging to this class, say Y.
> >
> > @@ -31,4 +31,35 @@ config HIP04_ETH
> >  	  If you wish to compile a kernel for a hardware with hisilicon p04 SoC and
> >  	  want to use the internal ethernet then you should answer Y to this.
> >
> > +config HNS
> > +	tristate "Hisilicon Network Subsystem Support (Framework)"
> > +	---help---
> > +	  This selects the framework support for Hisilicon Network Subsystem. It
> > +	  is needed by any driver which provides HNS acceleration engine or make
> > +	  use of the engine
> > +
> > +config HNS_DSAF
> > +	tristate "Hisilicon HNS DSAF device Support"
> > +	select HNS
> > +	select HNS_MDIO
> > +	---help---
> > +	  This selects the DSAF (Distributed System Area Frabric) network
> > +	  acceleration engine support. The engine is used in Hisilicon P660,
> > +	  Hi1610 and further ICT SoC
> > +
> > +config HNS_MDIO
> > +	tristate "Hisilicon HNS MDIO device Support"
> > +	select MDIO
> > +	---help---
> > +	  This selects the HNS MDIO support. It is needed by HNS_DSAF to access
> > +	  the PHY
> > +
> > +config HNS_ENET
> > +	tristate "Hisilicon HNS Ethernet Device Support"
> > +	select PHYLIB
> > +	select HNS
> > +	---help---
> > +	  This selects the general ethernet driver for HNS.  This module make
> > +	  use of any HNS AE driver, such as HNS_DSAF
> > +
> >  endif # NET_VENDOR_HISILICON
> > diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile
> > index 6c14540..2503a9b 100644
> > --- a/drivers/net/ethernet/hisilicon/Makefile
> > +++ b/drivers/net/ethernet/hisilicon/Makefile
> > @@ -4,3 +4,4 @@
> >
> >  obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
> >  obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o
> > +obj-$(CONFIG_HNS) += hns/
> > diff --git a/drivers/net/ethernet/hisilicon/hns/Makefile b/drivers/net/ethernet/hisilicon/hns/Makefile
> > new file mode 100644
> > index 000..6680602
> > --- /dev/null
> > +++ b/drivers/net/ethernet/hisilicon/hns/Makefile
> > @@ -0,0 +1,15 @@
> > +#
> > +# Makefile for the HISILICON network device drivers.
> > +#
> > +
> > +obj-$(CONFIG_HNS) += hnae.o
> > +
> > +obj-$(CONFIG_HNS_DSAF) += hns_dsaf.o
> > +hns_dsaf-objs = hns_ae_adapt.o hns_dsaf_gmac.o hns_dsaf_mac.o hns_dsaf_misc.o \
> > +	hns_dsaf_main.o hns_dsaf_ppe.o hns_dsaf_rcb.o hns_dsaf_xgmac.o
> > +

[PATCH v3 19/22] fjes: update_zone_task

2015-08-21 Thread Taku Izumi

This patch adds update_zone_task.
Zoning information can be changed by the user;
this task monitors whether the zoning information
has changed.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c   | 179 +++
 drivers/net/fjes/fjes_hw.h   |   1 +
 drivers/net/fjes/fjes_main.c |  14 
 3 files changed, 194 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 4a4b750..4525d36 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -22,6 +22,8 @@
 #include "fjes_hw.h"
 #include "fjes.h"
 
+static void fjes_hw_update_zone_task(struct work_struct *);
+
 /* supported MTU list */
 const u32 fjes_support_mtu[] = {
 	FJES_MTU_DEFINE(8 * 1024),
@@ -322,6 +324,8 @@ int fjes_hw_init(struct fjes_hw *hw)
 
 	fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true);
 
+	INIT_WORK(&hw->update_zone_task, fjes_hw_update_zone_task);
+
 	mutex_init(&hw->hw_info.lock);
 
 	hw->max_epid = fjes_hw_get_max_epid(hw);
@@ -349,6 +353,8 @@ void fjes_hw_exit(struct fjes_hw *hw)
 	}
 
 	fjes_hw_cleanup(hw);
+
+	cancel_work_sync(&hw->update_zone_task);
 }
 
 static enum fjes_dev_command_response_e
@@ -913,3 +919,176 @@ int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *epbh,
 
return 0;
 }
+
+static void fjes_hw_update_zone_task(struct work_struct *work)
+{
+	struct fjes_hw *hw = container_of(work,
+			struct fjes_hw, update_zone_task);
+
+	struct my_s {u8 es_status; u8 zone; } *info;
+	union fjes_device_command_res *res_buf;
+	enum ep_partner_status pstatus;
+
+	struct fjes_adapter *adapter;
+	struct net_device *netdev;
+
+	ulong unshare_bit = 0;
+	ulong share_bit = 0;
+	ulong irq_bit = 0;
+
+	int epidx;
+	int ret;
+
+	adapter = (struct fjes_adapter *)hw->back;
+	netdev = adapter->netdev;
+	res_buf = hw->hw_info.res_buf;
+	info = (struct my_s *)&res_buf->info.info;
+
+	mutex_lock(&hw->hw_info.lock);
+
+	ret = fjes_hw_request_info(hw);
+	switch (ret) {
+	case -ENOMSG:
+	case -EBUSY:
+	default:
+		if (!work_pending(&adapter->force_close_task)) {
+			adapter->force_reset = true;
+			schedule_work(&adapter->force_close_task);
+		}
+		break;
+
+	case 0:
+
+		for (epidx = 0; epidx < hw->max_epid; epidx++) {
+			if (epidx == hw->my_epid) {
+				hw->ep_shm_info[epidx].es_status =
+					info[epidx].es_status;
+				hw->ep_shm_info[epidx].zone =
+					info[epidx].zone;
+				continue;
+			}
+
+			pstatus = fjes_hw_get_partner_ep_status(hw, epidx);
+			switch (pstatus) {
+			case EP_PARTNER_UNSHARE:
+			default:
+				if ((info[epidx].zone !=
+					FJES_ZONING_ZONE_TYPE_NONE) &&
+				    (info[epidx].es_status ==
+					FJES_ZONING_STATUS_ENABLE) &&
+				    (info[epidx].zone ==
+					info[hw->my_epid].zone))
+					set_bit(epidx, &share_bit);
+				else
+					set_bit(epidx, &unshare_bit);
+				break;
+
+			case EP_PARTNER_COMPLETE:
+			case EP_PARTNER_WAITING:
+				if ((info[epidx].zone ==
+					FJES_ZONING_ZONE_TYPE_NONE) ||
+				    (info[epidx].es_status !=
+					FJES_ZONING_STATUS_ENABLE) ||
+				    (info[epidx].zone !=
+					info[hw->my_epid].zone)) {
+					set_bit(epidx,
+						&adapter->unshare_watch_bitmask);
+					set_bit(epidx,
+						&hw->hw_info.buffer_unshare_reserve_bit);
+				}
+				break;
+
+			case EP_PARTNER_SHARED:
+				if ((info[epidx].zone ==
+					FJES_ZONING_ZONE_TYPE_NONE) ||
+				    (info[epidx].es_status !=
+					FJES_ZONING_STATUS_ENABLE) ||
+				    (info[epidx].zone !=
+					info[hw->my_epid].zone))
+					set_bit(epidx,
[PATCH v3 21/22] fjes: handle receive cancellation request interrupt

2015-08-21 Thread Taku Izumi

This patch adds handling of the IRQ raised by
another receiver's receive cancellation request.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_main.c | 78 
 1 file changed, 78 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 5e77d0c..5f93e42 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -820,6 +820,74 @@ static int fjes_vlan_rx_kill_vid(struct net_device *netdev,
return 0;
 }
 
+static void fjes_txrx_stop_req_irq(struct fjes_adapter *adapter,
+  int src_epid)
+{
+   struct fjes_hw *hw = &adapter->hw;
+   enum ep_partner_status status;
+
+   status = fjes_hw_get_partner_ep_status(hw, src_epid);
+   switch (status) {
+   case EP_PARTNER_UNSHARE:
+   case EP_PARTNER_COMPLETE:
+   default:
+   break;
+   case EP_PARTNER_WAITING:
+   if (src_epid < hw->my_epid) {
+   hw->ep_shm_info[src_epid].tx.info->v1i.rx_status |=
+   FJES_RX_STOP_REQ_DONE;
+
+   clear_bit(src_epid, &hw->txrx_stop_req_bit);
+   set_bit(src_epid, &adapter->unshare_watch_bitmask);
+
+   if (!work_pending(&adapter->unshare_watch_task))
+   queue_work(adapter->control_wq,
+  &adapter->unshare_watch_task);
+   }
+   break;
+   case EP_PARTNER_SHARED:
+   if (hw->ep_shm_info[src_epid].rx.info->v1i.rx_status &
+   FJES_RX_STOP_REQ_REQUEST) {
+   set_bit(src_epid, &hw->epstop_req_bit);
+   if (!work_pending(&hw->epstop_task))
+   queue_work(adapter->control_wq,
+  &hw->epstop_task);
+   }
+   break;
+   }
+}
+
+static void fjes_stop_req_irq(struct fjes_adapter *adapter, int src_epid)
+{
+   struct fjes_hw *hw = &adapter->hw;
+   enum ep_partner_status status;
+
+   set_bit(src_epid, &hw->hw_info.buffer_unshare_reserve_bit);
+
+   status = fjes_hw_get_partner_ep_status(hw, src_epid);
+   switch (status) {
+   case EP_PARTNER_WAITING:
+   hw->ep_shm_info[src_epid].tx.info->v1i.rx_status |=
+   FJES_RX_STOP_REQ_DONE;
+   clear_bit(src_epid, &hw->txrx_stop_req_bit);
+   /* fall through */
+   case EP_PARTNER_UNSHARE:
+   case EP_PARTNER_COMPLETE:
+   default:
+   set_bit(src_epid, &adapter->unshare_watch_bitmask);
+   if (!work_pending(&adapter->unshare_watch_task))
+   queue_work(adapter->control_wq,
+  &adapter->unshare_watch_task);
+   break;
+   case EP_PARTNER_SHARED:
+   set_bit(src_epid, &hw->epstop_req_bit);
+
+   if (!work_pending(&hw->epstop_task))
+   queue_work(adapter->control_wq, &hw->epstop_task);
+   break;
+   }
+}
+
 static void fjes_update_zone_irq(struct fjes_adapter *adapter,
 int src_epid)
 {
@@ -842,6 +910,16 @@ static irqreturn_t fjes_intr(int irq, void *data)
if (icr & REG_ICTL_MASK_RX_DATA)
fjes_rx_irq(adapter, icr & REG_IS_MASK_EPID);
 
+   if (icr & REG_ICTL_MASK_DEV_STOP_REQ)
+   fjes_stop_req_irq(adapter, icr & REG_IS_MASK_EPID);
+
+   if (icr & REG_ICTL_MASK_TXRX_STOP_REQ)
+   fjes_txrx_stop_req_irq(adapter, icr & REG_IS_MASK_EPID);
+
+   if (icr & REG_ICTL_MASK_TXRX_STOP_DONE)
+   fjes_hw_set_irqmask(hw,
+   REG_ICTL_MASK_TXRX_STOP_DONE, true);
+
if (icr & REG_ICTL_MASK_INFO_UPDATE)
fjes_update_zone_irq(adapter, icr & REG_IS_MASK_EPID);
 
-- 
1.8.3.1


[PATCH v3 09/22] fjes: raise_intr_rxdata_task

2015-08-21 Thread Taku Izumi

This patch adds raise_intr_rxdata_task.
The Extended Socket Network Device is shared-memory
based, so one endpoint's transmission is another's
reception. To notify the receivers, the sender
has to raise an interrupt on each of them.
raise_intr_rxdata_task does this work.
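
As a rough user-space illustration of the notification scheme described above (not the driver code itself), the sketch below snapshots each shared peer's pending flag, clears it, and then "raises an interrupt" once per eligible peer. The names struct peer, notify_receivers, TX_NONE, TX_PENDING and irq_count are hypothetical and exist only for this example; a counter stands in for the real interrupt.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EP 8	/* hypothetical upper bound on endpoints */

enum tx_state { TX_NONE = 0, TX_PENDING = 1 };

/* Hypothetical per-peer state, loosely modelled on tx_status/rx_status. */
struct peer {
	bool shared;			/* buffer is currently shared with this peer */
	enum tx_state tx_status;	/* set to TX_PENDING by the sender */
	unsigned int rx_status;		/* non-zero while a receive stop is in progress */
	unsigned int irq_count;		/* stands in for raising a real interrupt */
};

/*
 * Phase 1: snapshot and clear the pending flag of every shared peer.
 * Phase 2: notify each peer that had data pending and is not stopping.
 * Returns the number of notifications raised.
 */
static size_t notify_receivers(struct peer *peers, size_t n, size_t self)
{
	enum tx_state snapshot[MAX_EP] = { TX_NONE };
	size_t raised = 0;
	size_t i;

	if (n > MAX_EP)
		n = MAX_EP;

	for (i = 0; i < n; i++) {
		if (i == self || !peers[i].shared)
			continue;
		snapshot[i] = peers[i].tx_status;
		if (snapshot[i] == TX_PENDING)
			peers[i].tx_status = TX_NONE;
	}

	for (i = 0; i < n; i++) {
		if (i == self)
			continue;
		if (snapshot[i] == TX_PENDING && peers[i].shared &&
		    peers[i].rx_status == 0) {
			peers[i].irq_count++;
			raised++;
		}
	}

	return raised;
}
```

Clearing the flag before notifying means a frame queued while the task runs sets the flag again and triggers another pass, which is presumably why the sender only queues the work when it is not already pending.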

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  |  4 +++
 drivers/net/fjes/fjes_main.c | 63 
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 7af4304..8e9899e 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -50,6 +50,10 @@ struct fjes_adapter {
 
bool irq_registered;
 
+   struct workqueue_struct *txrx_wq;
+
+   struct work_struct raise_intr_rxdata_task;
+
struct fjes_hw hw;
 };
 
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 220ff3d..80e180f 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -52,6 +52,7 @@ static int fjes_close(struct net_device *);
 static int fjes_setup_resources(struct fjes_adapter *);
 static void fjes_free_resources(struct fjes_adapter *);
 static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
+static void fjes_raise_intr_rxdata_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
 
 static int fjes_acpi_add(struct acpi_device *);
@@ -276,6 +277,8 @@ static int fjes_close(struct net_device *netdev)
 
fjes_free_irq(adapter);
 
+   cancel_work_sync(&adapter->raise_intr_rxdata_task);
+
fjes_hw_wait_epstop(hw);
 
fjes_free_resources(adapter);
@@ -404,6 +407,54 @@ static void fjes_free_resources(struct fjes_adapter *adapter)
}
 }
 
+static void fjes_raise_intr_rxdata_task(struct work_struct *work)
+{
+   struct fjes_adapter *adapter = container_of(work,
+   struct fjes_adapter, raise_intr_rxdata_task);
+   struct fjes_hw *hw = &adapter->hw;
+   enum ep_partner_status pstatus;
+   int max_epid, my_epid, epid;
+
+   my_epid = hw->my_epid;
+   max_epid = hw->max_epid;
+
+   for (epid = 0; epid < max_epid; epid++)
+   hw->ep_shm_info[epid].tx_status_work = 0;
+
+   for (epid = 0; epid < max_epid; epid++) {
+   if (epid == my_epid)
+   continue;
+
+   pstatus = fjes_hw_get_partner_ep_status(hw, epid);
+   if (pstatus == EP_PARTNER_SHARED) {
+   hw->ep_shm_info[epid].tx_status_work =
+   hw->ep_shm_info[epid].tx.info->v1i.tx_status;
+
+   if (hw->ep_shm_info[epid].tx_status_work ==
+   FJES_TX_DELAY_SEND_PENDING) {
+   hw->ep_shm_info[epid].tx.info->v1i.tx_status =
+   FJES_TX_DELAY_SEND_NONE;
+   }
+   }
+   }
+
+   for (epid = 0; epid < max_epid; epid++) {
+   if (epid == my_epid)
+   continue;
+
+   pstatus = fjes_hw_get_partner_ep_status(hw, epid);
+   if ((hw->ep_shm_info[epid].tx_status_work ==
+FJES_TX_DELAY_SEND_PENDING) &&
+   (pstatus == EP_PARTNER_SHARED) &&
+   !(hw->ep_shm_info[epid].rx.info->v1i.rx_status)) {
+   fjes_hw_raise_interrupt(hw, epid,
+   REG_ICTL_MASK_RX_DATA);
+   }
+   }
+
+   usleep_range(500, 1000);
+}
+
 static int fjes_tx_send(struct fjes_adapter *adapter, int dest,
void *data, size_t len)
 {
@@ -416,6 +467,9 @@ static int fjes_tx_send(struct fjes_adapter *adapter, int dest,
 
adapter->hw.ep_shm_info[dest].tx.info->v1i.tx_status =
FJES_TX_DELAY_SEND_PENDING;
+   if (!work_pending(&adapter->raise_intr_rxdata_task))
+   queue_work(adapter->txrx_wq,
+  &adapter->raise_intr_rxdata_task);
 
retval = 0;
return retval;
@@ -630,6 +684,11 @@ static int fjes_probe(struct platform_device *plat_dev)
adapter->force_reset = false;
adapter->open_guard = false;
 
+   adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
+
+   INIT_WORK(&adapter->raise_intr_rxdata_task,
+ fjes_raise_intr_rxdata_task);
+
res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
hw->hw_res.start = res->start;
hw->hw_res.size = res->end - res->start + 1;
@@ -669,6 +728,10 @@ static int fjes_remove(struct platform_device *plat_dev)
struct fjes_adapter *adapter = netdev_priv(netdev);
struct fjes_hw *hw = &adapter->hw;
 
+   cancel_work_sync(&adapter->raise_intr_rxdata_task);
+   if (adapter->txrx_wq)
+   destroy_workqueue(adapter->txrx_wq);
+
unregister_netdev(netdev);
 
fjes_hw_exit(hw);
-- 
1.8.3.1


[PATCH v3 02/22] fjes: Hardware initialization routine

2015-08-21 Thread Taku Izumi

This patch adds the hardware initialization routine, which is
invoked from the driver's .probe routine.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/Makefile|   2 +-
 drivers/net/fjes/fjes.h  |   1 +
 drivers/net/fjes/fjes_hw.c   | 295 +++
 drivers/net/fjes/fjes_hw.h   | 251 
 drivers/net/fjes/fjes_regs.h | 102 +++
 5 files changed, 650 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/fjes/fjes_hw.c
 create mode 100644 drivers/net/fjes/fjes_hw.h
 create mode 100644 drivers/net/fjes/fjes_regs.h

diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
index 34bccba..753d52f 100644
--- a/drivers/net/fjes/Makefile
+++ b/drivers/net/fjes/Makefile
@@ -27,4 +27,4 @@
 
 obj-$(CONFIG_FUJITSU_ES) += fjes.o
 
-fjes-objs := fjes_main.o
+fjes-objs := fjes_main.o fjes_hw.o
diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 52eb60b..15ded96 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -28,5 +28,6 @@
 
 extern char fjes_driver_name[];
 extern char fjes_driver_version[];
+extern const u32 fjes_support_mtu[];
 
 #endif /* FJES_H_ */
diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
new file mode 100644
index 000..ae26638
--- /dev/null
+++ b/drivers/net/fjes/fjes_hw.c
@@ -0,0 +1,295 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see http://www.gnu.org/licenses/.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called COPYING.
+ *
+ */
+
+#include "fjes_hw.h"
+#include "fjes.h"
+
+/* supported MTU list */
+const u32 fjes_support_mtu[] = {
+   FJES_MTU_DEFINE(8 * 1024),
+   FJES_MTU_DEFINE(16 * 1024),
+   FJES_MTU_DEFINE(32 * 1024),
+   FJES_MTU_DEFINE(64 * 1024),
+   0
+};
+
+u32 fjes_hw_rd32(struct fjes_hw *hw, u32 reg)
+{
+   u8 *base = hw->base;
+   u32 value = 0;
+
+   value = readl(&base[reg]);
+
+   return value;
+}
+
+static u8 *fjes_hw_iomap(struct fjes_hw *hw)
+{
+   u8 *base;
+
+   if (!request_mem_region(hw->hw_res.start, hw->hw_res.size,
+   fjes_driver_name)) {
+   pr_err("request_mem_region failed\n");
+   return NULL;
+   }
+
+   base = (u8 *)ioremap_nocache(hw->hw_res.start, hw->hw_res.size);
+
+   return base;
+}
+
+int fjes_hw_reset(struct fjes_hw *hw)
+{
+   union REG_DCTL dctl;
+   int timeout;
+
+   dctl.reg = 0;
+   dctl.bits.reset = 1;
+   wr32(XSCT_DCTL, dctl.reg);
+
+   timeout = FJES_DEVICE_RESET_TIMEOUT * 1000;
+   dctl.reg = rd32(XSCT_DCTL);
+   while ((dctl.bits.reset == 1) && (timeout > 0)) {
+   msleep(1000);
+   dctl.reg = rd32(XSCT_DCTL);
+   timeout -= 1000;
+   }
+
+   return timeout > 0 ? 0 : -EIO;
+}
+
+static int fjes_hw_get_max_epid(struct fjes_hw *hw)
+{
+   union REG_MAX_EP info;
+
+   info.reg = rd32(XSCT_MAX_EP);
+
+   return info.bits.maxep;
+}
+
+static int fjes_hw_get_my_epid(struct fjes_hw *hw)
+{
+   union REG_OWNER_EPID info;
+
+   info.reg = rd32(XSCT_OWNER_EPID);
+
+   return info.bits.epid;
+}
+
+static int fjes_hw_alloc_shared_status_region(struct fjes_hw *hw)
+{
+   size_t size;
+
+   size = sizeof(struct fjes_device_shared_info) +
+   (sizeof(u8) * hw->max_epid);
+   hw->hw_info.share = kzalloc(size, GFP_KERNEL);
+   if (!hw->hw_info.share)
+   return -ENOMEM;
+
+   hw->hw_info.share->epnum = hw->max_epid;
+
+   return 0;
+}
+
+static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh)
+{
+   void *mem;
+
+   mem = vzalloc(EP_BUFFER_SIZE);
+   if (!mem)
+   return -ENOMEM;
+
+   epbh->buffer = mem;
+   epbh->size = EP_BUFFER_SIZE;
+
+   epbh->info = (union ep_buffer_info *)mem;
+   epbh->ring = (u8 *)(mem + sizeof(union ep_buffer_info));
+
+   return 0;
+}
+
+void fjes_hw_setup_epbuf(struct epbuf_handler *epbh, u8 *mac_addr, u32 mtu)
+{
+   union ep_buffer_info *info = epbh->info;
+   u16 vlan_id[EP_BUFFER_SUPPORT_VLAN_MAX];
+   int i;
+
+   for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++)
+   vlan_id[i] = info->v1i.vlan_id[i];
+
+   memset(info, 0, sizeof(union ep_buffer_info));
+
+   info->v1i.version = 0;  /* version 0 */

[PATCH v3 08/22] fjes: net_device_ops.ndo_start_xmit

2015-08-21 Thread Taku Izumi

This patch adds net_device_ops.ndo_start_xmit callback,
which is called when sending packets.
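
The transmit path in this patch relies on the ring-index macros it adds to fjes_hw.h. The sketch below copies those three macros into a small user-space ring to show how they behave. The struct ring, ring_init and ring_send names, the slot sizes, and the starting values head = 0 and tail = 1 are assumptions made for this example and are not taken from the patch.

```c
#include <errno.h>
#include <string.h>

/* Ring-index macros as added to fjes_hw.h by this patch. */
#define EP_RING_INDEX(_num, _max) (((_num) + (_max)) % (_max))
#define EP_RING_INDEX_INC(_num, _max) \
	((_num) = EP_RING_INDEX((_num) + 1, (_max)))
#define EP_RING_FULL(_head, _tail, _max) \
	(0 == EP_RING_INDEX(((_tail) - (_head)), (_max)))

#define SLOT_SIZE  16	/* hypothetical frame slot size */
#define SLOT_COUNT 4	/* hypothetical number of slots */

/* Hypothetical stand-in for the shared transmit buffer. */
struct ring {
	unsigned int head;
	unsigned int tail;
	unsigned int count_max;
	unsigned char slots[SLOT_COUNT][SLOT_SIZE];
};

/* Assumption: tail starts one ahead of head, so slot (tail - 1) is written next. */
static void ring_init(struct ring *r)
{
	memset(r, 0, sizeof(*r));
	r->head = 0;
	r->tail = 1;
	r->count_max = SLOT_COUNT;
}

/* Copy one frame into the next slot, or fail with -ENOBUFS if the ring is full. */
static int ring_send(struct ring *r, const void *frame, size_t size)
{
	if (size > SLOT_SIZE)
		return -EINVAL;

	if (EP_RING_FULL(r->head, r->tail, r->count_max))
		return -ENOBUFS;

	memcpy(r->slots[EP_RING_INDEX(r->tail - 1, r->count_max)], frame, size);
	EP_RING_INDEX_INC(r->tail, r->count_max);

	return 0;
}
```

Under these assumptions a ring of four slots accepts three frames before reporting -ENOBUFS, because one position is given up to tell a full ring from a non-full one.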

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  |   1 +
 drivers/net/fjes/fjes_hw.c   |  55 ++
 drivers/net/fjes/fjes_hw.h   |  12 +++
 drivers/net/fjes/fjes_main.c | 177 +++
 4 files changed, 245 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index f182ed3..7af4304 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -29,6 +29,7 @@
 #define FJES_ACPI_SYMBOL   "Extended Socket"
 #define FJES_MAX_QUEUES    1
 #define FJES_TX_RETRY_INTERVAL (20 * HZ)
+#define FJES_TX_RETRY_TIMEOUT  (100)
 #define FJES_OPEN_ZONE_UPDATE_WAIT (300) /* msec */
 
 /* board specific private data structure */
diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 1935f48..487dbc6 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -791,3 +791,58 @@ int fjes_hw_wait_epstop(struct fjes_hw *hw)
return (wait_time < FJES_COMMAND_EPSTOP_WAIT_TIMEOUT * 1000)
? 0 : -EBUSY;
 }
+
+bool fjes_hw_check_epbuf_version(struct epbuf_handler *epbh, u32 version)
+{
+   union ep_buffer_info *info = epbh->info;
+
+   return (info->common.version == version);
+}
+
+bool fjes_hw_check_mtu(struct epbuf_handler *epbh, u32 mtu)
+{
+   union ep_buffer_info *info = epbh->info;
+
+   return (info->v1i.frame_max == FJES_MTU_TO_FRAME_SIZE(mtu));
+}
+
+bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
+{
+   union ep_buffer_info *info = epbh->info;
+   bool ret = false;
+   int i;
+
+   if (vlan_id == 0) {
+   ret = true;
+   } else {
+   for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
+   if (vlan_id == info->v1i.vlan_id[i]) {
+   ret = true;
+   break;
+   }
+   }
+   }
+   return ret;
+}
+
+int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *epbh,
+ void *frame, size_t size)
+{
+   union ep_buffer_info *info = epbh->info;
+   struct esmem_frame *ring_frame;
+
+   if (EP_RING_FULL(info->v1i.head, info->v1i.tail, info->v1i.count_max))
+   return -ENOBUFS;
+
+   ring_frame = (struct esmem_frame *)&(epbh->ring[EP_RING_INDEX
+(info->v1i.tail - 1,
+ info->v1i.count_max) *
+info->v1i.frame_max]);
+
+   ring_frame->frame_size = size;
+   memcpy((void *)(ring_frame->frame_data), (void *)frame, size);
+
+   EP_RING_INDEX_INC(epbh->info->v1i.tail, info->v1i.count_max);
+
+   return 0;
+}
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 9b8df55..07e1226 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -50,6 +50,9 @@ struct fjes_hw;
 
 #define FJES_ZONING_ZONE_TYPE_NONE (0xFF)
 
+#define FJES_TX_DELAY_SEND_NONE    (0)
+#define FJES_TX_DELAY_SEND_PENDING (1)
+
 #define FJES_RX_STOP_REQ_NONE  (0x0)
 #define FJES_RX_STOP_REQ_DONE  (0x1)
 #define FJES_RX_STOP_REQ_REQUEST   (0x2)
@@ -61,6 +64,11 @@ struct fjes_hw;
 
 #define EP_RING_NUM(buffer_size, frame_size) \
(u32)((buffer_size) / (frame_size))
+#define EP_RING_INDEX(_num, _max) (((_num) + (_max)) % (_max))
+#define EP_RING_INDEX_INC(_num, _max) \
+   ((_num) = EP_RING_INDEX((_num) + 1, (_max)))
+#define EP_RING_FULL(_head, _tail, _max)   \
+   (0 == EP_RING_INDEX(((_tail) - (_head)), (_max)))
 
 #define FJES_MTU_TO_BUFFER_SIZE(mtu) \
(ETH_HLEN + VLAN_HLEN + (mtu) + ETH_FCS_LEN)
@@ -309,5 +317,9 @@ enum ep_partner_status
 
 bool fjes_hw_epid_is_same_zone(struct fjes_hw *, int);
 int fjes_hw_epid_is_shared(struct fjes_device_shared_info *, int);
+bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32);
+bool fjes_hw_check_mtu(struct epbuf_handler *, u32);
+bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16);
+int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *, void *, size_t);
 
 #endif /* FJES_HW_H_ */
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index bd50cbd..220ff3d 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -51,6 +51,7 @@ static int fjes_open(struct net_device *);
 static int fjes_close(struct net_device *);
 static int fjes_setup_resources(struct fjes_adapter *);
 static void fjes_free_resources(struct fjes_adapter *);
+static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
 static irqreturn_t fjes_intr(int, void*);
 
 static int fjes_acpi_add(struct acpi_device *);
@@ -212,6 +213,7 @@ static void fjes_free_irq(struct fjes_adapter *adapter)
 static const struct net_device_ops

[PATCH v3 06/22] fjes: buffer address regist/unregistration routine

2015-08-21 Thread Taku Izumi

This patch adds the buffer address registration/unregistration routines.

These functions are mainly invoked on network device
activation (open) and deactivation (close)
in order to register/unregister the shared buffer address.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c | 186 +
 drivers/net/fjes/fjes_hw.h |   9 ++-
 2 files changed, 194 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index c31be7f..1e807df 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -452,6 +452,192 @@ int fjes_hw_request_info(struct fjes_hw *hw)
return result;
 }
 
+int fjes_hw_register_buff_addr(struct fjes_hw *hw, int dest_epid,
+  struct ep_share_mem_info *buf_pair)
+{
+   union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
+   union fjes_device_command_res *res_buf = hw->hw_info.res_buf;
+   enum fjes_dev_command_response_e ret;
+   int page_count;
+   int timeout;
+   int i, idx;
+   void *addr;
+   int result;
+
+   if (test_bit(dest_epid, &hw->hw_info.buffer_share_bit))
+   return 0;
+
+   memset(req_buf, 0, hw->hw_info.req_buf_size);
+   memset(res_buf, 0, hw->hw_info.res_buf_size);
+
+   req_buf->share_buffer.length = FJES_DEV_COMMAND_SHARE_BUFFER_REQ_LEN(
+   buf_pair->tx.size,
+   buf_pair->rx.size);
+   req_buf->share_buffer.epid = dest_epid;
+
+   idx = 0;
+   req_buf->share_buffer.buffer[idx++] = buf_pair->tx.size;
+   page_count = buf_pair->tx.size / EP_BUFFER_INFO_SIZE;
+   for (i = 0; i < page_count; i++) {
+   addr = ((u8 *)(buf_pair->tx.buffer)) +
+   (i * EP_BUFFER_INFO_SIZE);
+   req_buf->share_buffer.buffer[idx++] =
+   (__le64)(page_to_phys(vmalloc_to_page(addr)) +
+   offset_in_page(addr));
+   }
+
+   req_buf->share_buffer.buffer[idx++] = buf_pair->rx.size;
+   page_count = buf_pair->rx.size / EP_BUFFER_INFO_SIZE;
+   for (i = 0; i < page_count; i++) {
+   addr = ((u8 *)(buf_pair->rx.buffer)) +
+   (i * EP_BUFFER_INFO_SIZE);
+   req_buf->share_buffer.buffer[idx++] =
+   (__le64)(page_to_phys(vmalloc_to_page(addr)) +
+   offset_in_page(addr));
+   }
+
+   res_buf->share_buffer.length = 0;
+   res_buf->share_buffer.code = 0;
+
+   ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_SHARE_BUFFER);
+
+   timeout = FJES_COMMAND_REQ_BUFF_TIMEOUT * 1000;
+   while ((ret == FJES_CMD_STATUS_NORMAL) &&
+  (res_buf->share_buffer.length ==
+   FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN) &&
+  (res_buf->share_buffer.code == FJES_CMD_REQ_RES_CODE_BUSY) &&
+  (timeout > 0)) {
+   msleep(200 + hw->my_epid * 20);
+   timeout -= (200 + hw->my_epid * 20);
+
+   res_buf->share_buffer.length = 0;
+   res_buf->share_buffer.code = 0;
+
+   ret = fjes_hw_issue_request_command(
+   hw, FJES_CMD_REQ_SHARE_BUFFER);
+   }
+
+   result = 0;
+
+   if (res_buf->share_buffer.length !=
+   FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN)
+   result = -ENOMSG;
+   else if (ret == FJES_CMD_STATUS_NORMAL) {
+   switch (res_buf->share_buffer.code) {
+   case FJES_CMD_REQ_RES_CODE_NORMAL:
+   result = 0;
+   set_bit(dest_epid, &hw->hw_info.buffer_share_bit);
+   break;
+   case FJES_CMD_REQ_RES_CODE_BUSY:
+   result = -EBUSY;
+   break;
+   default:
+   result = -EPERM;
+   break;
+   }
+   } else {
+   switch (ret) {
+   case FJES_CMD_STATUS_UNKNOWN:
+   result = -EPERM;
+   break;
+   case FJES_CMD_STATUS_TIMEOUT:
+   result = -EBUSY;
+   break;
+   case FJES_CMD_STATUS_ERROR_PARAM:
+   case FJES_CMD_STATUS_ERROR_STATUS:
+   default:
+   result = -EPERM;
+   break;
+   }
+   }
+
+   return result;
+}
+
+int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid)
+{
+   union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
+   union fjes_device_command_res *res_buf = hw->hw_info.res_buf;
+   struct fjes_device_shared_info *share = hw->hw_info.share;
+   enum fjes_dev_command_response_e ret;
+

[PATCH v3 22/22] fjes: ethtool support

2015-08-21 Thread Taku Izumi

This patch adds ethtool support.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/Makefile   |   2 +-
 drivers/net/fjes/fjes.h |   2 +
 drivers/net/fjes/fjes_ethtool.c | 137 
 drivers/net/fjes/fjes_main.c|   1 +
 4 files changed, 141 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/fjes/fjes_ethtool.c

diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
index 753d52f..523e3d7 100644
--- a/drivers/net/fjes/Makefile
+++ b/drivers/net/fjes/Makefile
@@ -27,4 +27,4 @@
 
 obj-$(CONFIG_FUJITSU_ES) += fjes.o
 
-fjes-objs := fjes_main.o fjes_hw.o
+fjes-objs := fjes_main.o fjes_hw.o fjes_ethtool.o
diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 57feee8..a592fe2 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -72,4 +72,6 @@ extern char fjes_driver_name[];
 extern char fjes_driver_version[];
 extern const u32 fjes_support_mtu[];
 
+void fjes_set_ethtool_ops(struct net_device *);
+
 #endif /* FJES_H_ */
diff --git a/drivers/net/fjes/fjes_ethtool.c b/drivers/net/fjes/fjes_ethtool.c
new file mode 100644
index 000..0119dd1
--- /dev/null
+++ b/drivers/net/fjes/fjes_ethtool.c
@@ -0,0 +1,137 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see http://www.gnu.org/licenses/.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called COPYING.
+ *
+ */
+
+/* ethtool support for fjes */
+
+#include <linux/vmalloc.h>
+#include <linux/netdevice.h>
+#include <linux/ethtool.h>
+#include <linux/platform_device.h>
+
+#include "fjes.h"
+
+struct fjes_stats {
+   char stat_string[ETH_GSTRING_LEN];
+   int sizeof_stat;
+   int stat_offset;
+};
+
+#define FJES_STAT(name, stat) { \
+   .stat_string = name, \
+   .sizeof_stat = FIELD_SIZEOF(struct fjes_adapter, stat), \
+   .stat_offset = offsetof(struct fjes_adapter, stat) \
+}
+
+static const struct fjes_stats fjes_gstrings_stats[] = {
+   FJES_STAT("rx_packets", stats64.rx_packets),
+   FJES_STAT("tx_packets", stats64.tx_packets),
+   FJES_STAT("rx_bytes", stats64.rx_bytes),
+   FJES_STAT("tx_bytes", stats64.tx_bytes),
+   FJES_STAT("rx_dropped", stats64.rx_dropped),
+   FJES_STAT("tx_dropped", stats64.tx_dropped),
+};
+
+static void fjes_get_ethtool_stats(struct net_device *netdev,
+  struct ethtool_stats *stats, u64 *data)
+{
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   char *p;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(fjes_gstrings_stats); i++) {
+   p = (char *)adapter + fjes_gstrings_stats[i].stat_offset;
+   data[i] = (fjes_gstrings_stats[i].sizeof_stat == sizeof(u64))
+   ? *(u64 *)p : *(u32 *)p;
+   }
+}
+
+static void fjes_get_strings(struct net_device *netdev,
+u32 stringset, u8 *data)
+{
+   u8 *p = data;
+   int i;
+
+   switch (stringset) {
+   case ETH_SS_STATS:
+   for (i = 0; i < ARRAY_SIZE(fjes_gstrings_stats); i++) {
+   memcpy(p, fjes_gstrings_stats[i].stat_string,
+  ETH_GSTRING_LEN);
+   p += ETH_GSTRING_LEN;
+   }
+   break;
+   }
+}
+
+static int fjes_get_sset_count(struct net_device *netdev, int sset)
+{
+   switch (sset) {
+   case ETH_SS_STATS:
+   return ARRAY_SIZE(fjes_gstrings_stats);
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static void fjes_get_drvinfo(struct net_device *netdev,
+struct ethtool_drvinfo *drvinfo)
+{
+   struct fjes_adapter *adapter = netdev_priv(netdev);
+   struct platform_device *plat_dev;
+
+   plat_dev = adapter->plat_dev;
+
+   strlcpy(drvinfo->driver, fjes_driver_name, sizeof(drvinfo->driver));
+   strlcpy(drvinfo->version, fjes_driver_version,
+   sizeof(drvinfo->version));
+
+   strlcpy(drvinfo->fw_version, "none", sizeof(drvinfo->fw_version));
+   snprintf(drvinfo->bus_info, sizeof(drvinfo->bus_info),
+"platform:%s", plat_dev->name);
+   drvinfo->regdump_len = 0;
+   drvinfo->eedump_len = 0;
+}
+
+static int fjes_get_settings(struct net_device *netdev,
+struct ethtool_cmd *ecmd)

[PATCH net-next 1/1] sfc: Allow driver to cope with a lower number of VIs than it needs for RSS

2015-08-21 Thread Shradha Shah

Previously, the driver would refuse to load if it couldn't secure
enough VIs from the MC to fulfill its RSS requirements.
This caused probe to fail on later functions in configurations
where we ran out of VIs, such as those with many VFs.

This change allows the driver to load with fewer VIs, down to a
minimum of 2. A warning will be printed saying that RSS
requirements were not met, possibly affecting performance.

efx->max_tx_channels needs to be set to avoid going down the
failure path in efx_probe_nic() immediately in the loop after the
probe() NIC-type function.
Also, set rc=ENOSPC when bailing out of efx_probe_nic() due to lack
of VIs.
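
The retry behaviour described above can be modelled roughly as follows. This is a simplified user-space sketch, not the sfc code: struct nic, dimension_resources, probe_nic, umin, umax and the available_vis field are hypothetical, and TXQ_TYPES is set to an illustrative value rather than the driver's EFX_TXQ_TYPES.

```c
#include <errno.h>

#define TXQ_TYPES 2	/* illustrative value only */

/* Hypothetical model of the fields involved in the negotiation. */
struct nic {
	unsigned int max_channels;
	unsigned int max_tx_channels;
	unsigned int n_channels;
	unsigned int n_tx_channels;
	unsigned int available_vis;	/* what the MC is able to hand out */
	unsigned int allocated_vis;
};

static unsigned int umin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int umax(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Ask for VIs; on a shortfall, lower the limits and request a retry. */
static int dimension_resources(struct nic *n)
{
	unsigned int channel_vis = umax(n->n_channels,
					n->n_tx_channels * TXQ_TYPES);

	n->allocated_vis = umin(channel_vis, n->available_vis);
	if (n->allocated_vis < channel_vis) {
		n->max_channels = n->allocated_vis;
		n->max_tx_channels = n->allocated_vis / TXQ_TYPES;
		n->allocated_vis = 0;	/* free what we got before retrying */
		return -EAGAIN;
	}

	return 0;
}

/* Re-derive the channel counts from the current limits until they fit. */
static int probe_nic(struct nic *n, unsigned int wanted_channels)
{
	int rc;

	do {
		n->n_channels = umin(wanted_channels, n->max_channels);
		n->n_tx_channels = umin(n->n_channels, n->max_tx_channels);
		if (n->n_tx_channels == 0)
			return -ENOSPC;	/* not even one TX channel fits */

		rc = dimension_resources(n);
	} while (rc == -EAGAIN);

	return rc;
}
```

Each retry lowers the limits to what was actually granted, so in this model the loop either converges on a smaller channel count or ends with -ENOSPC.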

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 drivers/net/ethernet/sfc/ef10.c   | 38 ++
 drivers/net/ethernet/sfc/efx.c| 44 +--
 drivers/net/ethernet/sfc/efx.h|  1 +
 drivers/net/ethernet/sfc/falcon.c |  1 +
 drivers/net/ethernet/sfc/net_driver.h |  1 +
 drivers/net/ethernet/sfc/siena.c  |  1 +
 6 files changed, 64 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 06b8061..99e3510 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -295,11 +295,11 @@ static int efx_ef10_probe(struct efx_nic *efx)
/* We can have one VI for each 8K region.  However, until we
 * use TX option descriptors we need two TX queues per channel.
 */
-   efx->max_channels =
-   min_t(unsigned int,
- EFX_MAX_CHANNELS,
- efx_ef10_mem_map_size(efx) /
- (EFX_VI_PAGE_SIZE * EFX_TXQ_TYPES));
+   efx->max_channels = min_t(unsigned int,
+ EFX_MAX_CHANNELS,
+ efx_ef10_mem_map_size(efx) /
+ (EFX_VI_PAGE_SIZE * EFX_TXQ_TYPES));
+   efx->max_tx_channels = efx->max_channels;
if (WARN_ON(efx->max_channels == 0))
return -EIO;
 
@@ -824,11 +824,12 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 {
struct efx_ef10_nic_data *nic_data = efx->nic_data;
unsigned int uc_mem_map_size, wc_mem_map_size;
-   unsigned int min_vis, pio_write_vi_base, max_vis;
+   unsigned int min_vis = max(EFX_TXQ_TYPES, separate_tx_channels ? 2 : 1);
+   unsigned int channel_vis, pio_write_vi_base, max_vis;
void __iomem *membase;
int rc;
 
-   min_vis = max(efx->n_channels, efx->n_tx_channels * EFX_TXQ_TYPES);
+   channel_vis = max(efx->n_channels, efx->n_tx_channels * EFX_TXQ_TYPES);
 
 #ifdef EFX_USE_PIO
/* Try to allocate PIO buffers if wanted and if the full
@@ -862,11 +863,11 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 * page size is 4K).  So we may allocate some extra VIs just
 * for writing PIO buffers through.
 *
-* The UC mapping contains (min_vis - 1) complete VIs and the
+* The UC mapping contains (channel_vis - 1) complete VIs and the
 * first half of the next VI.  Then the WC mapping begins with
 * the second half of this last VI.
 */
-   uc_mem_map_size = PAGE_ALIGN((min_vis - 1) * EFX_VI_PAGE_SIZE +
+   uc_mem_map_size = PAGE_ALIGN((channel_vis - 1) * EFX_VI_PAGE_SIZE +
 ER_DZ_TX_PIOBUF);
if (nic_data->n_piobufs) {
/* pio_write_vi_base rounds down to give the number of complete
@@ -881,7 +882,7 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
} else {
pio_write_vi_base = 0;
wc_mem_map_size = 0;
-   max_vis = min_vis;
+   max_vis = channel_vis;
}
 
/* In case the last attached driver failed to free VIs, do it now */
@@ -893,6 +894,23 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
if (rc != 0)
return rc;
 
+   if (nic_data->n_allocated_vis < channel_vis) {
+   netif_info(efx, drv, efx->net_dev,
+  "Could not allocate enough VIs to satisfy RSS"
+  " requirements. Performance may not be optimal.\n");
+   /* We didn't get the VIs to populate our channels.
+* We could keep what we got but then we'd have more
+* interrupts than we need.
+* Instead calculate new max_channels and restart
+*/
+   efx->max_channels = nic_data->n_allocated_vis;
+   efx->max_tx_channels =
+   nic_data->n_allocated_vis / EFX_TXQ_TYPES;
+
+   efx_ef10_free_vis(efx);
+   return -EAGAIN;
+   }
+
/* If we didn't get enough VIs to map all the PIO buffers, free the
 * PIO buffers
 */
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index

[PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters

2015-08-21 Thread Corinna Vinschen

The r8169 driver collects statistical information returned by
@get_stats64 by counting them in the driver itself, even though many
(but not all) of the values are already collected by tally counters
(TCs) in the NIC.  Some of these TC values are not returned by
@get_stats64.  In particular, the received multicast packets are missing
from /proc/net/dev.

Rectify this by fetching the TCs and returning them from
rtl8169_get_stats64.

The counters collected in the driver obviously disappear as soon as the
driver is unloaded so after a driver is loaded the counters always start
at 0. The TCs on the other hand are only reset by a power cycle.  Without
further considerations the values collected by the driver would not match
up against the TC values.

This patch introduces a new function rtl8169_reset_counters which
resets the TCs.

Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow
resetting the TCs programmatically.  Therefore introduce an addition to
the rtl8169_private struct and a function rtl8169_init_counter_offsets
to store the TCs at first rtl_open.  Use these values as offsets in
rtl8169_get_stats64.
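
The offset approach can be sketched in plain C as follows. This is a simplified illustration under stated assumptions, not the driver code: struct hw_tally, struct tally_offsets and the helper names are hypothetical, only two counters are modelled, and the little-endian conversions the real tally counters need are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of two hardware tally counters. */
struct hw_tally {
	uint64_t tx_errors;
	uint32_t rx_multicast;
};

/* Values recorded at first open and subtracted on every later read. */
struct tally_offsets {
	bool inited;
	uint64_t tx_errors;
	uint32_t rx_multicast;
};

/* Record the current hardware values once; later calls are no-ops. */
static void init_offsets(struct tally_offsets *off, const struct hw_tally *hw)
{
	if (off->inited)
		return;

	off->tx_errors = hw->tx_errors;
	off->rx_multicast = hw->rx_multicast;
	off->inited = true;
}

/* Unsigned subtraction keeps the result correct across a counter wrap. */
static uint32_t rx_multicast_since_open(const struct tally_offsets *off,
					const struct hw_tally *hw)
{
	return (uint32_t)(hw->rx_multicast - off->rx_multicast);
}

static uint64_t tx_errors_since_open(const struct tally_offsets *off,
				     const struct hw_tally *hw)
{
	return hw->tx_errors - off->tx_errors;
}
```

On chips where the reset succeeds, the recorded offsets are simply zero (or close to it), so the same subtraction works for both cases.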

Signed-off-by: Corinna Vinschen vinsc...@redhat.com
---
 drivers/net/ethernet/realtek/r8169.c | 107 +++
 1 file changed, 107 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index f790f61..f26a48d 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -637,6 +637,9 @@ enum rtl_register_content {
/* _TBICSRBit */
TBILinkOK   = 0x0200,
 
+   /* ResetCounterCommand */
+   CounterReset    = 0x1,
+
/* DumpCounterCommand */
CounterDump = 0x8,
 
@@ -747,6 +750,14 @@ struct rtl8169_counters {
__le16  tx_underun;
 };
 
+struct rtl8169_tc_offsets {
+   bool    inited;
+   __le64  tx_errors;
+   __le32  tx_multi_collision;
+   __le32  rx_multicast;
+   __le16  tx_aborted;
+};
+
 enum rtl_flag {
RTL_FLAG_TASK_ENABLED,
RTL_FLAG_TASK_SLOW_PENDING,
@@ -824,6 +835,7 @@ struct rtl8169_private {
 
struct mii_if_info mii;
struct rtl8169_counters counters;
+   struct rtl8169_tc_offsets tc_offset;
u32 saved_wolopts;
u32 opts1_mask;
 
@@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset)
}
 }
 
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+   void __iomem *ioaddr = tp->mmio_addr;
+
+   return RTL_R32(CounterAddrLow) & CounterReset;
+}
+
+static void rtl8169_reset_counters(struct net_device *dev)
+{
+   struct rtl8169_private *tp = netdev_priv(dev);
+   void __iomem *ioaddr = tp->mmio_addr;
+   struct device *d = &tp->pci_dev->dev;
+   struct rtl8169_counters *counters;
+   dma_addr_t paddr;
+   u32 cmd;
+
+   /*
+* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
+* tally counters.
+*/
+   if (tp->mac_version < RTL_GIGA_MAC_VER_19)
+   return;
+
+   counters = dma_alloc_coherent(d, sizeof(*counters), &paddr, GFP_KERNEL);
+   if (!counters)
+   return;
+
+   RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+   cmd = (u64)paddr & DMA_BIT_MASK(32);
+   RTL_W32(CounterAddrLow, cmd);
+   RTL_W32(CounterAddrLow, cmd | CounterReset);
+
+   if (!rtl_udelay_loop_wait_low(tp, &rtl_reset_counters_cond, 10, 1000))
+   netif_warn(tp, hw, dev, "counter reset failed\n");
+
+   RTL_W32(CounterAddrLow, 0);
+   RTL_W32(CounterAddrHigh, 0);
+
+   dma_free_coherent(d, sizeof(*counters), counters, paddr);
+}
+
 DECLARE_RTL_COND(rtl_counters_cond)
 {
 	void __iomem *ioaddr = tp->mmio_addr;
@@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device *dev)
 	dma_free_coherent(d, sizeof(*counters), counters, paddr);
 }
 
+static void rtl8169_init_counter_offsets(struct net_device *dev)
+{
+	struct rtl8169_private *tp = netdev_priv(dev);
+
+	/*
+	 * rtl8169_init_counter_offsets is called from rtl_open.  On chip
+	 * versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
+	 * reset by a power cycle, while the counter values collected by the
+	 * driver are reset at every driver unload/load cycle.
+	 *
+	 * To make sure the HW values returned by @get_stats64 match the SW
+	 * values, we collect the initial values at first open(*) and use them
+	 * as offsets to normalize the values returned by @get_stats64.
+	 *
+	 * (*) We can't call rtl8169_init_counter_offsets from rtl_init_one
+	 * for the reason stated in rtl8169_update_counters; CmdRxEnb is only
+	 * set at open time by rtl_hw_start.
+	 */
+
+	if (tp->tc_offset.inited)
+		return;
+
+	rtl8169_reset_counters(dev);
+
+	rtl8169_update_counters(dev);
+
+	tp->tc_offset.tx_errors =
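
The offset scheme described in the comment above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the names tally, tally_offsets, offsets_init_once and tally_normalize are hypothetical and do not appear in r8169.c, and the real driver obtains the counters from the hardware via DMA rather than from a struct passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the hardware tally counters. */
struct tally {
	uint64_t tx_errors;
	uint32_t tx_multi_collision;
	uint16_t tx_aborted;
};

/* Baseline captured at first open, mirroring the idea behind tc_offset. */
struct tally_offsets {
	bool inited;
	struct tally base;
};

/* Capture the baseline once; later calls leave it untouched. */
static void offsets_init_once(struct tally_offsets *off, const struct tally *hw)
{
	if (off->inited)
		return;
	off->base = *hw;
	off->inited = true;
}

/* Report each counter relative to the baseline (modular unsigned math). */
static struct tally tally_normalize(const struct tally_offsets *off,
				    const struct tally *hw)
{
	struct tally out;

	out.tx_errors = hw->tx_errors - off->base.tx_errors;
	out.tx_multi_collision = hw->tx_multi_collision -
				 off->base.tx_multi_collision;
	out.tx_aborted = (uint16_t)(hw->tx_aborted - off->base.tx_aborted);
	return out;
}

/* Usage example: counters that were non-zero at first open start from zero. */
static void tally_demo(void)
{
	struct tally_offsets off = { .inited = false };
	struct tally hw = { .tx_errors = 100, .tx_multi_collision = 7,
			    .tx_aborted = 3 };
	struct tally out;

	offsets_init_once(&off, &hw);
	hw.tx_errors = 105;
	out = tally_normalize(&off, &hw);
	assert(out.tx_errors == 5);
	assert(out.tx_multi_collision == 0);
}
```

Because the subtraction is done on unsigned types of the counter's own width, the reported delta stays correct even if a narrow hardware counter wraps once after the baseline was taken.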

Re: [PATCH] lib/Makefile: remove CONFIG_AVERAGE build rule

2015-08-21 Thread Johannes Berg

On Fri, 2015-08-21 at 10:05 +, Valentin Rothberg wrote:
> The Kconfig option AVERAGE and its implementation has been removed by
> commit f4e774f55fe0 ("average: remove out-of-line implementation").
> Remove the dead build rule in lib/Makefile.

D'oh, sorry about that.

Reviewed-by: Johannes Berg johan...@sipsolutions.net

[reproducing patch in full for netdev]

> Signed-off-by: Valentin Rothberg valentinrothb...@gmail.com
> ---
> I detected the issue with scripts/checkkconfigsymbols.py
>
>  lib/Makefile | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 51e1d761f0b9..f32d342b75de 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -143,8 +143,6 @@ obj-$(CONFIG_GENERIC_ATOMIC64) += atomic64.o
>
>  obj-$(CONFIG_ATOMIC64_SELFTEST) += atomic64_test.o
>
> -obj-$(CONFIG_AVERAGE) += average.o
> -
>  obj-$(CONFIG_CPU_RMAP) += cpu_rmap.o
>
>  obj-$(CONFIG_CORDIC) += cordic.o
> --
> 1.9.1
>

[PATCH v3 15/22] fjes: net_device_ops.ndo_vlan_rx_add/kill_vid

2015-08-21 Thread Taku Izumi

This patch adds the net_device_ops.ndo_vlan_rx_add_vid and
net_device_ops.ndo_vlan_rx_kill_vid callbacks.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c   | 27 +++
 drivers/net/fjes/fjes_hw.h   |  2 ++
 drivers/net/fjes/fjes_main.c | 40 
 3 files changed, 69 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 3c96d06..4a4b750 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -825,6 +825,33 @@ bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
 	return ret;
 }
 
+bool fjes_hw_set_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
+{
+	union ep_buffer_info *info = epbh->info;
+	int i;
+
+	for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
+		if (info->v1i.vlan_id[i] == 0) {
+			info->v1i.vlan_id[i] = vlan_id;
+			return true;
+		}
+	}
+	return false;
+}
+
+void fjes_hw_del_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
+{
+	union ep_buffer_info *info = epbh->info;
+	int i;
+
+	if (0 != vlan_id) {
+		for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
+			if (vlan_id == info->v1i.vlan_id[i])
+				info->v1i.vlan_id[i] = 0;
+		}
+	}
+}
+
 bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *epbh)
 {
 	union ep_buffer_info *info = epbh->info;
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 3511db2..95e632b 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -322,6 +322,8 @@ int fjes_hw_epid_is_shared(struct fjes_device_shared_info *, int);
 bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32);
 bool fjes_hw_check_mtu(struct epbuf_handler *, u32);
 bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16);
+bool fjes_hw_set_vlan_id(struct epbuf_handler *, u16);
+void fjes_hw_del_vlan_id(struct epbuf_handler *, u16);
 bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *);
 void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *, size_t *);
 void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *);
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 94ccc11..4a4ce81 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -58,6 +58,8 @@ static irqreturn_t fjes_intr(int, void*);
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
 static int fjes_change_mtu(struct net_device *, int);
+static int fjes_vlan_rx_add_vid(struct net_device *, __be16 proto, u16);
+static int fjes_vlan_rx_kill_vid(struct net_device *, __be16 proto, u16);
 static void fjes_tx_retry(struct net_device *);
 
 static int fjes_acpi_add(struct acpi_device *);
@@ -226,6 +228,8 @@ static const struct net_device_ops fjes_netdev_ops = {
 	.ndo_get_stats64	= fjes_get_stats64,
 	.ndo_change_mtu		= fjes_change_mtu,
 	.ndo_tx_timeout		= fjes_tx_retry,
+	.ndo_vlan_rx_add_vid	= fjes_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid = fjes_vlan_rx_kill_vid,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -751,6 +755,42 @@ static int fjes_change_mtu(struct net_device *netdev, int new_mtu)
 	return -EINVAL;
 }
 
+static int fjes_vlan_rx_add_vid(struct net_device *netdev,
+				__be16 proto, u16 vid)
+{
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+	bool ret = true;
+	int epid;
+
+	for (epid = 0; epid < adapter->hw.max_epid; epid++) {
+		if (epid == adapter->hw.my_epid)
+			continue;
+
+		if (!fjes_hw_check_vlan_id(
+			&adapter->hw.ep_shm_info[epid].tx, vid))
+			ret = fjes_hw_set_vlan_id(
+				&adapter->hw.ep_shm_info[epid].tx, vid);
+	}
+
+	return ret ? 0 : -ENOSPC;
+}
+
+static int fjes_vlan_rx_kill_vid(struct net_device *netdev,
+				 __be16 proto, u16 vid)
+{
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+	int epid;
+
+	for (epid = 0; epid < adapter->hw.max_epid; epid++) {
+		if (epid == adapter->hw.my_epid)
+			continue;
+
+		fjes_hw_del_vlan_id(&adapter->hw.ep_shm_info[epid].tx, vid);
+	}
+
+	return 0;
+}
+
+
 static irqreturn_t fjes_intr(int irq, void *data)
 {
struct fjes_adapter *adapter = data;
-- 
1.8.3.1
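
The slot-table logic behind the add/kill callbacks in this patch can be tried outside the kernel with a small stand-alone sketch. Everything below is hypothetical scaffolding: vlan_table, VLAN_SLOTS and the vlan_table_* helpers are illustrative names, VLAN_SLOTS merely stands in for EP_BUFFER_SUPPORT_VLAN_MAX, and the membership check is an assumption about what fjes_hw_check_vlan_id does, since its body is not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define VLAN_SLOTS 4	/* hypothetical; stands in for EP_BUFFER_SUPPORT_VLAN_MAX */

/* Fixed-size table of VLAN IDs; 0 marks a free slot. */
struct vlan_table {
	uint16_t vlan_id[VLAN_SLOTS];
};

/* True if a non-zero vid is already stored in the table. */
static bool vlan_table_has(const struct vlan_table *t, uint16_t vid)
{
	for (int i = 0; vid != 0 && i < VLAN_SLOTS; i++)
		if (t->vlan_id[i] == vid)
			return true;
	return false;
}

/* Store vid in the first free slot unless it is already present. */
static int vlan_table_add(struct vlan_table *t, uint16_t vid)
{
	if (vlan_table_has(t, vid))
		return 0;

	for (int i = 0; i < VLAN_SLOTS; i++) {
		if (t->vlan_id[i] == 0) {
			t->vlan_id[i] = vid;
			return 0;
		}
	}
	return -ENOSPC;	/* table full, same error the callback returns */
}

/* Clear every slot holding vid; vid 0 is ignored because 0 means "free". */
static void vlan_table_del(struct vlan_table *t, uint16_t vid)
{
	if (vid == 0)
		return;

	for (int i = 0; i < VLAN_SLOTS; i++)
		if (t->vlan_id[i] == vid)
			t->vlan_id[i] = 0;
}

/* Usage example: a deleted ID frees its slot for reuse. */
static void vlan_table_demo(void)
{
	struct vlan_table t = { { 0 } };

	assert(vlan_table_add(&t, 10) == 0);
	vlan_table_del(&t, 10);
	assert(!vlan_table_has(&t, 10));
}
```

Using 0 as the free-slot marker keeps the table compact, at the cost that VLAN ID 0 itself can never be stored.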


[PATCH v3 14/22] fjes: net_device_ops.ndo_tx_timeout

2015-08-21 Thread Taku Izumi

This patch adds the net_device_ops.ndo_tx_timeout callback.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_main.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 519976c..94ccc11 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -58,6 +58,7 @@ static irqreturn_t fjes_intr(int, void*);
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
 static int fjes_change_mtu(struct net_device *, int);
+static void fjes_tx_retry(struct net_device *);
 
 static int fjes_acpi_add(struct acpi_device *);
 static int fjes_acpi_remove(struct acpi_device *);
@@ -224,6 +225,7 @@ static const struct net_device_ops fjes_netdev_ops = {
 	.ndo_start_xmit		= fjes_xmit_frame,
 	.ndo_get_stats64	= fjes_get_stats64,
 	.ndo_change_mtu		= fjes_change_mtu,
+	.ndo_tx_timeout		= fjes_tx_retry,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -705,6 +707,13 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	return ret;
 }
 
+static void fjes_tx_retry(struct net_device *netdev)
+{
+   struct netdev_queue *queue = netdev_get_tx_queue(netdev, 0);
+
+   netif_tx_wake_queue(queue);
+}
+
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
 {
-- 
1.8.3.1


[PATCH v3 16/22] fjes: interrupt_watch_task

2015-08-21 Thread Taku Izumi

This patch adds interrupt_watch_task.
This task runs periodically to keep delayed interrupts from stalling
processing.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  |  5 +
 drivers/net/fjes/fjes_main.c | 40 +++-
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index b04ea9d..1743dbb 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -32,6 +32,7 @@
 #define FJES_TX_RETRY_TIMEOUT  (100)
 #define FJES_TX_TX_STALL_TIMEOUT   (FJES_TX_RETRY_INTERVAL / 2)
 #define FJES_OPEN_ZONE_UPDATE_WAIT (300) /* msec */
+#define FJES_IRQ_WATCH_DELAY   (HZ)
 
 /* board specific private data structure */
 struct fjes_adapter {
@@ -52,10 +53,14 @@ struct fjes_adapter {
bool irq_registered;
 
struct workqueue_struct *txrx_wq;
+   struct workqueue_struct *control_wq;
 
struct work_struct tx_stall_task;
struct work_struct raise_intr_rxdata_task;
 
+   struct delayed_work interrupt_watch_task;
+   bool interrupt_watch_enable;
+
struct fjes_hw hw;
 };
 
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 4a4ce81..5fce33d 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -71,7 +71,7 @@ static int fjes_remove(struct platform_device *);
 
 static int fjes_sw_init(struct fjes_adapter *);
 static void fjes_netdev_setup(struct net_device *);
-
+static void fjes_irq_watch_task(struct work_struct *);
 static void fjes_rx_irq(struct fjes_adapter *, int);
 static int fjes_poll(struct napi_struct *, int);
 
@@ -197,6 +197,13 @@ static int fjes_request_irq(struct fjes_adapter *adapter)
 	struct net_device *netdev = adapter->netdev;
 	int result = -1;
 
+	adapter->interrupt_watch_enable = true;
+	if (!delayed_work_pending(&adapter->interrupt_watch_task)) {
+		queue_delayed_work(adapter->control_wq,
+				   &adapter->interrupt_watch_task,
+				   FJES_IRQ_WATCH_DELAY);
+	}
+
 	if (!adapter->irq_registered) {
 		result = request_irq(adapter->hw.hw_res.irq, fjes_intr,
 				     IRQF_SHARED, netdev->name, adapter);
@@ -213,6 +220,9 @@ static void fjes_free_irq(struct fjes_adapter *adapter)
 {
 	struct fjes_hw *hw = &adapter->hw;
 
+	adapter->interrupt_watch_enable = false;
+	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
+
 	fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true);
 
 	if (adapter->irq_registered) {
@@ -297,6 +307,7 @@ static int fjes_close(struct net_device *netdev)
 
 	fjes_free_irq(adapter);
 
+	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);
 
@@ -996,11 +1007,15 @@ static int fjes_probe(struct platform_device *plat_dev)
 	adapter->open_guard = false;
 
 	adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
+	adapter->control_wq = create_workqueue(DRV_NAME "/control");
 
 	INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
 	INIT_WORK(&adapter->raise_intr_rxdata_task,
 		  fjes_raise_intr_rxdata_task);
 
+	INIT_DELAYED_WORK(&adapter->interrupt_watch_task, fjes_irq_watch_task);
+	adapter->interrupt_watch_enable = false;
+
 	res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
 	hw->hw_res.start = res->start;
 	hw->hw_res.size = res->end - res->start + 1;
@@ -1040,8 +1055,11 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_adapter *adapter = netdev_priv(netdev);
 	struct fjes_hw *hw = &adapter->hw;
 
+	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);
+	if (adapter->control_wq)
+		destroy_workqueue(adapter->control_wq);
 	if (adapter->txrx_wq)
 		destroy_workqueue(adapter->txrx_wq);
 
@@ -1077,6 +1095,26 @@ static void fjes_netdev_setup(struct net_device *netdev)
 	netdev->features |= NETIF_F_HW_CSUM | NETIF_F_HW_VLAN_CTAG_FILTER;
 }
 
+static void fjes_irq_watch_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter = container_of(to_delayed_work(work),
+			struct fjes_adapter, interrupt_watch_task);
+
+	local_irq_disable();
+	fjes_intr(adapter->hw.hw_res.irq, adapter);
+	local_irq_enable();
+
+	if (fjes_rxframe_search_exist(adapter, 0) >= 0)
+		napi_schedule(&adapter->napi);
+
+	if (adapter->interrupt_watch_enable) {
+		if (!delayed_work_pending(&adapter->interrupt_watch_task))
+			queue_delayed_work(adapter->control_wq,
+					   &adapter->interrupt_watch_task,
+

[PATCH v3 18/22] fjes: unshare_watch_task

2015-08-21 Thread Taku Izumi

This patch adds unshare_watch_task.
A shared buffer's status can change to unshared.
This task monitors the status of the shared buffers.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  |   3 ++
 drivers/net/fjes/fjes_main.c | 126 +++
 2 files changed, 129 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index d31d4c3..57feee8 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -59,6 +59,9 @@ struct fjes_adapter {
struct work_struct tx_stall_task;
struct work_struct raise_intr_rxdata_task;
 
+   struct work_struct unshare_watch_task;
+   unsigned long unshare_watch_bitmask;
+
struct delayed_work interrupt_watch_task;
bool interrupt_watch_enable;
 
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index caecfb3..c47ecf3 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -73,6 +73,7 @@ static int fjes_remove(struct platform_device *);
 static int fjes_sw_init(struct fjes_adapter *);
 static void fjes_netdev_setup(struct net_device *);
 static void fjes_irq_watch_task(struct work_struct *);
+static void fjes_watch_unshare_task(struct work_struct *);
 static void fjes_rx_irq(struct fjes_adapter *, int);
 static int fjes_poll(struct napi_struct *, int);
 
@@ -309,6 +310,8 @@ static int fjes_close(struct net_device *netdev)
fjes_free_irq(adapter);
 
 	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
+	cancel_work_sync(&adapter->unshare_watch_task);
+	adapter->unshare_watch_bitmask = 0;
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);
 
@@ -1025,6 +1028,8 @@ static int fjes_probe(struct platform_device *plat_dev)
 	INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
 	INIT_WORK(&adapter->raise_intr_rxdata_task,
 		  fjes_raise_intr_rxdata_task);
+	INIT_WORK(&adapter->unshare_watch_task, fjes_watch_unshare_task);
+	adapter->unshare_watch_bitmask = 0;
 
 	INIT_DELAYED_WORK(&adapter->interrupt_watch_task, fjes_irq_watch_task);
 	adapter->interrupt_watch_enable = false;
@@ -1069,6 +1074,7 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_hw *hw = &adapter->hw;
 
 	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
+	cancel_work_sync(&adapter->unshare_watch_task);
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);
 	if (adapter->control_wq)
@@ -1128,6 +1134,126 @@ static void fjes_irq_watch_task(struct work_struct *work)
}
 }
 
+static void fjes_watch_unshare_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter =
+		container_of(work, struct fjes_adapter, unshare_watch_task);
+
+	struct net_device *netdev = adapter->netdev;
+	struct fjes_hw *hw = &adapter->hw;
+
+	int unshare_watch, unshare_reserve;
+	int max_epid, my_epid, epidx;
+	int stop_req, stop_req_done;
+	ulong unshare_watch_bitmask;
+	int wait_time = 0;
+	int is_shared;
+	int ret;
+
+	my_epid = hw->my_epid;
+	max_epid = hw->max_epid;
+
+	unshare_watch_bitmask = adapter->unshare_watch_bitmask;
+	adapter->unshare_watch_bitmask = 0;
+
+	while ((unshare_watch_bitmask || hw->txrx_stop_req_bit) &&
+	       (wait_time < 3000)) {
+		for (epidx = 0; epidx < hw->max_epid; epidx++) {
+			if (epidx == hw->my_epid)
+				continue;
+
+			is_shared = fjes_hw_epid_is_shared(hw->hw_info.share,
+							   epidx);
+
+			stop_req = test_bit(epidx, &hw->txrx_stop_req_bit);
+
+			stop_req_done = hw->ep_shm_info[epidx].rx.info->v1i.rx_status &
+					FJES_RX_STOP_REQ_DONE;
+
+			unshare_watch = test_bit(epidx, &unshare_watch_bitmask);
+
+			unshare_reserve = test_bit(epidx,
+						   &hw->hw_info.buffer_unshare_reserve_bit);
+
+			if ((!stop_req ||
+			     (is_shared && (!is_shared || !stop_req_done))) &&
+			    (is_shared || !unshare_watch || !unshare_reserve))
+				continue;
+
+			mutex_lock(&hw->hw_info.lock);
+			ret = fjes_hw_unregister_buff_addr(hw, epidx);
+			switch (ret) {
+			case 0:
+				break;
+			case -ENOMSG:
+			case -EBUSY:
+			default:
+				if (!work_pending(
+					&adapter->force_close_task)) {
+
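
The "continue" condition in the loop above is dense, so here is a small stand-alone check of one way to read it. This is a reader's sketch, not part of the patch: the function names are hypothetical, and the inputs are plain booleans rather than the bit tests the driver performs. It compares the condition as written with a reduced form (the inner !is_shared term is redundant once is_shared is known to be true) and with its negation, which states when an endpoint's buffer is actually unregistered.

```c
#include <assert.h>
#include <stdbool.h>

/* The "skip this endpoint" condition as written in the patch. */
static bool skip_as_written(bool stop_req, bool is_shared, bool stop_req_done,
			    bool unshare_watch, bool unshare_reserve)
{
	return (!stop_req ||
		(is_shared && (!is_shared || !stop_req_done))) &&
	       (is_shared || !unshare_watch || !unshare_reserve);
}

/* Same condition with the redundant !is_shared term dropped. */
static bool skip_reduced(bool stop_req, bool is_shared, bool stop_req_done,
			 bool unshare_watch, bool unshare_reserve)
{
	return (!stop_req || (is_shared && !stop_req_done)) &&
	       (is_shared || !unshare_watch || !unshare_reserve);
}

/* Negated form: the cases in which the buffer gets unregistered. */
static bool should_unregister(bool stop_req, bool is_shared, bool stop_req_done,
			      bool unshare_watch, bool unshare_reserve)
{
	return (stop_req && (!is_shared || stop_req_done)) ||
	       (!is_shared && unshare_watch && unshare_reserve);
}

/* Compare all three forms over every one of the 32 input combinations. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (unsigned int bits = 0; bits < 32; bits++) {
		bool a = bits & 1, b = bits & 2, c = bits & 4;
		bool d = bits & 8, e = bits & 16;
		bool skip = skip_as_written(a, b, c, d, e);

		if (skip != skip_reduced(a, b, c, d, e))
			mismatches++;
		if (skip == should_unregister(a, b, c, d, e))
			mismatches++;
	}
	return mismatches;
}

/* Usage example: the three forms agree on every input. */
static void unshare_condition_demo(void)
{
	assert(count_mismatches() == 0);
}
```

Read this way, an endpoint is unregistered either when a stop was requested and the buffer is no longer shared or the peer has reported completion, or when the buffer is unshared and both the watch and reserve bits are set.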

[PATCH v3 04/22] fjes: platform_driver's .probe and .remove routine

2015-08-21 Thread Taku Izumi

This patch implements the platform_driver's .probe and .remove
routines, and also adds the board-specific private data structure.

The driver registers its net_device in the platform_driver's .probe
routine and unregisters it in the .remove routine.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  | 25 
 drivers/net/fjes/fjes_main.c | 94 
 2 files changed, 119 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 15ded96..54bc189 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -24,7 +24,32 @@
 
 #include <linux/acpi.h>
 
+#include "fjes_hw.h"
+
 #define FJES_ACPI_SYMBOL	"Extended Socket"
+#define FJES_MAX_QUEUES		1
+#define FJES_TX_RETRY_INTERVAL	(20 * HZ)
+
+/* board specific private data structure */
+struct fjes_adapter {
+   struct net_device *netdev;
+   struct platform_device *plat_dev;
+
+   struct napi_struct napi;
+   struct rtnl_link_stats64 stats64;
+
+   unsigned int tx_retry_count;
+   unsigned long tx_start_jiffies;
+   unsigned long rx_last_jiffies;
+   bool unset_rx_last;
+
+   bool force_reset;
+   bool open_guard;
+
+   bool irq_registered;
+
+   struct fjes_hw hw;
+};
 
 extern char fjes_driver_name[];
 extern char fjes_driver_version[];
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 9517666..45a8b9c 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -23,6 +23,7 @@
 #include <linux/types.h>
 #include <linux/nls.h>
 #include <linux/platform_device.h>
+#include <linux/netdevice.h>
 
 #include "fjes.h"
 
@@ -49,6 +50,9 @@ static acpi_status fjes_get_acpi_resource(struct acpi_resource *, void*);
 static int fjes_probe(struct platform_device *);
 static int fjes_remove(struct platform_device *);
 
+static int fjes_sw_init(struct fjes_adapter *);
+static void fjes_netdev_setup(struct net_device *);
+
 static const struct acpi_device_id fjes_acpi_ids[] = {
 	{"PNP0C02", 0},
 	{"", 0},
@@ -166,18 +170,108 @@ fjes_get_acpi_resource(struct acpi_resource *acpi_res, void *data)
 	return AE_OK;
 }
 
+static const struct net_device_ops fjes_netdev_ops = {
+};
+
 /* fjes_probe - Device Initialization Routine */
 static int fjes_probe(struct platform_device *plat_dev)
 {
+	struct fjes_adapter *adapter;
+	struct net_device *netdev;
+	struct resource *res;
+	struct fjes_hw *hw;
+	int err;
+
+	err = -ENOMEM;
+	netdev = alloc_netdev_mq(sizeof(struct fjes_adapter), "es%d",
+				 NET_NAME_UNKNOWN, fjes_netdev_setup,
+				 FJES_MAX_QUEUES);
+
+	if (!netdev)
+		goto err_out;
+
+	SET_NETDEV_DEV(netdev, &plat_dev->dev);
+
+	dev_set_drvdata(&plat_dev->dev, netdev);
+	adapter = netdev_priv(netdev);
+	adapter->netdev = netdev;
+	adapter->plat_dev = plat_dev;
+	hw = &adapter->hw;
+	hw->back = adapter;
+
+	/* setup the private structure */
+	err = fjes_sw_init(adapter);
+	if (err)
+		goto err_free_netdev;
+
+	adapter->force_reset = false;
+	adapter->open_guard = false;
+
+	res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
+	hw->hw_res.start = res->start;
+	hw->hw_res.size = res->end - res->start + 1;
+	hw->hw_res.irq = platform_get_irq(plat_dev, 0);
+	err = fjes_hw_init(&adapter->hw);
+	if (err)
+		goto err_free_netdev;
+
+	/* setup MAC address (02:00:00:00:00:[epid])*/
+	netdev->dev_addr[0] = 2;
+	netdev->dev_addr[1] = 0;
+	netdev->dev_addr[2] = 0;
+	netdev->dev_addr[3] = 0;
+	netdev->dev_addr[4] = 0;
+	netdev->dev_addr[5] = hw->my_epid; /* EPID */
+
+	err = register_netdev(netdev);
+	if (err)
+		goto err_hw_exit;
+
+	netif_carrier_off(netdev);
+
 	return 0;
+
+err_hw_exit:
+	fjes_hw_exit(&adapter->hw);
+err_free_netdev:
+	free_netdev(netdev);
+err_out:
+	return err;
 }
 
 /* fjes_remove - Device Removal Routine */
 static int fjes_remove(struct platform_device *plat_dev)
 {
+	struct net_device *netdev = dev_get_drvdata(&plat_dev->dev);
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+	struct fjes_hw *hw = &adapter->hw;
+
+	unregister_netdev(netdev);
+
+	fjes_hw_exit(hw);
+
+	free_netdev(netdev);
+
 	return 0;
 }
 
+static int fjes_sw_init(struct fjes_adapter *adapter)
+{
+   return 0;
+}
+
+/* fjes_netdev_setup - netdevice initialization routine */
+static void fjes_netdev_setup(struct net_device *netdev)
+{
+	ether_setup(netdev);
+
+	netdev->watchdog_timeo = FJES_TX_RETRY_INTERVAL;
+	netdev->netdev_ops = &fjes_netdev_ops;
+	netdev->mtu = fjes_support_mtu[0];
+	netdev->flags |= IFF_BROADCAST;
+	netdev->features |= NETIF_F_HW_CSUM |
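
The MAC address layout that fjes_probe sets up above (02:00:00:00:00:[epid]) can be illustrated in user space. The helper names below are hypothetical and not part of the driver; the only facts relied on are the byte layout from the patch and the standard meaning of the two low bits of the first octet (0x01 marks a multicast address, 0x02 marks a locally administered one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAC_LEN 6

/* Fill mac with 02:00:00:00:00:<epid>, the layout used in fjes_probe. */
static void epid_to_mac(uint8_t mac[MAC_LEN], uint8_t epid)
{
	memset(mac, 0, MAC_LEN);
	mac[0] = 0x02;	/* locally administered, unicast */
	mac[5] = epid;
}

/* Bit 1 of the first octet marks a locally administered address. */
static bool mac_is_locally_administered(const uint8_t mac[MAC_LEN])
{
	return (mac[0] & 0x02) != 0;
}

/* Bit 0 of the first octet marks a multicast (group) address. */
static bool mac_is_multicast(const uint8_t mac[MAC_LEN])
{
	return (mac[0] & 0x01) != 0;
}

/* Format as lower-case colon-separated hex; out must hold 18 bytes. */
static void mac_to_string(const uint8_t mac[MAC_LEN], char out[18])
{
	snprintf(out, 18, "%02x:%02x:%02x:%02x:%02x:%02x",
		 mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}

/* Usage example: endpoint 3 gets a locally administered unicast address. */
static void mac_demo(void)
{
	uint8_t mac[MAC_LEN];
	char text[18];

	epid_to_mac(mac, 3);
	mac_to_string(mac, text);
	assert(strcmp(text, "02:00:00:00:00:03") == 0);
	assert(mac_is_locally_administered(mac));
}
```

Choosing 0x02 as the first octet keeps the generated addresses out of the vendor-assigned space, and the endpoint ID in the last octet makes each address unique among the endpoints.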

[PATCH v3 07/22] fjes: net_device_ops.ndo_open and .ndo_stop

2015-08-21 Thread Taku Izumi

This patch adds the net_device_ops.ndo_open and .ndo_stop
callbacks. These functions are called when the network device
is activated and deactivated, respectively.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  |   1 +
 drivers/net/fjes/fjes_hw.c   | 145 +
 drivers/net/fjes/fjes_hw.h   |  30 ++
 drivers/net/fjes/fjes_main.c | 246 +++
 drivers/net/fjes/fjes_regs.h |  17 +++
 5 files changed, 439 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 54bc189..f182ed3 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -29,6 +29,7 @@
 #define FJES_ACPI_SYMBOL	"Extended Socket"
 #define FJES_MAX_QUEUES		1
 #define FJES_TX_RETRY_INTERVAL	(20 * HZ)
+#define FJES_OPEN_ZONE_UPDATE_WAIT	(300) /* msec */
 
 /* board specific private data structure */
 struct fjes_adapter {
diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 1e807df..1935f48 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -638,6 +638,25 @@ int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid)
 	return result;
 }
 
+int fjes_hw_raise_interrupt(struct fjes_hw *hw, int dest_epid,
+   enum REG_ICTL_MASK  mask)
+{
+   u32 ig = mask | dest_epid;
+
+   wr32(XSCT_IG, cpu_to_le32(ig));
+
+   return 0;
+}
+
+u32 fjes_hw_capture_interrupt_status(struct fjes_hw *hw)
+{
+   u32 cur_is;
+
+   cur_is = rd32(XSCT_IS);
+
+   return cur_is;
+}
+
 void fjes_hw_set_irqmask(struct fjes_hw *hw,
 enum REG_ICTL_MASK intr_mask, bool mask)
 {
@@ -646,3 +665,129 @@ void fjes_hw_set_irqmask(struct fjes_hw *hw,
else
wr32(XSCT_IMC, intr_mask);
 }
+
+bool fjes_hw_epid_is_same_zone(struct fjes_hw *hw, int epid)
+{
+	if (epid >= hw->max_epid)
+		return false;
+
+	if ((hw->ep_shm_info[epid].es_status !=
+			FJES_ZONING_STATUS_ENABLE) ||
+		(hw->ep_shm_info[hw->my_epid].zone ==
+			FJES_ZONING_ZONE_TYPE_NONE))
+		return false;
+	else
+		return (hw->ep_shm_info[epid].zone ==
+				hw->ep_shm_info[hw->my_epid].zone);
+}
+
+int fjes_hw_epid_is_shared(struct fjes_device_shared_info *share,
+			   int dest_epid)
+{
+	int value = false;
+
+	if (dest_epid < share->epnum)
+		value = share->ep_status[dest_epid];
+
+	return value;
+}
+
+static bool fjes_hw_epid_is_stop_requested(struct fjes_hw *hw, int src_epid)
+{
+	return test_bit(src_epid, &hw->txrx_stop_req_bit);
+}
+
+static bool fjes_hw_epid_is_stop_process_done(struct fjes_hw *hw, int src_epid)
+{
+	return (hw->ep_shm_info[src_epid].tx.info->v1i.rx_status &
+			FJES_RX_STOP_REQ_DONE);
+}
+
+enum ep_partner_status
+fjes_hw_get_partner_ep_status(struct fjes_hw *hw, int epid)
+{
+	enum ep_partner_status status;
+
+	if (fjes_hw_epid_is_shared(hw->hw_info.share, epid)) {
+		if (fjes_hw_epid_is_stop_requested(hw, epid)) {
+			status = EP_PARTNER_WAITING;
+		} else {
+			if (fjes_hw_epid_is_stop_process_done(hw, epid))
+				status = EP_PARTNER_COMPLETE;
+			else
+				status = EP_PARTNER_SHARED;
+		}
+	} else {
+		status = EP_PARTNER_UNSHARE;
+	}
+
+	return status;
+}
+
+void fjes_hw_raise_epstop(struct fjes_hw *hw)
+{
+	enum ep_partner_status status;
+	int epidx;
+
+	for (epidx = 0; epidx < hw->max_epid; epidx++) {
+		if (epidx == hw->my_epid)
+			continue;
+
+		status = fjes_hw_get_partner_ep_status(hw, epidx);
+		switch (status) {
+		case EP_PARTNER_SHARED:
+			fjes_hw_raise_interrupt(hw, epidx,
+						REG_ICTL_MASK_TXRX_STOP_REQ);
+			break;
+		default:
+			break;
+		}
+
+		set_bit(epidx, &hw->hw_info.buffer_unshare_reserve_bit);
+		set_bit(epidx, &hw->txrx_stop_req_bit);
+
+		hw->ep_shm_info[epidx].tx.info->v1i.rx_status |=
+				FJES_RX_STOP_REQ_REQUEST;
+	}
+}
+
+int fjes_hw_wait_epstop(struct fjes_hw *hw)
+{
+	enum ep_partner_status status;
+	union ep_buffer_info *info;
+	int wait_time = 0;
+	int epidx;
+
+	while (hw->hw_info.buffer_unshare_reserve_bit &&
+	       (wait_time < FJES_COMMAND_EPSTOP_WAIT_TIMEOUT * 1000)) {
+		for (epidx = 0; epidx < hw->max_epid; epidx++) {
+			if (epidx == hw->my_epid)
+				continue;
+			status =
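
The decision made by fjes_hw_get_partner_ep_status above can be written as a pure function of three flags, which makes the priority of the checks easier to see. This is a sketch under stated assumptions: the enumerator names follow the patch, but their numeric values and the partner_status helper are hypothetical, and the flags are passed in directly instead of being read from shared memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Names follow the patch; the numeric values here are an assumption. */
enum ep_partner_status {
	EP_PARTNER_UNSHARE,
	EP_PARTNER_SHARED,
	EP_PARTNER_WAITING,
	EP_PARTNER_COMPLETE,
};

/*
 * Priority of the checks: not shared wins over everything, then a
 * pending stop request, then the "stop done" flag.
 */
static enum ep_partner_status
partner_status(bool is_shared, bool stop_requested, bool stop_done)
{
	if (!is_shared)
		return EP_PARTNER_UNSHARE;
	if (stop_requested)
		return EP_PARTNER_WAITING;
	return stop_done ? EP_PARTNER_COMPLETE : EP_PARTNER_SHARED;
}

/* Usage example: a shared endpoint with no stop activity is plain SHARED. */
static void partner_status_demo(void)
{
	assert(partner_status(true, false, false) == EP_PARTNER_SHARED);
	assert(partner_status(false, true, true) == EP_PARTNER_UNSHARE);
}
```

Note that a pending stop request masks the done flag: while the request bit is set the endpoint is reported as waiting even if the done flag is already raised.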

[PATCH v3 03/22] fjes: Hardware cleanup routine

2015-08-21 Thread Taku Izumi

This patch adds a hardware cleanup routine that is
invoked from the driver's .remove routine.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c | 66 ++
 drivers/net/fjes/fjes_hw.h |  1 +
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index ae26638..757cece 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -56,6 +56,12 @@ static u8 *fjes_hw_iomap(struct fjes_hw *hw)
return base;
 }
 
+static void fjes_hw_iounmap(struct fjes_hw *hw)
+{
+	iounmap(hw->base);
+	release_mem_region(hw->hw_res.start, hw->hw_res.size);
+}
+
 int fjes_hw_reset(struct fjes_hw *hw)
 {
union REG_DCTL dctl;
@@ -109,6 +115,12 @@ static int fjes_hw_alloc_shared_status_region(struct fjes_hw *hw)
 	return 0;
 }
 
+static void fjes_hw_free_shared_status_region(struct fjes_hw *hw)
+{
+	kfree(hw->hw_info.share);
+	hw->hw_info.share = NULL;
+}
+
 static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh)
 {
void *mem;
@@ -126,6 +138,18 @@ static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh)
return 0;
 }
 
+static void fjes_hw_free_epbuf(struct epbuf_handler *epbh)
+{
+	if (epbh->buffer)
+		vfree(epbh->buffer);
+
+	epbh->buffer = NULL;
+	epbh->size = 0;
+
+	epbh->info = NULL;
+	epbh->ring = NULL;
+}
+
 void fjes_hw_setup_epbuf(struct epbuf_handler *epbh, u8 *mac_addr, u32 mtu)
 {
 	union ep_buffer_info *info = epbh->info;
@@ -258,6 +282,32 @@ static int fjes_hw_setup(struct fjes_hw *hw)
return 0;
 }
 
+static void fjes_hw_cleanup(struct fjes_hw *hw)
+{
+	int epidx;
+
+	if (!hw->ep_shm_info)
+		return;
+
+	fjes_hw_free_shared_status_region(hw);
+
+	kfree(hw->hw_info.req_buf);
+	hw->hw_info.req_buf = NULL;
+
+	kfree(hw->hw_info.res_buf);
+	hw->hw_info.res_buf = NULL;
+
+	for (epidx = 0; epidx < hw->max_epid ; epidx++) {
+		if (epidx == hw->my_epid)
+			continue;
+		fjes_hw_free_epbuf(&hw->ep_shm_info[epidx].tx);
+		fjes_hw_free_epbuf(&hw->ep_shm_info[epidx].rx);
+	}
+
+	kfree(hw->ep_shm_info);
+	hw->ep_shm_info = NULL;
+}
+
 int fjes_hw_init(struct fjes_hw *hw)
 {
int ret;
@@ -285,6 +335,22 @@ int fjes_hw_init(struct fjes_hw *hw)
return ret;
 }
 
+void fjes_hw_exit(struct fjes_hw *hw)
+{
+	int ret;
+
+	if (hw->base) {
+		ret = fjes_hw_reset(hw);
+		if (ret)
+			pr_err("%s: reset error", __func__);
+
+		fjes_hw_iounmap(hw);
+		hw->base = NULL;
+	}
+
+	fjes_hw_cleanup(hw);
+}
+
 void fjes_hw_set_irqmask(struct fjes_hw *hw,
 enum REG_ICTL_MASK intr_mask, bool mask)
 {
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 836ebe2..1b3e9ca 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -241,6 +241,7 @@ struct fjes_hw {
 };
 
 int fjes_hw_init(struct fjes_hw *);
+void fjes_hw_exit(struct fjes_hw *);
 int fjes_hw_reset(struct fjes_hw *);
 
 void fjes_hw_init_command_registers(struct fjes_hw *,
-- 
1.8.3.1


[PATCH v3 20/22] fjes: epstop_task

2015-08-21 Thread Taku Izumi

This patch adds epstop_task.
This task processes cancellation requests from
other receivers.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c   | 31 +++
 drivers/net/fjes/fjes_hw.h   |  1 +
 drivers/net/fjes/fjes_main.c |  1 +
 3 files changed, 33 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 4525d36..b5f4a78 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -23,6 +23,7 @@
 #include "fjes.h"
 
 static void fjes_hw_update_zone_task(struct work_struct *);
+static void fjes_hw_epstop_task(struct work_struct *);
 
 /* supported MTU list */
 const u32 fjes_support_mtu[] = {
@@ -325,6 +326,7 @@ int fjes_hw_init(struct fjes_hw *hw)
 	fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true);
 
 	INIT_WORK(&hw->update_zone_task, fjes_hw_update_zone_task);
+	INIT_WORK(&hw->epstop_task, fjes_hw_epstop_task);
 
 	mutex_init(&hw->hw_info.lock);
 
@@ -355,6 +357,7 @@ void fjes_hw_exit(struct fjes_hw *hw)
 	fjes_hw_cleanup(hw);
 
 	cancel_work_sync(&hw->update_zone_task);
+	cancel_work_sync(&hw->epstop_task);
 }
 
 static enum fjes_dev_command_response_e
@@ -1092,3 +1095,31 @@ static void fjes_hw_update_zone_task(struct work_struct *work)
 				   &adapter->unshare_watch_task);
 	}
 }
+
+static void fjes_hw_epstop_task(struct work_struct *work)
+{
+	struct fjes_hw *hw = container_of(work, struct fjes_hw, epstop_task);
+	struct fjes_adapter *adapter = (struct fjes_adapter *)hw->back;
+
+	ulong remain_bit;
+	int epid_bit;
+
+	while ((remain_bit = hw->epstop_req_bit)) {
+		for (epid_bit = 0; remain_bit; remain_bit >>= 1, epid_bit++) {
+			if (remain_bit & 1) {
+				hw->ep_shm_info[epid_bit].
+					tx.info->v1i.rx_status |=
+						FJES_RX_STOP_REQ_DONE;
+
+				clear_bit(epid_bit, &hw->epstop_req_bit);
+				set_bit(epid_bit,
+					&adapter->unshare_watch_bitmask);
+
+				if (!work_pending(&adapter->unshare_watch_task))
+					queue_work(
+						adapter->control_wq,
+						&adapter->unshare_watch_task);
+			}
+		}
+	}
+}
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index e59b737..6d57b89 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -283,6 +283,7 @@ struct fjes_hw {
unsigned long txrx_stop_req_bit;
unsigned long epstop_req_bit;
struct work_struct update_zone_task;
+   struct work_struct epstop_task;
 
int my_epid;
int max_epid;
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 8e3a084..5e77d0c 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -316,6 +316,7 @@ static int fjes_close(struct net_device *netdev)
 	cancel_work_sync(&adapter->tx_stall_task);
 
 	cancel_work_sync(&hw->update_zone_task);
+	cancel_work_sync(&hw->epstop_task);
 
 	fjes_hw_wait_epstop(hw);
 
-- 
1.8.3.1
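
The inner loop of fjes_hw_epstop_task walks a snapshot of the request mask by shifting it right and testing the lowest bit. A stand-alone sketch of that idiom follows; collect_set_bits is a hypothetical helper written for illustration, and it records the bit indexes instead of acting on endpoint buffers. (Inside the kernel, the for_each_set_bit() macro is a common alternative for the same kind of walk.)

```c
#include <assert.h>

/*
 * Store the index of each set bit of mask in out[], lowest bit first.
 * At most max indexes are stored; the number stored is returned.
 */
static int collect_set_bits(unsigned long mask, int *out, int max)
{
	int n = 0;

	/* The loop ends as soon as no set bits remain in the shifted mask. */
	for (int bit = 0; mask && n < max; mask >>= 1, bit++) {
		if (mask & 1)
			out[n++] = bit;
	}
	return n;
}

/* Usage example: 0x15 is binary 10101, so bits 0, 2 and 4 are set. */
static void collect_set_bits_demo(void)
{
	int bits[8];
	int n = collect_set_bits(0x15UL, bits, 8);

	assert(n == 3);
	assert(bits[0] == 0 && bits[1] == 2 && bits[2] == 4);
}
```

Working on a copy of the mask, as the patch does with remain_bit, lets the loop clear bits in the original with clear_bit() without disturbing the iteration; the outer while loop then re-reads the mask to pick up requests that arrived in the meantime.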


Re: RX packet loss on i.MX6Q running 4.2-rc7

2015-08-21 Thread Clemens Gruber

On Fri, Aug 21, 2015 at 06:49:20AM +0200, Jon Nettleton wrote:
> On Fri, Aug 21, 2015 at 12:30 AM, Clemens Gruber
> clemens.gru...@pqgruber.com wrote:
> > Hi,
> >
> > I am experiencing massive RX packet loss on my i.MX6Q (Chip rev 1.3) on Linux
> > 4.2-rc7 with a Marvell 88E1510 Gigabit Ethernet PHY connected over RGMII.
> > I noticed it when doing an UDP benchmark with iperf3. When sending UDP packets
> > from a Debian PC to the i.MX6 with a rate of 100 Mbit/s, 99% of the packets are
> > lost. With a rate of 10 Mbit/s, we are still losing 93% of all packets. TCP RX
> > does suffer from packet loss too, but still achieves about 211 Mbit/s.
> > TX is not affected.
> >
> > Steps to reproduce:
> > On the i.MX6: iperf3 -s
> > On a desktop PC:  iperf3 -b 10M -u -c MX6IP
> >
> > The iperf3 results:
> > [ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total
> > [  4]   0.00-10.00  sec  11.8 MBytes  9.90 Mbits/sec  0.687 ms  1397/1497 (93%)
> >
> > During the 10 Mbit UDP test, the IEEE_rx_macerr counter increased to 5371.
> > ifconfig eth0 shows:
> >  RX packets:9216 errors:5248 dropped:170 overruns:5248 frame:5248
> >  TX packets:83 errors:0 dropped:0 overruns:0 carrier:0
> >  collisions:0
> >
> > Here are the TCP results with iperf3 -c MX6IP:
> > [ ID] Interval           Transfer     Bandwidth       Retr
> > [  4]   0.00-10.00  sec   252 MBytes   211 Mbits/sec  4343             sender
> > [  4]   0.00-10.00  sec   251 MBytes   211 Mbits/sec                  receiver
> >
> > During the TCP test, IEEE_rx_macerr increased to 4059.
> > ifconfig eth0 shows:
> >  RX packets:186368 errors:4206 dropped:50 overruns:4206 frame:4206
> >  TX packets:41861 errors:0 dropped:0 overruns:0 carrier:0
> >  collisions:0
> >
> > Freescale errata entry ERR004512 did mention a RX FIFO overrun. Is this
> > related?
> >
> > Forcing pause frames via ethtool -A eth0 rx on tx on, does not improve it:
> > Same amount of UDP packet loss with reduced TCP throughput of 190 Mbit/s.
> > IEEE_rx_macerr increased up to 5232 during UDP 10Mbit and up to 4270 for TCP.
> >
> > I am already using the MX6QDL_PAD_GPIO_6__ENET_IRQ workaround, which solved the
> > ping latency issues from ERR006687 but not the packet loss problem.
> >
> > I read through the mailing list archives and found a discussion between Russell
> > King, Marek Vasut, Eric Nelson, Fugang Duan and others about a similar problem.
> > I therefore added you and contributors to fec_main.c to the CC.
> >
> > One suggestion I found, was adding udelay(210); to fec_enet_rx():
> > https://lkml.org/lkml/2014/8/22/88
> > But this also did not reduce the packet loss. (I added it to the fec_enet_rx
> > function just before return pkt_received; but I still got 93% packet loss)
> >
> > Does anyone have the equipment/setup to trace an i.MX6Q during UDP RX traffic
> > from iperf3 to find the root cause of this packet loss problem?
> >
> > What else could we do to fix this?
> >
>
> This is a bug in iperf3's UDP tests.  Do the same test with iperf2 and
> you will see expected performance.  I believe there is a bug open in
> github about it.
>
> -Jon

Thank you, Jon.
You are right: With iperf2 I get the following results:
10 Mbit/s: 0% packet loss
50 Mbit/s: 0.045% packet loss
100 Mbit/s: 0.31% packet loss
200 Mbit/s: 0.64% packet loss

Much better! :)

Cheers,
Clemens

[PATCH v3 01/22] fjes: Introduce FUJITSU Extended Socket Network Device driver

2015-08-21 Thread Taku Izumi

This patch adds the basic code of the FUJITSU Extended Socket
Network Device driver.

When PNP0C02 is found in the ACPI DSDT, the driver evaluates _STR
to check whether the PNP0C02 device is an Extended Socket device,
retrieves its ACPI resource information, and then creates a
platform_device.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/Kconfig  |   7 ++
 drivers/net/Makefile |   2 +
 drivers/net/fjes/Makefile|  30 ++
 drivers/net/fjes/fjes.h  |  32 +++
 drivers/net/fjes/fjes_main.c | 213 +++
 5 files changed, 284 insertions(+)
 create mode 100644 drivers/net/fjes/Makefile
 create mode 100644 drivers/net/fjes/fjes.h
 create mode 100644 drivers/net/fjes/fjes_main.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index c18f9e6..c78a81a 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -407,6 +407,13 @@ config VMXNET3
  To compile this driver as a module, choose M here: the
  module will be called vmxnet3.
 
+config FUJITSU_ES
+	tristate "FUJITSU Extended Socket Network Device driver"
+	depends on ACPI
+	help
+	  This driver provides support for Extended Socket network device
+	  on Extended Partitioning of FUJITSU PRIMEQUEST 2000 E2 series.
+
 source "drivers/net/hyperv/Kconfig"
 
 endif # NETDEVICES
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index c12cb22..677c7b4 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -67,3 +67,5 @@ obj-$(CONFIG_USB_NET_DRIVERS) += usb/
 
 obj-$(CONFIG_HYPERV_NET) += hyperv/
 obj-$(CONFIG_NTB_NETDEV) += ntb_netdev.o
+
+obj-$(CONFIG_FUJITSU_ES) += fjes/
diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
new file mode 100644
index 000..34bccba
--- /dev/null
+++ b/drivers/net/fjes/Makefile
@@ -0,0 +1,30 @@
+
+#
+# FUJITSU Extended Socket Network Device driver
+# Copyright (c) 2015 FUJITSU LIMITED
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# You should have received a copy of the GNU General Public License along with
+# this program; if not, see http://www.gnu.org/licenses/.
+#
+# The full GNU General Public License is included in this distribution in
+# the file called COPYING.
+#
+
+
+
+#
+# Makefile for the FUJITSU Extended Socket network device driver
+#
+
+obj-$(CONFIG_FUJITSU_ES) += fjes.o
+
+fjes-objs := fjes_main.o
diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
new file mode 100644
index 000..52eb60b
--- /dev/null
+++ b/drivers/net/fjes/fjes.h
@@ -0,0 +1,32 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see http://www.gnu.org/licenses/.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called COPYING.
+ *
+ */
+
+#ifndef FJES_H_
+#define FJES_H_
+
+#include <linux/acpi.h>
+
+#define FJES_ACPI_SYMBOL	"Extended Socket"
+
+extern char fjes_driver_name[];
+extern char fjes_driver_version[];
+
+#endif /* FJES_H_ */
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
new file mode 100644
index 000..9517666
--- /dev/null
+++ b/drivers/net/fjes/fjes_main.c
@@ -0,0 +1,213 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see

[PATCH v3 00/22] FUJITSU Extended Socket network device driver

2015-08-21 Thread Taku Izumi

This patchset adds the FUJITSU Extended Socket network device driver.
The Extended Socket network device is a shared-memory-based high-speed
network interface between Extended Partitions of the PRIMEQUEST 2000 E2
series.

You can find more information about Extended Partitioning and Extended
Socket in the following manual:

http://globalsp.ts.fujitsu.com/dmsp/Publications/public/CA92344-0537.pdf
3.2.1 Extended Partitioning
3.2.2 Extended Socket

v2.2 - v3:
   - Fix up according to David's comment (No functional change)


Taku Izumi (22):
  fjes: Introduce FUJITSU Extended Socket Network Device driver
  fjes: Hardware initialization routine
  fjes: Hardware cleanup routine
  fjes: platform_driver's .probe and .remove routine
  fjes: ES information acquisition routine
  fjes: buffer address regist/unregistration routine
  fjes: net_device_ops.ndo_open and .ndo_stop
  fjes: net_device_ops.ndo_start_xmit
  fjes: raise_intr_rxdata_task
  fjes: tx_stall_task
  fjes: NAPI polling function
  fjes: net_device_ops.ndo_get_stats64
  fjes: net_device_ops.ndo_change_mtu
  fjes: net_device_ops.ndo_tx_timeout
  fjes: net_device_ops.ndo_vlan_rx_add/kill_vid
  fjes: interrupt_watch_task
  fjes: force_close_task
  fjes: unshare_watch_task
  fjes: update_zone_task
  fjes: epstop_task
  fjes: handle receive cancellation request interrupt
  fjes: ethtool support

 drivers/net/Kconfig |7 +
 drivers/net/Makefile|2 +
 drivers/net/fjes/Makefile   |   30 +
 drivers/net/fjes/fjes.h |   77 +++
 drivers/net/fjes/fjes_ethtool.c |  137 
 drivers/net/fjes/fjes_hw.c  | 1125 +++
 drivers/net/fjes/fjes_hw.h  |  334 ++
 drivers/net/fjes/fjes_main.c| 1383 +++
 drivers/net/fjes/fjes_regs.h|  142 
 9 files changed, 3237 insertions(+)
 create mode 100644 drivers/net/fjes/Makefile
 create mode 100644 drivers/net/fjes/fjes.h
 create mode 100644 drivers/net/fjes/fjes_ethtool.c
 create mode 100644 drivers/net/fjes/fjes_hw.c
 create mode 100644 drivers/net/fjes/fjes_hw.h
 create mode 100644 drivers/net/fjes/fjes_main.c
 create mode 100644 drivers/net/fjes/fjes_regs.h

-- 
1.8.3.1


[PATCH v3 11/22] fjes: NAPI polling function

2015-08-21 Thread Taku Izumi

This patch adds NAPI polling function and receive related work.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c   |  40 ++
 drivers/net/fjes/fjes_hw.h   |   5 ++
 drivers/net/fjes/fjes_main.c | 171 ++-
 3 files changed, 214 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 487dbc6..3c96d06 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -825,6 +825,46 @@ bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, u16 
vlan_id)
return ret;
 }
 
+bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *epbh)
+{
+	union ep_buffer_info *info = epbh->info;
+
+	if (info->v1i.count_max == 0)
+		return true;
+
+	return EP_RING_EMPTY(info->v1i.head, info->v1i.tail,
+			     info->v1i.count_max);
+}
+
+void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *epbh,
+				       size_t *psize)
+{
+	union ep_buffer_info *info = epbh->info;
+	struct esmem_frame *ring_frame;
+	void *frame;
+
+	ring_frame = (struct esmem_frame *)&(epbh->ring[EP_RING_INDEX
+					     (info->v1i.head,
+					      info->v1i.count_max) *
+					     info->v1i.frame_max]);
+
+	*psize = (size_t)ring_frame->frame_size;
+
+	frame = ring_frame->frame_data;
+
+	return frame;
+}
+
+void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *epbh)
+{
+	union ep_buffer_info *info = epbh->info;
+
+	if (fjes_hw_epbuf_rx_is_empty(epbh))
+		return;
+
+	EP_RING_INDEX_INC(epbh->info->v1i.head, info->v1i.count_max);
+}
+
 int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *epbh,
  void *frame, size_t size)
 {
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 07e1226..3511db2 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -69,6 +69,8 @@ struct fjes_hw;
((_num) = EP_RING_INDEX((_num) + 1, (_max)))
 #define EP_RING_FULL(_head, _tail, _max)   \
(0 == EP_RING_INDEX(((_tail) - (_head)), (_max)))
+#define EP_RING_EMPTY(_head, _tail, _max) \
+   (1 == EP_RING_INDEX(((_tail) - (_head)), (_max)))
 
 #define FJES_MTU_TO_BUFFER_SIZE(mtu) \
(ETH_HLEN + VLAN_HLEN + (mtu) + ETH_FCS_LEN)
@@ -320,6 +322,9 @@ int fjes_hw_epid_is_shared(struct fjes_device_shared_info 
*, int);
 bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32);
 bool fjes_hw_check_mtu(struct epbuf_handler *, u32);
 bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16);
+bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *);
+void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *, size_t *);
+void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *);
 int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *, void *, size_t);
 
 #endif /* FJES_HW_H_ */
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index ac1e076..6194962 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -66,6 +66,9 @@ static int fjes_remove(struct platform_device *);
 static int fjes_sw_init(struct fjes_adapter *);
 static void fjes_netdev_setup(struct net_device *);
 
+static void fjes_rx_irq(struct fjes_adapter *, int);
+static int fjes_poll(struct napi_struct *, int);
+
 static const struct acpi_device_id fjes_acpi_ids[] = {
 	{"PNP0C02", 0},
 	{"", 0},
@@ -235,6 +238,8 @@ static int fjes_open(struct net_device *netdev)
 	hw->txrx_stop_req_bit = 0;
 	hw->epstop_req_bit = 0;
 
+	napi_enable(&adapter->napi);
+
 	fjes_hw_capture_interrupt_status(hw);
 
 	result = fjes_request_irq(adapter);
@@ -250,6 +255,7 @@ static int fjes_open(struct net_device *netdev)
 
 err_req_irq:
 	fjes_free_irq(adapter);
+	napi_disable(&adapter->napi);
 
 err_setup_res:
 	fjes_free_resources(adapter);
@@ -268,6 +274,8 @@ static int fjes_close(struct net_device *netdev)
 
 	fjes_hw_raise_epstop(hw);
 
+	napi_disable(&adapter->napi);
+
 	for (epidx = 0; epidx < hw->max_epid; epidx++) {
 		if (epidx == hw->my_epid)
 			continue;
@@ -701,14 +709,167 @@ static irqreturn_t fjes_intr(int irq, void *data)
 
 	icr = fjes_hw_capture_interrupt_status(hw);
 
-	if (icr & REG_IS_MASK_IS_ASSERT)
+	if (icr & REG_IS_MASK_IS_ASSERT) {
+		if (icr & REG_ICTL_MASK_RX_DATA)
+			fjes_rx_irq(adapter, icr & REG_IS_MASK_EPID);
+
 		ret = IRQ_HANDLED;
-	else
+	} else {
 		ret = IRQ_NONE;
+	}
 
 	return ret;
 }
 
+static int fjes_rxframe_search_exist(struct fjes_adapter *adapter,
+				     int start_epid)
+{
+	struct fjes_hw *hw = &adapter->hw;

[PATCH v3 13/22] fjes: net_device_ops.ndo_change_mtu

2015-08-21 Thread Taku Izumi

This patch adds net_device_ops.ndo_change_mtu.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_main.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 20feb3e..519976c 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -57,6 +57,7 @@ static void fjes_tx_stall_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
+static int fjes_change_mtu(struct net_device *, int);
 
 static int fjes_acpi_add(struct acpi_device *);
 static int fjes_acpi_remove(struct acpi_device *);
@@ -222,6 +223,7 @@ static const struct net_device_ops fjes_netdev_ops = {
.ndo_stop   = fjes_close,
.ndo_start_xmit = fjes_xmit_frame,
.ndo_get_stats64= fjes_get_stats64,
+   .ndo_change_mtu = fjes_change_mtu,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -713,6 +715,33 @@ fjes_get_stats64(struct net_device *netdev, struct 
rtnl_link_stats64 *stats)
return stats;
 }
 
+static int fjes_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	bool running = netif_running(netdev);
+	int ret = 0;
+	int idx;
+
+	for (idx = 0; fjes_support_mtu[idx] != 0; idx++) {
+		if (new_mtu <= fjes_support_mtu[idx]) {
+			new_mtu = fjes_support_mtu[idx];
+			if (new_mtu == netdev->mtu)
+				return 0;
+
+			if (running)
+				fjes_close(netdev);
+
+			netdev->mtu = new_mtu;
+
+			if (running)
+				ret = fjes_open(netdev);
+
+			return ret;
+		}
+	}
+
+	return -EINVAL;
+}
+
 static irqreturn_t fjes_intr(int irq, void *data)
 {
struct fjes_adapter *adapter = data;
-- 
1.8.3.1
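
fjes_change_mtu rounds the requested MTU up to the next entry of a zero-terminated table and rejects anything larger than the last entry. A minimal sketch of just that lookup, assuming a made-up table: the values in demo_support_mtu are hypothetical and are not the driver's fjes_support_mtu contents.

```c
#include <errno.h>

/* Hypothetical table: ascending, zero-terminated, like the loop expects. */
static const int demo_support_mtu[] = { 1500, 4000, 9000, 0 };

/*
 * Return the smallest supported MTU that is >= new_mtu,
 * or -EINVAL if new_mtu exceeds every table entry.
 */
static int demo_round_mtu(int new_mtu)
{
	int idx;

	for (idx = 0; demo_support_mtu[idx] != 0; idx++) {
		if (new_mtu <= demo_support_mtu[idx])
			return demo_support_mtu[idx];
	}
	return -EINVAL;
}
```

One consequence of this design is that a caller asking for, say, 1501 silently ends up with the next larger supported size instead of an error.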


[PATCH v3 12/22] fjes: net_device_ops.ndo_get_stats64

2015-08-21 Thread Taku Izumi

This patch adds net_device_ops.ndo_get_stats64 callback.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_main.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 6194962..20feb3e 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -55,6 +55,8 @@ static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct 
net_device *);
 static void fjes_raise_intr_rxdata_task(struct work_struct *);
 static void fjes_tx_stall_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
+static struct rtnl_link_stats64 *
+fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
 
 static int fjes_acpi_add(struct acpi_device *);
 static int fjes_acpi_remove(struct acpi_device *);
@@ -219,6 +221,7 @@ static const struct net_device_ops fjes_netdev_ops = {
.ndo_open   = fjes_open,
.ndo_stop   = fjes_close,
.ndo_start_xmit = fjes_xmit_frame,
+   .ndo_get_stats64= fjes_get_stats64,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -700,6 +703,16 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device 
*netdev)
return ret;
 }
 
+static struct rtnl_link_stats64 *
+fjes_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
+{
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+
+	memcpy(stats, &adapter->stats64, sizeof(struct rtnl_link_stats64));
+
+	return stats;
+}
+
 static irqreturn_t fjes_intr(int irq, void *data)
 {
struct fjes_adapter *adapter = data;
-- 
1.8.3.1


[PATCH v3 10/22] fjes: tx_stall_task

2015-08-21 Thread Taku Izumi

This patch adds tx_stall_task.
When the receiver's buffer is full, the sender stops
its tx queue. This task monitors the receiver's
status and resumes the tx queue once the receiver's
buffer is available again.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h  |  2 ++
 drivers/net/fjes/fjes_main.c | 61 
 2 files changed, 63 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 8e9899e..b04ea9d 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -30,6 +30,7 @@
 #define FJES_MAX_QUEUES		1
 #define FJES_TX_RETRY_INTERVAL (20 * HZ)
 #define FJES_TX_RETRY_TIMEOUT  (100)
+#define FJES_TX_TX_STALL_TIMEOUT   (FJES_TX_RETRY_INTERVAL / 2)
 #define FJES_OPEN_ZONE_UPDATE_WAIT (300) /* msec */
 
 /* board specific private data structure */
@@ -52,6 +53,7 @@ struct fjes_adapter {
 
struct workqueue_struct *txrx_wq;
 
+   struct work_struct tx_stall_task;
struct work_struct raise_intr_rxdata_task;
 
struct fjes_hw hw;
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 80e180f..ac1e076 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -53,6 +53,7 @@ static int fjes_setup_resources(struct fjes_adapter *);
 static void fjes_free_resources(struct fjes_adapter *);
 static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
 static void fjes_raise_intr_rxdata_task(struct work_struct *);
+static void fjes_tx_stall_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
 
 static int fjes_acpi_add(struct acpi_device *);
@@ -278,6 +279,7 @@ static int fjes_close(struct net_device *netdev)
fjes_free_irq(adapter);
 
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
+	cancel_work_sync(&adapter->tx_stall_task);
 
fjes_hw_wait_epstop(hw);
 
@@ -407,6 +409,59 @@ static void fjes_free_resources(struct fjes_adapter 
*adapter)
}
 }
 
+static void fjes_tx_stall_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter = container_of(work,
+			struct fjes_adapter, tx_stall_task);
+	struct net_device *netdev = adapter->netdev;
+	struct fjes_hw *hw = &adapter->hw;
+	int all_queue_available, sendable;
+	enum ep_partner_status pstatus;
+	int max_epid, my_epid, epid;
+	union ep_buffer_info *info;
+	int i;
+
+	if (((long)jiffies -
+		(long)(netdev->trans_start)) > FJES_TX_TX_STALL_TIMEOUT) {
+		netif_wake_queue(netdev);
+		return;
+	}
+
+	my_epid = hw->my_epid;
+	max_epid = hw->max_epid;
+
+	for (i = 0; i < 5; i++) {
+		all_queue_available = 1;
+
+		for (epid = 0; epid < max_epid; epid++) {
+			if (my_epid == epid)
+				continue;
+
+			pstatus = fjes_hw_get_partner_ep_status(hw, epid);
+			sendable = (pstatus == EP_PARTNER_SHARED);
+			if (!sendable)
+				continue;
+
+			info = adapter->hw.ep_shm_info[epid].tx.info;
+
+			if (EP_RING_FULL(info->v1i.head, info->v1i.tail,
+					 info->v1i.count_max)) {
+				all_queue_available = 0;
+				break;
+			}
+		}
+
+		if (all_queue_available) {
+			netif_wake_queue(netdev);
+			return;
+		}
+	}
+
+	usleep_range(50, 100);
+
+	queue_work(adapter->txrx_wq, &adapter->tx_stall_task);
+}
+
 static void fjes_raise_intr_rxdata_task(struct work_struct *work)
 {
struct fjes_adapter *adapter = container_of(work,
@@ -602,6 +657,10 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 					netdev->trans_start = jiffies;
 					netif_tx_stop_queue(cur_queue);
 
+					if (!work_pending(&adapter->tx_stall_task))
+						queue_work(adapter->txrx_wq,
+							   &adapter->tx_stall_task);
+
 					ret = NETDEV_TX_BUSY;
 				}
 			} else {
@@ -686,6 +745,7 @@ static int fjes_probe(struct platform_device *plat_dev)
 
 	adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
 
+	INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
 	INIT_WORK(&adapter->raise_intr_rxdata_task,
 		  fjes_raise_intr_rxdata_task);
 
@@ -729,6 +789,7 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_hw *hw = &adapter->hw;
 
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
+
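
The stall timeout at the top of fjes_tx_stall_task compares two tick counters that are allowed to wrap. A minimal userspace sketch of a wrap-tolerant version of that comparison follows. It is not the patch's exact expression: the patch subtracts two values that were each cast to long, while this sketch subtracts as unsigned first and casts the difference, which is the pattern the kernel's time_after() helpers use. demo_timed_out() is a hypothetical name, and the final cast assumes the usual two's-complement conversion.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * True if more than `timeout` ticks have elapsed between `start` and `now`,
 * even when the tick counter wrapped around in between.
 * The unsigned subtraction is well defined modulo 2^N; the cast to long
 * assumes two's-complement conversion, as on common compilers.
 */
static bool demo_timed_out(unsigned long now, unsigned long start, long timeout)
{
	return (long)(now - start) > timeout;
}
```

For example, with start just below ULONG_MAX and now a small number, the elapsed count is still the small positive distance between them.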

Re: [lkp] [rhashtable] 9d901bc0515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:63 __ioremap_check_ram+0x6a/0x99()

2015-08-21 Thread Herbert Xu

On Fri, Aug 21, 2015 at 03:09:42PM +0800, Huang Ying wrote:
 
> Sorry, my fault.  There are OOM for parent commit too, just some dmesg
> difference, which I miss understood.  Please ignore this report.  I
> will be more careful next time.

Thanks for the confirmation.
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH] net: phy: add interrupt support for aquantia phy

2015-08-21 Thread shh.xie

From: Shaohui Xie shaohui@freescale.com

By implementing config_intr and ack_interrupt, the PHY can now support
link connect/disconnect interrupts.

Signed-off-by: Shaohui Xie shaohui@freescale.com
---
 drivers/net/phy/aquantia.c | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/drivers/net/phy/aquantia.c b/drivers/net/phy/aquantia.c
index 73d347d..d6111af 100644
--- a/drivers/net/phy/aquantia.c
+++ b/drivers/net/phy/aquantia.c
@@ -44,6 +44,43 @@ static int aquantia_aneg_done(struct phy_device *phydev)
 	return (reg < 0) ? reg : (reg & BMSR_ANEGCOMPLETE);
 }
 
+static int aquantia_config_intr(struct phy_device *phydev)
+{
+	int err;
+
+	if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
+		err = phy_write_mmd(phydev, MDIO_MMD_AN, 0xd401, 1);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff00, 1);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff01, 0x1001);
+	} else {
+		err = phy_write_mmd(phydev, MDIO_MMD_AN, 0xd401, 0);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff00, 0);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff01, 0);
+	}
+
+	return err;
+}
+
+static int aquantia_ack_interrupt(struct phy_device *phydev)
+{
+	int reg;
+
+	reg = phy_read_mmd(phydev, MDIO_MMD_AN, 0xcc01);
+	return (reg < 0) ? reg : 0;
+}
+
+
 static int aquantia_read_status(struct phy_device *phydev)
 {
int reg;
@@ -85,8 +122,11 @@ static struct phy_driver aquantia_driver[] = {
.phy_id_mask= 0xfff0,
 	.name		= "Aquantia AQ1202",
.features   = PHY_AQUANTIA_FEATURES,
+   .flags  = PHY_HAS_INTERRUPT,
.aneg_done  = aquantia_aneg_done,
.config_aneg= aquantia_config_aneg,
+   .config_intr= aquantia_config_intr,
+   .ack_interrupt  = aquantia_ack_interrupt,
.read_status= aquantia_read_status,
.driver = { .owner = THIS_MODULE,},
 },
@@ -95,8 +135,11 @@ static struct phy_driver aquantia_driver[] = {
.phy_id_mask= 0xfff0,
 	.name		= "Aquantia AQ2104",
.features   = PHY_AQUANTIA_FEATURES,
+   .flags  = PHY_HAS_INTERRUPT,
.aneg_done  = aquantia_aneg_done,
.config_aneg= aquantia_config_aneg,
+   .config_intr= aquantia_config_intr,
+   .ack_interrupt  = aquantia_ack_interrupt,
.read_status= aquantia_read_status,
.driver = { .owner = THIS_MODULE,},
 },
@@ -105,8 +148,11 @@ static struct phy_driver aquantia_driver[] = {
.phy_id_mask= 0xfff0,
 	.name		= "Aquantia AQR105",
.features   = PHY_AQUANTIA_FEATURES,
+   .flags  = PHY_HAS_INTERRUPT,
.aneg_done  = aquantia_aneg_done,
.config_aneg= aquantia_config_aneg,
+   .config_intr= aquantia_config_intr,
+   .ack_interrupt  = aquantia_ack_interrupt,
.read_status= aquantia_read_status,
.driver = { .owner = THIS_MODULE,},
 },
@@ -115,8 +161,11 @@ static struct phy_driver aquantia_driver[] = {
.phy_id_mask= 0xfff0,
 	.name		= "Aquantia AQR405",
.features   = PHY_AQUANTIA_FEATURES,
+   .flags  = PHY_HAS_INTERRUPT,
.aneg_done  = aquantia_aneg_done,
.config_aneg= aquantia_config_aneg,
+   .config_intr= aquantia_config_intr,
+   .ack_interrupt  = aquantia_ack_interrupt,
.read_status= aquantia_read_status,
.driver = { .owner = THIS_MODULE,},
 },
-- 
2.1.0.27.g96db324
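
The enable and disable branches of aquantia_config_intr write the same three registers and differ only in the values. One way to look at that, sketched in userspace under assumptions: demo_write_mmd() is a mock standing in for phy_write_mmd(), the table-driven layout is an illustration and not what the patch does, and every demo_ name is hypothetical. The register addresses and values are the ones from the patch.

```c
#include <stddef.h>

#define DEMO_MMD_AN     7   /* same number as MDIO_MMD_AN in linux/mdio.h */
#define DEMO_MMD_VEND1  30  /* same number as MDIO_MMD_VEND1 */

/* One register write: device address, register, value used when enabling. */
struct demo_reg_write {
	int devad;
	unsigned int reg;
	unsigned int on;
};

static const struct demo_reg_write demo_intr_regs[] = {
	{ DEMO_MMD_AN,    0xd401, 1 },
	{ DEMO_MMD_VEND1, 0xff00, 1 },
	{ DEMO_MMD_VEND1, 0xff01, 0x1001 },
};

/* Mock bus state so the sketch runs without hardware. */
static int demo_fail_at = -1;      /* index of the write that should fail */
static int demo_write_count;       /* successful writes so far */
static unsigned int demo_last_val; /* value of the last successful write */

/* Mock of a register write: returns 0 on success, negative on error. */
static int demo_write_mmd(int devad, unsigned int reg, unsigned int val)
{
	(void)devad;
	(void)reg;
	if (demo_write_count == demo_fail_at)
		return -5; /* stands in for -EIO */
	demo_write_count++;
	demo_last_val = val;
	return 0;
}

/* Write every register in order and stop at the first error. */
static int demo_config_intr(int enable)
{
	size_t i;
	int err = 0;

	for (i = 0; i < sizeof(demo_intr_regs) / sizeof(demo_intr_regs[0]); i++) {
		err = demo_write_mmd(demo_intr_regs[i].devad,
				     demo_intr_regs[i].reg,
				     enable ? demo_intr_regs[i].on : 0);
		if (err < 0)
			return err;
	}
	return err;
}
```

As in the patch, a failed write aborts the sequence immediately, so a partially configured interrupt block is possible and the caller only sees the error code.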


[PATCH v3 05/22] fjes: ES information acquisition routine

2015-08-21 Thread Taku Izumi

This patch adds the ES information acquisition routine.
ES information is retrieved by issuing an information
request command. It includes which receivers are in
the same zone.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_hw.c   | 101 +++
 drivers/net/fjes/fjes_hw.h   |  24 ++
 drivers/net/fjes/fjes_regs.h |  23 ++
 3 files changed, 148 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 757cece..c31be7f 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -351,6 +351,107 @@ void fjes_hw_exit(struct fjes_hw *hw)
fjes_hw_cleanup(hw);
 }
 
+static enum fjes_dev_command_response_e
+fjes_hw_issue_request_command(struct fjes_hw *hw,
+			      enum fjes_dev_command_request_type type)
+{
+	enum fjes_dev_command_response_e ret = FJES_CMD_STATUS_UNKNOWN;
+	union REG_CR cr;
+	union REG_CS cs;
+	int timeout;
+
+	cr.reg = 0;
+	cr.bits.req_start = 1;
+	cr.bits.req_code = type;
+	wr32(XSCT_CR, cr.reg);
+	cr.reg = rd32(XSCT_CR);
+
+	if (cr.bits.error == 0) {
+		timeout = FJES_COMMAND_REQ_TIMEOUT * 1000;
+		cs.reg = rd32(XSCT_CS);
+
+		while ((cs.bits.complete != 1) && timeout > 0) {
+			msleep(1000);
+			cs.reg = rd32(XSCT_CS);
+			timeout -= 1000;
+		}
+
+		if (cs.bits.complete == 1)
+			ret = FJES_CMD_STATUS_NORMAL;
+		else if (timeout <= 0)
+			ret = FJES_CMD_STATUS_TIMEOUT;
+
+	} else {
+		switch (cr.bits.err_info) {
+		case FJES_CMD_REQ_ERR_INFO_PARAM:
+			ret = FJES_CMD_STATUS_ERROR_PARAM;
+			break;
+		case FJES_CMD_REQ_ERR_INFO_STATUS:
+			ret = FJES_CMD_STATUS_ERROR_STATUS;
+			break;
+		default:
+			ret = FJES_CMD_STATUS_UNKNOWN;
+			break;
+		}
+	}
+
+	return ret;
+}
+
+int fjes_hw_request_info(struct fjes_hw *hw)
+{
+	union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
+	union fjes_device_command_res *res_buf = hw->hw_info.res_buf;
+	enum fjes_dev_command_response_e ret;
+	int result;
+
+	memset(req_buf, 0, hw->hw_info.req_buf_size);
+	memset(res_buf, 0, hw->hw_info.res_buf_size);
+
+	req_buf->info.length = FJES_DEV_COMMAND_INFO_REQ_LEN;
+
+	res_buf->info.length = 0;
+	res_buf->info.code = 0;
+
+	ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_INFO);
+
+	result = 0;
+
+	if (FJES_DEV_COMMAND_INFO_RES_LEN((*hw->hw_info.max_epid)) !=
+		res_buf->info.length) {
+		result = -ENOMSG;
+	} else if (ret == FJES_CMD_STATUS_NORMAL) {
+		switch (res_buf->info.code) {
+		case FJES_CMD_REQ_RES_CODE_NORMAL:
+			result = 0;
+			break;
+		default:
+			result = -EPERM;
+			break;
+		}
+	} else {
+		switch (ret) {
+		case FJES_CMD_STATUS_UNKNOWN:
+			result = -EPERM;
+			break;
+		case FJES_CMD_STATUS_TIMEOUT:
+			result = -EBUSY;
+			break;
+		case FJES_CMD_STATUS_ERROR_PARAM:
+			result = -EPERM;
+			break;
+		case FJES_CMD_STATUS_ERROR_STATUS:
+			result = -EPERM;
+			break;
+		default:
+			result = -EPERM;
+			break;
+		}
+	}
+
+	return result;
+}
+
+
 void fjes_hw_set_irqmask(struct fjes_hw *hw,
 enum REG_ICTL_MASK intr_mask, bool mask)
 {
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 1b3e9ca..cc1ef21 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -34,6 +34,12 @@ struct fjes_hw;
 #define EP_BUFFER_INFO_SIZE 4096
 
 #define FJES_DEVICE_RESET_TIMEOUT  ((17 + 1) * 3) /* sec */
+#define FJES_COMMAND_REQ_TIMEOUT  (5 + 1) /* sec */
+
+#define FJES_CMD_REQ_ERR_INFO_PARAM  (0x0001)
+#define FJES_CMD_REQ_ERR_INFO_STATUS (0x0002)
+
+#define FJES_CMD_REQ_RES_CODE_NORMAL (0)
 
 #define EP_BUFFER_SIZE \
(((sizeof(union ep_buffer_info) + (128 * (64 * 1024))) \
@@ -50,6 +56,7 @@ struct fjes_hw;
((size) - sizeof(struct esmem_frame) - \
(ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
 
+#define FJES_DEV_COMMAND_INFO_REQ_LEN  (4)
 #define FJES_DEV_COMMAND_INFO_RES_LEN(epnum) (8 + 2 * (epnum))
 #define FJES_DEV_COMMAND_SHARE_BUFFER_REQ_LEN(txb, rxb) \
(24 + (8 * ((txb) / EP_BUFFER_INFO_SIZE + (rxb)
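
The completion wait in fjes_hw_issue_request_command polls a status bit once per second with a millisecond budget of FJES_COMMAND_REQ_TIMEOUT * 1000. A minimal userspace sketch of that loop shape, assuming a mock device that reports completion after a given number of polls: demo_wait_complete() and its parameters are hypothetical, and the sleep is left out so the sketch runs instantly.

```c
enum demo_status { DEMO_NORMAL, DEMO_TIMEOUT };

#define DEMO_REQ_TIMEOUT_SEC (5 + 1) /* same value as FJES_COMMAND_REQ_TIMEOUT */

/*
 * Poll a mock completion flag with a millisecond budget.
 * ready_after_polls: how many re-reads the mock device needs to complete.
 * polls_done: out-parameter, number of re-reads actually performed.
 */
static enum demo_status demo_wait_complete(int ready_after_polls, int *polls_done)
{
	int timeout = DEMO_REQ_TIMEOUT_SEC * 1000;
	int polls = 0;
	int complete = (polls >= ready_after_polls); /* first status read */

	while (!complete && timeout > 0) {
		/* the driver sleeps here with msleep(1000) */
		polls++;
		complete = (polls >= ready_after_polls); /* re-read status */
		timeout -= 1000;
	}

	*polls_done = polls;
	return complete ? DEMO_NORMAL : DEMO_TIMEOUT;
}
```

Because completion is checked before the timeout, a device that finishes on the very last poll is still reported as a success.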

Re: DEBUG_LOCKS_WARN_ON(in_interrupt()) triggering in socket code

2015-08-21 Thread Jason A. Donenfeld

Bueller? ... Bueller?

On Thu, Aug 20, 2015 at 2:39 AM, Jason A. Donenfeld ja...@zx2c4.com wrote:
 Hi folks,

 In setting up a socket, there are two functions I make use of that in
 turn wind up calling static_key_slow_inc: setup_udp_tunnel_sock and
 sk_set_memalloc. These both make use of static_key_slow_inc because
 they selectively enable certain important code paths.

 This is all fine, except it poses some problems when calling these
 functions inside of .ndo_open. In that case, I get ugly (debug)
 warnings like this:

 WARNING: CPU: 1 PID: 2002 at kernel/locking/mutex.c:526
 mutex_lock_nested+0x39b/0x3b0()
 DEBUG_LOCKS_WARN_ON(in_interrupt())
  [81621d0e] dump_stack+0x45/0x57
  [810505ca] warn_slowpath_common+0x8a/0xc0
  [81050655] warn_slowpath_fmt+0x55/0x70
  [8162513b] mutex_lock_nested+0x39b/0x3b0
  [8113d699] static_key_slow_inc+0x59/0xc0
  [8154ebc0] udp_encap_enable+0x20/0x30
  [8157a885] setup_udp_tunnel_sock+0x55/0x70
  [816028ac] socket_init+0x1cc/0x3a0
  [81600341] open+0x21/0x1b0
  [81476af0] __dev_open+0xb0/0x110
  [81476e01] __dev_change_flags+0xa1/0x160
  [81476ee9] dev_change_flags+0x29/0x70
  [8148652a] do_setlink+0x5da/0xa80
  [81487bed] rtnl_newlink+0x50d/0x8a0
  [81485141] rtnetlink_rcv_msg+0xa1/0x240
  [8149f1fb] netlink_rcv_skb+0x9b/0xc0
  [8148508e] rtnetlink_rcv+0x2e/0x40
  [8149ec3f] netlink_unicast+0x16f/0x200
  [8149f009] netlink_sendmsg+0x339/0x380
  [814559d9] ___sys_sendmsg+0x2f9/0x310
  [814566d7] __sys_sendmsg+0x57/0xa0
  [81456732] SyS_sendmsg+0x12/0x20
  [816295b2] entry_SYSCALL_64_fastpath+0x16/0x7a

 The reason is that the static key code makes use of mutexes. And the
 mutex debug code ensures that in_interrupt() is zero; otherwise it
 prints that warning. In this case, in_interrupt() has a value of 512.

 So, questions:

 1. Is the best thing to do just move my socket creation routine into a
 workqueue, and avoid this issue altogether?
 2. Is it, in fact, incorrect to check for in_interrupt(), and the
 debug assertion is actually wrong?
 3. Is it a bug that in_interrupt() is returning non-zero in relation
 to a syscall?

 Thanks,
 Jason

Re: [net-next:master 1179/1189] include/linux/compiler.h:447:38: error: call to '__compiletime_assert_243' declared with attribute error: BUILD_BUG_ON failed: offsetof(struct dst_entry, __refcnt) & 63

2015-08-21 Thread David Miller


Yeah, I should have predicted this would happen on 32-bit builds when I saw
the adjustment of __pad_to_align_refcnt[] for 64-bit.

Jiri, you might not have any reasonable options to fix this I'm afraid.
