Re: [PATCH] Adding Agile-SD TCP module and modifying Kconfig and Makefile

2017-08-01 Thread David Miller
From: mohamedalrshah 
Date: Wed,  2 Aug 2017 12:37:01 +0800

> +//#define NS_PROTOCOL "tcp_agilesd.c"
> +//#include "../ns-linux-c.h"
> +//#include "../ns-linux-util.h"
> +//#include 

There is no way this submission will be considered seriously with all of
this cruft in it.

Your new module needs significant cleanups.


Re: [PATCH] iwlwifi: Demote messages about fw flags size to info

2017-08-01 Thread Luca Coelho
Hi João Paulo,


On Tue, 2017-08-01 at 15:58 -0700, João Paulo Rechi Vita wrote:
> Hello Luca,
> 
> On Mon, Jul 24, 2017 at 4:01 AM, Coelho, Luciano
>  wrote:
> > On Fri, 2017-07-21 at 07:51 -0700, João Paulo Rechi Vita wrote:
> 
> (...)
> 
> > > Currently these messages are presented to the user during boot if there
> > > is no bootsplash covering the console, sometimes even if the boot splash
> > > is enabled but has not started yet by the time this message is shown.
> > > 
> 
> I should have provided another piece of information here: all of this
> happens even when booting with the 'quiet' kernel parameter.

Oh, okay, that's annoying.


> > This specific case is harmless, but I'd rather keep this message as an
> > error, because in other situations it could lead to unexpected
> > behavior, so I prefer to keep it very visible.
> > 
> > 
> 
> I see your point, and I understand the purpose of these messages. I'm
> wondering if perhaps having them at the warning level would give them
> enough visibility, while still keeping a clean boot process to the end
> user. If so, I can send an updated patch.
> 
> Thanks for your reply and for pointing to the fix for the root cause
> of that message.

Sure, I agree it's better to make it use IWL_WARN(), which will generate
a dev_warn() instead of a dev_err().


--
Cheers,
Luca.
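
To make the effect of moving from dev_err() to dev_warn() concrete, here
is a minimal user-space sketch of the console filtering rule. It assumes
the usual kernel numbering (error = 3, warning = 4, info = 6) and that
the 'quiet' parameter sets the console threshold to 4; the SKETCH_ names
and shown_on_console() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Severity numbers mirroring the kernel's log levels (lower = more severe). */
enum {
	SKETCH_LOGLEVEL_ERR     = 3,
	SKETCH_LOGLEVEL_WARNING = 4,
	SKETCH_LOGLEVEL_INFO    = 6,
};

/* Console threshold assumed to be selected by booting with 'quiet'. */
enum { SKETCH_CONSOLE_LOGLEVEL_QUIET = 4 };

/* A message reaches the console only if its level is below the threshold. */
static bool shown_on_console(int msg_level, int console_loglevel)
{
	assert(msg_level >= 0 && msg_level <= 7);
	return msg_level < console_loglevel;
}
```

Under these assumptions an error-level message still reaches the console
with 'quiet', while a warning-level message is kept out of the boot
output but remains in the kernel log.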


[PATCH] Adding Agile-SD TCP module and modifying Kconfig and Makefile

2017-08-01 Thread mohamedalrshah
Published:
Alrshah, M.A., Othman, M., Ali, B. and Hanapi, Z.M., 2015. Agile-SD: a
Linux-based TCP congestion control algorithm for supporting high-speed and
short-distance networks. Journal of Network and Computer Applications, 55,
pp.181-190.

Agile-SD is a new loss-based and RTT-independent TCP congestion
control algorithm designed to support high-speed and Short-Distance (SD)
networks. It mainly contributes the agility factor mechanism, which allows
Agile-SD to deal with small buffer sizes while reducing its sensitivity to
packet loss. Due to the use of this mechanism, Agile-SD improves the
throughput of TCP up to 50% compared to Cubic-TCP and Compound-TCP. Its
performance was evaluated using the well-known NS2 simulator to measure
the average throughput, loss ratio and fairness.

Agile-SD is designed and implemented by Mohamed A. Alrshah et al. (2015) as
a Linux kernel module in the real Linux operating system (openSUSE 42.1 Leap),
which is implemented at the Network, Parallel and Distributed Computing
Laboratory, A2.10, second floor, Block A, Faculty of Computer Science and
Information Technology, Universiti Putra Malaysia, over real PCs connected to
the Internet for daily uses and evaluation purposes.

The main motivation behind Agile-SD is to support the short-distance networks
such as local area networks and data center networks in order to increase the
bandwidth utilization over high-speed networks. Moreover, Agile-SD is
introduced in support of the open source community, where it can be used under
the OpenGL agreement.
---
 net/ipv4/Kconfig   |  15 
 net/ipv4/Makefile  |   1 +
 net/ipv4/tcp_agilesd.c | 221 +
 3 files changed, 237 insertions(+)
 create mode 100755 net/ipv4/tcp_agilesd.c

diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 91a25579..22d824b1 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -677,6 +677,17 @@ config TCP_CONG_BBR
bufferbloat, policers, or AQM schemes that do not provide a delay
signal. It requires the fq ("Fair Queue") pacing packet scheduler.
 
+config TCP_CONG_AGILESD
+tristate "Agile-SD Congestion control"
+default n
+---help---
+
+This is version 1.0 of Agile-SD TCP. It is a sender-side-only algorithm.
+It contributes the Agility Factor (AF) to shorten the epoch time
+and to make TCP independent of RTT. AF reduces the sensitivity
+to packet losses, which in turn allows Agile-SD to achieve better throughput
+over high-speed networks.
+
 choice
prompt "Default TCP congestion control"
default DEFAULT_CUBIC
@@ -713,6 +724,9 @@ choice
 
config DEFAULT_BBR
bool "BBR" if TCP_CONG_BBR=y
+   
+config DEFAULT_AGILESD
+   bool "AGILESD" if TCP_CONG_AGILESD=y
 
config DEFAULT_RENO
bool "Reno"
@@ -738,6 +752,7 @@ config DEFAULT_TCP_CONG
default "dctcp" if DEFAULT_DCTCP
default "cdg" if DEFAULT_CDG
default "bbr" if DEFAULT_BBR
+   default "agilesd" if DEFAULT_AGILESD
default "cubic"
 
 config TCP_MD5SIG
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index f83de23a..33d398b5 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -44,6 +44,7 @@ obj-$(CONFIG_INET_UDP_DIAG) += udp_diag.o
 obj-$(CONFIG_INET_RAW_DIAG) += raw_diag.o
 obj-$(CONFIG_NET_TCPPROBE) += tcp_probe.o
 obj-$(CONFIG_TCP_CONG_BBR) += tcp_bbr.o
+obj-$(CONFIG_TCP_CONG_AGILESD) += tcp_agilesd.o
 obj-$(CONFIG_TCP_CONG_BIC) += tcp_bic.o
 obj-$(CONFIG_TCP_CONG_CDG) += tcp_cdg.o
 obj-$(CONFIG_TCP_CONG_CUBIC) += tcp_cubic.o
diff --git a/net/ipv4/tcp_agilesd.c b/net/ipv4/tcp_agilesd.c
new file mode 100755
index ..fd040ff2
--- /dev/null
+++ b/net/ipv4/tcp_agilesd.c
@@ -0,0 +1,221 @@
+/* agilesd is a Loss-Based Congestion Control Algorithm for TCP v1.0.
+ * agilesd has been created by Mohamed A. Alrshah,
+ * at Faculty of Computer Science and Information Technology,
+ * Universiti Putra Malaysia.
+ * agilesd is based on the article, which is published in 2015 as below:
+ * 
+ * Alrshah, M.A., Othman, M., Ali, B. and Hanapi, Z.M., 2015. 
+ * Agile-SD: a Linux-based TCP congestion control algorithm for supporting high-speed and short-distance networks.
+ * Journal of Network and Computer Applications, 55, pp.181-190. 
+ */
+
+/* These includes are very important to operate the algorithm under NS2. */
+//#define NS_PROTOCOL "tcp_agilesd.c"
+//#include "../ns-linux-c.h"
+//#include "../ns-linux-util.h"
+//#include 
+/* These includes are very important to operate the algorithm under NS2. */
+
+/* These includes are very important to operate the algorithm under Linux OS. */
+#include 
+#include 
+#include 
+#include 
+//#include // optional
+//#include  // optional
+/* These includes are very important to operate the algorithm under Linux OS. */
+
+#define SCALE   1000   /* Scale factor to avoid fractions 

Re: [PATCH v2 05/11] net: stmmac: dwmac-rk: Add internal phy support

2017-08-01 Thread Chen-Yu Tsai
Hi David, Florian, Andrew

(resent in plain text)

On Fri, Jul 28, 2017 at 2:56 PM, David.Wu  wrote:
>
> Hi Florian,
>
> On 2017/7/28 0:54, Florian Fainelli wrote:
>>
>> - if you need knowledge about this PHY connection type prior to binding
>> the PHY device and its driver (that is, before of_phy_connect()) we
>> could add a boolean property e.g: "phy-is-internal" that allows us to
>> know that, or we can have a new phy-mode value, e.g: "internal-rmii"
>> which describes that, either way would probably be fine, but the former
>> scales better
>>
>
> Using "phy-is-internal" is very helpful, but it is easy to confuse with
> the real internal PHY, may we use the other words like phy is inlined.

If "internal" is confusing, would "phy-is-integrated" in the MAC node work?

Either way we would like to have a definitive solution to this. Our
dwmac-sun8i driver is already in v4.13-rc1, with a somewhat flaky
method of knowing whether the internal PHY is used (phy-mode = "mii").

We really want a fix for this release, otherwise we would be forced
to revert either the internal PHY part or the whole driver.

ChenYu

>
>> Then again, using phy-mode = "internal" even though this is Reduced MII
>> is not big of a deal IMHO as long as there is no loss of information and
>> that internal de-facto means internal reduced MII for instance.
>> --
>
>
>


Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Timur Tabi

On 8/1/17 9:58 PM, Andrew Lunn wrote:

>   The PHY does not participate directly in flow control/pause frames except by
>   making sure that the SUPPORTED_Pause and SUPPORTED_AsymPause bits are set in
>   MII_ADVERTISE to indicate towards the link partner that the Ethernet MAC
>   controller supports such a thing. Since flow control/pause frames generation
>   involves the Ethernet MAC driver, it is recommended that this driver takes care
>   of properly indicating advertisement and support for such features by setting
>   the SUPPORTED_Pause and SUPPORTED_AsymPause bits accordingly. This can be done
>   either before or after phy_connect() and/or as a result of implementing the
>   ethtool::set_pauseparam feature.
>
> So just check if the MAC driver is setting SUPPORTED_Pause and
> SUPPORTED_AsymPause.


I think this was covered in this thread: 
http://www.spinics.net/lists/netdev/msg408978.html


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.
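
For reference, the pause advertisement discussed above can be sketched
in plain C. This is only an illustration of the usual IEEE 802.3
encoding (rx and tx -> Pause; rx only -> Pause plus Asym Pause; tx only
-> Asym Pause), not code from the emac driver. The SKETCH_ADV_* bit
values and sketch_set_pause_adv() are hypothetical; a real MAC driver
would use the ethtool pause bits on the PHY's advertising mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit positions for this sketch only. */
#define SKETCH_ADV_PAUSE      (1u << 0)
#define SKETCH_ADV_ASYM_PAUSE (1u << 1)

/*
 * Update the pause bits of an advertisement mask from the rx/tx pause
 * settings the MAC wants, leaving all other bits untouched.
 */
static void sketch_set_pause_adv(uint32_t *advertising,
				 bool rx_pause, bool tx_pause)
{
	assert(advertising != NULL);

	/* Start from a clean slate for the two pause bits. */
	*advertising &= ~(SKETCH_ADV_PAUSE | SKETCH_ADV_ASYM_PAUSE);

	if (rx_pause)
		*advertising |= SKETCH_ADV_PAUSE;
	if (rx_pause != tx_pause)
		*advertising |= SKETCH_ADV_ASYM_PAUSE;
}
```

With both directions disabled, neither bit is advertised, which is the
state a driver would want when flow control autonegotiation is off by
default.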


[PATCH v2 net-next 2/4] sock: ULP infrastructure

2017-08-01 Thread Tom Herbert
Generalize the TCP ULP infrastructure recently introduced to support
kTLS. This adds a SO_ULP socket option and creates new fields in
sock structure for ULP ops and ULP data. Also, the interface allows
additional per ULP parameters to be set so that a ULP can be pushed
and operations started in one shot.

Signed-off-by: Tom Herbert 
---
 arch/alpha/include/uapi/asm/socket.h   |   2 +
 arch/frv/include/uapi/asm/socket.h |   2 +
 arch/ia64/include/uapi/asm/socket.h|   2 +
 arch/m32r/include/uapi/asm/socket.h|   2 +
 arch/mips/include/uapi/asm/socket.h|   2 +
 arch/mn10300/include/uapi/asm/socket.h |   2 +
 arch/parisc/include/uapi/asm/socket.h  |   2 +
 arch/s390/include/uapi/asm/socket.h|   2 +
 arch/sparc/include/uapi/asm/socket.h   |   2 +
 arch/xtensa/include/uapi/asm/socket.h  |   2 +
 include/linux/socket.h |   9 ++
 include/net/sock.h |   5 +
 include/net/ulp_sock.h |  75 +
 include/uapi/asm-generic/socket.h  |   2 +
 net/Kconfig|   4 +
 net/core/Makefile  |   1 +
 net/core/sock.c|  14 +++
 net/core/sysctl_net_core.c |  25 +
 net/core/ulp_sock.c| 194 +
 19 files changed, 349 insertions(+)
 create mode 100644 include/net/ulp_sock.h
 create mode 100644 net/core/ulp_sock.c

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 7b285dd4fe05..885e8fca79b0 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -109,4 +109,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/frv/include/uapi/asm/socket.h b/arch/frv/include/uapi/asm/socket.h
index f1e3b20dce9f..8ba71f2a3bf3 100644
--- a/arch/frv/include/uapi/asm/socket.h
+++ b/arch/frv/include/uapi/asm/socket.h
@@ -102,5 +102,7 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_SOCKET_H */
 
diff --git a/arch/ia64/include/uapi/asm/socket.h b/arch/ia64/include/uapi/asm/socket.h
index 5dd5c5d0d642..2de1c53f88b5 100644
--- a/arch/ia64/include/uapi/asm/socket.h
+++ b/arch/ia64/include/uapi/asm/socket.h
@@ -111,4 +111,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_IA64_SOCKET_H */
diff --git a/arch/m32r/include/uapi/asm/socket.h b/arch/m32r/include/uapi/asm/socket.h
index f8f7b47e247f..b2d394381787 100644
--- a/arch/m32r/include/uapi/asm/socket.h
+++ b/arch/m32r/include/uapi/asm/socket.h
@@ -102,4 +102,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_M32R_SOCKET_H */
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 882823bec153..d0bdf8c78220 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -120,4 +120,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/mn10300/include/uapi/asm/socket.h b/arch/mn10300/include/uapi/asm/socket.h
index c710db354ff2..686fbf497a13 100644
--- a/arch/mn10300/include/uapi/asm/socket.h
+++ b/arch/mn10300/include/uapi/asm/socket.h
@@ -102,4 +102,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index a0d4dc9f4eb2..d6e99deca976 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -101,4 +101,6 @@
 
 #define SO_PEERGROUPS  0x4034
 
+#define SO_ULP 0x4035
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/s390/include/uapi/asm/socket.h b/arch/s390/include/uapi/asm/socket.h
index 52a63f4175cb..6b52f162369a 100644
--- a/arch/s390/include/uapi/asm/socket.h
+++ b/arch/s390/include/uapi/asm/socket.h
@@ -108,4 +108,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 186fd8199f54..e765bf781107 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -98,6 +98,8 @@
 
 #define SO_PEERGROUPS  0x003d
 
+#define SO_ULP 0x003e
+
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define SO_SECURITY_AUTHENTICATION 0x5001
 #define SO_SECURITY_ENCRYPTION_TRANSPORT   0x5002
diff --git a/arch/xtensa/include/uapi/asm/socket.h b/arch/xtensa/include/uapi/asm/socket.h
index 3eed2761c149..8eaa2e9e27b6 100644
--- a/arch/xtensa/include/uapi/asm/socket.h
+++ b/arch/xtensa/include/uapi/asm/socket.h
@@ -113,4 +113,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60

[PATCH v2 net-next 4/4] ulp: Documention for ULP infrastructure

2017-08-01 Thread Tom Herbert
Add a doc in Documentation/networking

Signed-off-by: Tom Herbert 
---
 Documentation/networking/ulp.txt | 82 
 1 file changed, 82 insertions(+)
 create mode 100644 Documentation/networking/ulp.txt

diff --git a/Documentation/networking/ulp.txt b/Documentation/networking/ulp.txt
new file mode 100644
index ..4d830314b0ff
--- /dev/null
+++ b/Documentation/networking/ulp.txt
@@ -0,0 +1,82 @@
+Upper Layer Protocol (ULP) Infrastructure
+=========================================
+
+The ULP kernel infrastructure provides a means to hook upper layer
+protocol support on a socket. A module may register a ULP hook
+in the kernel. ULP processing is enabled by a setsockopt on a socket
+that specifies the name of the registered ULP to be invoked. An
+initialization function is defined for each ULP that can change the
+function entry points of the socket (sendmsg, recvmsg, etc.) or change
+the socket in other fundamental ways.
+
+Note, no synchronization is enforced between the setsockopt to enable
+a ULP and ongoing asynchronous operations on the socket (such as a
+blocked read). If synchronization is required this must be handled by
+the ULP and caller.
+
+User interface
+==============
+
+The structure for the socket SOL_ULP options is defined in socket.h.
+
+Example to enable "my_ulp" ULP on a socket:
+
+struct ulp_config ulpc = {
+.ulp_name = "my_ulp",
+};
+
+setsockopt(sock, SOL_SOCKET, SO_ULP, &ulpc, sizeof(ulpc))
+
+The ulp_config includes a "__u8 ulp_params[0]" field that may be used
+to refer to ULP-specific parameters being set.
+
+Kernel interface
+================
+
+The interface for ULP infrastructure is defined in net/ulp_sock.h.
+
+ULP registration functions
+--------------------------
+
+int ulp_register(struct ulp_ops *type)
+
+ Called to register a ULP. The ulp_ops structure is described below.
+
+void ulp_unregister(struct ulp_ops *type);
+
+ Called to unregister a ULP.
+
+ulp_ops structure
+-----------------
+
+int (*init)(struct sock *sk, char __user *optval, int len)
+
+ Initialization function for the ULP. This is called from setsockopt
+ when the ULP name in the ulp_config argument matches the registered
+ ULP. optval is a userspace pointer to the ULP specific parameters.
+ len is the length of the ULP specific parameters.
+
+void (*release)(struct sock *sk)
+
+ Called when socket is being destroyed. The ULP implementation
+ should cancel any asynchronous operations (such as timers) and
+ release any acquired resources.
+
+int (*get_params)(struct sock *sk, char __user *optval, int *optlen)
+
+ Get the ULP specific parameters previously set in the init function
+ for the ULP. Note that optlen is a pointer to kernel memory.
+
+char name[ULP_NAME_MAX]
+
+ Name of the ULP. Must be NUL-terminated.
+
+struct module *owner
+
+ Corresponding owner for ref count.
+
+Author
+======
+
+Tom Herbert (t...@quantonium.net)
+
-- 
2.11.0
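
The register/lookup flow described in ulp.txt can be approximated in
user space as follows. This is a hedged mock, not the patch's
net/core/ulp_sock.c: the sketch_ names, the fixed-size table, the name
length of 16, and the use of a plain void pointer in place of
struct sock are all assumptions made so the example is self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ULP_NAME_MAX 16	/* assumed; the patch's value is not shown */
#define SKETCH_ULP_SLOTS    8	/* arbitrary table size for the sketch */

struct sketch_ulp_ops {
	int  (*init)(void *sk);		/* called when the ULP is attached */
	void (*release)(void *sk);	/* called when the socket goes away */
	char name[SKETCH_ULP_NAME_MAX];
};

static struct sketch_ulp_ops *sketch_ulp_table[SKETCH_ULP_SLOTS];

/* Find a registered ULP by name, or return NULL. */
static struct sketch_ulp_ops *sketch_ulp_find(const char *name)
{
	for (size_t i = 0; i < SKETCH_ULP_SLOTS; i++) {
		struct sketch_ulp_ops *ops = sketch_ulp_table[i];

		if (ops && strncmp(ops->name, name, SKETCH_ULP_NAME_MAX) == 0)
			return ops;
	}
	return NULL;
}

/* Register a ULP; names must be unique. */
static int sketch_ulp_register(struct sketch_ulp_ops *ops)
{
	assert(ops != NULL);

	if (sketch_ulp_find(ops->name))
		return -EEXIST;
	for (size_t i = 0; i < SKETCH_ULP_SLOTS; i++) {
		if (!sketch_ulp_table[i]) {
			sketch_ulp_table[i] = ops;
			return 0;
		}
	}
	return -ENOSPC;
}

static void sketch_ulp_unregister(struct sketch_ulp_ops *ops)
{
	for (size_t i = 0; i < SKETCH_ULP_SLOTS; i++)
		if (sketch_ulp_table[i] == ops)
			sketch_ulp_table[i] = NULL;
}

/* Roughly what a setsockopt handler would do: look up by name, run init. */
static int sketch_ulp_attach(void *sk, const char *name)
{
	struct sketch_ulp_ops *ops = sketch_ulp_find(name);

	if (!ops)
		return -ENOENT;
	return ops->init ? ops->init(sk) : 0;
}

/* Example ULP: init just marks the fake socket as initialized. */
static int my_ulp_init(void *sk)
{
	*(int *)sk = 1;
	return 0;
}

static struct sketch_ulp_ops my_ulp = {
	.init = my_ulp_init,
	.name = "my_ulp",
};
```

Attaching "my_ulp" fails with -ENOENT until sketch_ulp_register(&my_ulp)
has run, mirroring the requirement that the named ULP be registered
before the setsockopt call.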



[PATCH v2 net-next 0/4] ulp: Generalize ULP infrastructure

2017-08-01 Thread Tom Herbert
Generalize the ULP infrastructure that was recently introduced to
support kTLS. This adds a SO_ULP socket option and creates new fields in
sock structure for ULP ops and ULP data. Also, the interface allows
additional per ULP parameters to be set so that a ULP can be pushed
and operations started in one shot.

This patch set:
  - Makes a minor dependency fix in inet_common.h
  - Implements ULP infrastructure as a socket mechanism
  - Fixes TCP and TLS to use the new method (maintaining backwards
    API compatibility)
  - Adds a ulp.txt document

Tested: Ran simple ULP.

- v2: Fix compilation errors when CONFIG_ULP_SOCK not set.

Tom Herbert (4):
  inet: include net/sock.h in inet_common.h
  sock: ULP infrastructure
  tcp: Adjust TCP ULP to defer to sockets ULP
  ulp: Documention for ULP infrastructure

 Documentation/networking/tls.txt   |   6 +-
 Documentation/networking/ulp.txt   |  82 ++
 arch/alpha/include/uapi/asm/socket.h   |   2 +
 arch/frv/include/uapi/asm/socket.h |   2 +
 arch/ia64/include/uapi/asm/socket.h|   2 +
 arch/m32r/include/uapi/asm/socket.h|   2 +
 arch/mips/include/uapi/asm/socket.h|   2 +
 arch/mn10300/include/uapi/asm/socket.h |   2 +
 arch/parisc/include/uapi/asm/socket.h  |   2 +
 arch/s390/include/uapi/asm/socket.h|   2 +
 arch/sparc/include/uapi/asm/socket.h   |   2 +
 arch/xtensa/include/uapi/asm/socket.h  |   2 +
 include/linux/socket.h |   9 ++
 include/net/inet_common.h  |   2 +
 include/net/inet_connection_sock.h |   4 -
 include/net/sock.h |   5 +
 include/net/tcp.h  |  25 -
 include/net/tls.h  |   4 +-
 include/net/ulp_sock.h |  75 +
 include/uapi/asm-generic/socket.h  |   2 +
 net/Kconfig|   4 +
 net/core/Makefile  |   1 +
 net/core/sock.c|  14 +++
 net/core/sysctl_net_core.c |  25 +
 net/core/ulp_sock.c| 194 +
 net/ipv4/sysctl_net_ipv4.c |   9 +-
 net/ipv4/tcp.c |  40 ---
 net/ipv4/tcp_ipv4.c|   2 -
 net/ipv4/tcp_ulp.c | 135 ---
 net/tls/Kconfig|   1 +
 net/tls/tls_main.c |  21 ++--
 31 files changed, 480 insertions(+), 200 deletions(-)
 create mode 100644 Documentation/networking/ulp.txt
 create mode 100644 include/net/ulp_sock.h
 create mode 100644 net/core/ulp_sock.c
 delete mode 100644 net/ipv4/tcp_ulp.c

-- 
2.11.0



[PATCH v2 net-next 3/4] tcp: Adjust TCP ULP to defer to sockets ULP

2017-08-01 Thread Tom Herbert
Fix TCP and TLS to use the newer ULP infrastructure in sockets.

Signed-off-by: Tom Herbert 
---
 Documentation/networking/tls.txt   |   6 +-
 include/net/inet_connection_sock.h |   4 --
 include/net/tcp.h  |  25 ---
 include/net/tls.h  |   4 +-
 net/ipv4/sysctl_net_ipv4.c |   9 ++-
 net/ipv4/tcp.c |  40 ++-
 net/ipv4/tcp_ipv4.c|   2 -
 net/ipv4/tcp_ulp.c | 135 -
 net/tls/Kconfig|   1 +
 net/tls/tls_main.c |  21 +++---
 10 files changed, 47 insertions(+), 200 deletions(-)
 delete mode 100644 net/ipv4/tcp_ulp.c

diff --git a/Documentation/networking/tls.txt b/Documentation/networking/tls.txt
index 77ed00631c12..b70309df4709 100644
--- a/Documentation/networking/tls.txt
+++ b/Documentation/networking/tls.txt
@@ -12,8 +12,12 @@ Creating a TLS connection
 
 First create a new TCP socket and set the TLS ULP.
 
+struct ulp_config ulpc = {
+   .ulp_name = "tls",
+};
+
   sock = socket(AF_INET, SOCK_STREAM, 0);
-  setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
+  setsockopt(sock, SOL_SOCKET, SO_ULP, &ulpc, sizeof(ulpc))
 
 Setting the TLS ULP allows us to set/get TLS socket options. Currently
 only the symmetric encryption is handled in the kernel.  After the TLS
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 13e4c89a8231..c7a577976bec 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -75,8 +75,6 @@ struct inet_connection_sock_af_ops {
  * @icsk_pmtu_cookie  Last pmtu seen by socket
  * @icsk_ca_ops   Pluggable congestion control hook
  * @icsk_af_ops   Operations which are AF_INET{4,6} specific
- * @icsk_ulp_ops  Pluggable ULP control hook
- * @icsk_ulp_data ULP private data
  * @icsk_ca_state:Congestion control state
  * @icsk_retransmits: Number of unrecovered [RTO] timeouts
  * @icsk_pending: Scheduled timer event
@@ -99,8 +97,6 @@ struct inet_connection_sock {
__u32 icsk_pmtu_cookie;
const struct tcp_congestion_ops *icsk_ca_ops;
const struct inet_connection_sock_af_ops *icsk_af_ops;
-   const struct tcp_ulp_ops  *icsk_ulp_ops;
-   void  *icsk_ulp_data;
unsigned int  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8  icsk_ca_state:6,
  icsk_ca_setsockopt:1,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index bb1881b4ce48..65c462da3740 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1968,31 +1968,6 @@ static inline void tcp_listendrop(const struct sock *sk)
 
 enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer);
 
-/*
- * Interface for adding Upper Level Protocols over TCP
- */
-
-#define TCP_ULP_NAME_MAX   16
-#define TCP_ULP_MAX128
-#define TCP_ULP_BUF_MAX(TCP_ULP_NAME_MAX*TCP_ULP_MAX)
-
-struct tcp_ulp_ops {
-   struct list_headlist;
-
-   /* initialize ulp */
-   int (*init)(struct sock *sk);
-   /* cleanup ulp */
-   void (*release)(struct sock *sk);
-
-   charname[TCP_ULP_NAME_MAX];
-   struct module   *owner;
-};
-int tcp_register_ulp(struct tcp_ulp_ops *type);
-void tcp_unregister_ulp(struct tcp_ulp_ops *type);
-int tcp_set_ulp(struct sock *sk, const char *name);
-void tcp_get_available_ulp(char *buf, size_t len);
-void tcp_cleanup_ulp(struct sock *sk);
-
 /* Call BPF_SOCK_OPS program that returns an int. If the return value
  * is < 0, then the BPF op failed (for example if the loaded BPF
  * program does not support the chosen operation or there is no BPF
diff --git a/include/net/tls.h b/include/net/tls.h
index b89d397dd62f..7d88a6e2f5a7 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -214,9 +214,7 @@ static inline void tls_fill_prepend(struct tls_context *ctx,
 
 static inline struct tls_context *tls_get_ctx(const struct sock *sk)
 {
-   struct inet_connection_sock *icsk = inet_csk(sk);
-
-   return icsk->icsk_ulp_data;
+   return sk->sk_ulp_data;
 }
 
 static inline struct tls_sw_context *tls_sw_ctx(
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0d3c038d7b04..9ab0c278b7ba 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -372,13 +373,15 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
  void __user *buffer, size_t *lenp,
  loff_t *ppos)
 {
-   struct ctl_table tbl = { .maxlen = TCP_ULP_BUF_MAX, };
+   struct ctl_table tbl = { .maxlen = ULP_BUF_MAX, };
int ret;
 
tbl.data = kmalloc(tbl.maxlen, GFP_USER);
if 

[PATCH v2 net-next 1/4] inet: include net/sock.h in inet_common.h

2017-08-01 Thread Tom Herbert
inet_common.h has a dependency on sock.h so it should include that.

Signed-off-by: Tom Herbert 
---
 include/net/inet_common.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index f39ae697347f..df0119a317aa 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -1,6 +1,8 @@
 #ifndef _INET_COMMON_H
 #define _INET_COMMON_H
 
+#include <net/sock.h>
+
 extern const struct proto_ops inet_stream_ops;
 extern const struct proto_ops inet_dgram_ops;
 
-- 
2.11.0



Re: [PATCH net-next v2 00/11] net: dsa: rework EEE support

2017-08-01 Thread David Miller
From: Vivien Didelot 
Date: Tue,  1 Aug 2017 16:32:30 -0400

> EEE implies configuring the port's PHY and MAC of both ends of the wire.
> 
> The current EEE support in DSA mixes PHY and MAC configuration, which is
> bad because PHYs must be configured through a proper PHY driver. The DSA
> switch operations for EEE are only meant for configuring the port's MAC,
> which are integrated in the Ethernet switch device.
> 
> This patchset fixes the EEE support in qca8k driver, makes the DSA layer
> call phy_init_eee for all drivers, and removes the EEE support from the
> mv88e6xxx driver since the Marvell PHY driver should be enough for it.
> 
> Changes in v2:
>  - make PHY device and DSA EEE ops mandatory for slave EEE operations.
>  - simply return 0 in drivers which don't need to do anything to
>configure the port's MAC. Subsequent PHY calls will be enough.

Series applied, thanks Vivien.


Re: [PATCH net 0/7] drivers: net: Fix 64-bit statistics seqcount init

2017-08-01 Thread David Miller
From: Florian Fainelli 
Date: Tue,  1 Aug 2017 12:11:05 -0700

> This patch series fixes a bunch of drivers to have their 64-bit statistics
> seqcount cookie be initialized correctly. Most of these drivers (except b44,
> gtp) are probably used on 64-bit only hosts and so the lockdep splat might
> have never been seen.

Series applied, thanks.
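
As background on why the sequence counter needs initializing, here is a
simplified, single-threaded sketch of the seqcount idea behind 64-bit
statistics on 32-bit hosts. It is not the kernel's u64_stats_sync code:
the sketch_ names are hypothetical, and it omits the memory barriers and
lockdep setup that the real helpers provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_stats {
	unsigned int seq;	/* even: stable, odd: update in progress */
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Must run before first use so that seq starts at a known even value. */
static void sketch_stats_init(struct sketch_stats *s)
{
	assert(s != NULL);
	s->seq = 0;
	s->rx_packets = 0;
	s->rx_bytes = 0;
}

/* Writer: bump seq to odd, update the counters, bump seq back to even. */
static void sketch_stats_add(struct sketch_stats *s, uint64_t bytes)
{
	s->seq++;
	s->rx_packets++;
	s->rx_bytes += bytes;
	s->seq++;
}

/* Reader: retry if a writer was active or finished during the copy. */
static void sketch_stats_read(const struct sketch_stats *s,
			      uint64_t *packets, uint64_t *bytes)
{
	unsigned int start;

	do {
		start = s->seq;
		*packets = s->rx_packets;
		*bytes = s->rx_bytes;
	} while ((start & 1) || s->seq != start);
}
```

In this model an uninitialized seq could start out odd and make the
reader spin; in the kernel, as the cover letter notes, the visible
symptom of a missing initialization is a lockdep splat.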


[PATCH net-next v3 1/3] netvsc: transparent VF management

2017-08-01 Thread Stephen Hemminger
This patch implements transparent fail over from synthetic NIC to
SR-IOV virtual function NIC in Hyper-V environment. It is a better
alternative to using bonding as is done now. Instead, the receive and
transmit fail over is done internally inside the driver.

Using bonding driver has lots of issues because it depends on the
script being run early enough in the boot process and with sufficient
information to make the association. This patch moves all that
functionality into the kernel.

Signed-off-by: Stephen Hemminger 
---
v3 - fix merge conflict (due to comment change) 

 drivers/net/hyperv/hyperv_net.h |  12 ++
 drivers/net/hyperv/netvsc_drv.c | 419 +++-
 2 files changed, 342 insertions(+), 89 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index f2cef5aaed1f..c701b059c5ac 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -680,6 +680,15 @@ struct netvsc_ethtool_stats {
unsigned long tx_busy;
 };
 
+struct netvsc_vf_pcpu_stats {
+   u64 rx_packets;
+   u64 rx_bytes;
+   u64 tx_packets;
+   u64 tx_bytes;
+   struct u64_stats_sync   syncp;
+   u32 tx_dropped;
+};
+
 struct netvsc_reconfig {
struct list_head list;
u32 event;
@@ -713,6 +722,9 @@ struct net_device_context {
 
/* State to manage the associated VF interface. */
struct net_device __rcu *vf_netdev;
+   struct netvsc_vf_pcpu_stats __percpu *vf_stats;
+   struct work_struct vf_takeover;
+   struct work_struct vf_notify;
 
/* 1: allocated, serial number is valid. 0: not allocated */
u32 vf_alloc;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 9453eef6d09f..c71728d82049 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -71,6 +72,7 @@ static void netvsc_set_multicast_list(struct net_device *net)
 static int netvsc_open(struct net_device *net)
 {
struct net_device_context *ndev_ctx = netdev_priv(net);
+   struct net_device *vf_netdev = rtnl_dereference(ndev_ctx->vf_netdev);
struct netvsc_device *nvdev = rtnl_dereference(ndev_ctx->nvdev);
struct rndis_device *rdev;
int ret = 0;
@@ -87,15 +89,29 @@ static int netvsc_open(struct net_device *net)
netif_tx_wake_all_queues(net);
 
rdev = nvdev->extension;
-   if (!rdev->link_state && !ndev_ctx->datapath)
+
+   if (!rdev->link_state)
netif_carrier_on(net);
 
-   return ret;
+   if (vf_netdev) {
+   /* Setting synthetic device up transparently sets
+* slave as up. If open fails, then slave will be
+* still offline (and not used).
+*/
+   ret = dev_open(vf_netdev);
+   if (ret)
+   netdev_warn(net,
+   "unable to open slave: %s: %d\n",
+   vf_netdev->name, ret);
+   }
+   return 0;
 }
 
 static int netvsc_close(struct net_device *net)
 {
struct net_device_context *net_device_ctx = netdev_priv(net);
+   struct net_device *vf_netdev
+   = rtnl_dereference(net_device_ctx->vf_netdev);
struct netvsc_device *nvdev = rtnl_dereference(net_device_ctx->nvdev);
int ret;
u32 aread, i, msec = 10, retry = 0, retry_max = 20;
@@ -141,6 +157,9 @@ static int netvsc_close(struct net_device *net)
ret = -ETIMEDOUT;
}
 
+   if (vf_netdev)
+   dev_close(vf_netdev);
+
return ret;
 }
 
@@ -224,13 +243,11 @@ static inline int netvsc_get_tx_queue(struct net_device *ndev,
  *
  * TODO support XPS - but get_xps_queue not exported
  */
-static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
-   void *accel_priv, select_queue_fallback_t fallback)
+static u16 netvsc_pick_tx(struct net_device *ndev, struct sk_buff *skb)
 {
-   unsigned int num_tx_queues = ndev->real_num_tx_queues;
int q_idx = sk_tx_queue_get(skb->sk);
 
-   if (q_idx < 0 || skb->ooo_okay) {
+   if (q_idx < 0 || skb->ooo_okay || q_idx >= ndev->real_num_tx_queues) {
/* If forwarding a packet, we use the recorded queue when
 * available for better cache locality.
 */
@@ -240,12 +257,33 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
q_idx = netvsc_get_tx_queue(ndev, skb, q_idx);
}
 
-   while (unlikely(q_idx >= num_tx_queues))
-   q_idx -= num_tx_queues;
-
return q_idx;
 }
 
+static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
+  void *accel_priv,
+  

[PATCH net-next v3 2/3] netvsc: add documentation

2017-08-01 Thread Stephen Hemminger
Add some background documentation on netvsc device options
and limitations.

Signed-off-by: Stephen Hemminger 
---
 Documentation/networking/netvsc.txt | 63 +
 MAINTAINERS |  1 +
 2 files changed, 64 insertions(+)
 create mode 100644 Documentation/networking/netvsc.txt

diff --git a/Documentation/networking/netvsc.txt b/Documentation/networking/netvsc.txt
new file mode 100644
index ..4ddb4e4b0426
--- /dev/null
+++ b/Documentation/networking/netvsc.txt
@@ -0,0 +1,63 @@
+Hyper-V network driver
+======================
+
+Compatibility
+=============
+
+This driver is compatible with Windows Server 2012 R2, 2016 and
+Windows 10.
+
+Features
+========
+
+  Checksum offload
+  ----------------
+  The netvsc driver supports checksum offload as long as the
+  Hyper-V host version does. Windows Server 2016 and Azure
+  support checksum offload for TCP and UDP for both IPv4 and
+  IPv6. Windows Server 2012 only supports checksum offload for TCP.
+
+  Receive Side Scaling
+  --------------------
+  Hyper-V supports receive side scaling. For TCP, packets are
+  distributed among available queues based on IP address and port
+  number. Current versions of the Hyper-V host distribute UDP
+  packets based only on the IP source and destination addresses.
+  The port number is not used as part of the hash value for UDP.
+  Fragmented IP packets are not distributed between queues;
+  all fragmented packets arrive on the first channel.
+
+  Generic Receive Offload, aka GRO
+  --------------------------------
+  The driver supports GRO and it is enabled by default. GRO coalesces
+  like packets and significantly reduces CPU usage under heavy Rx
+  load.
+
+  SR-IOV support
+  --------------
+  Hyper-V supports SR-IOV as a hardware acceleration option. If SR-IOV
+  is enabled in both the vSwitch and the guest configuration, then the
+  Virtual Function (VF) device is passed to the guest as a PCI
+  device. In this case, both a synthetic (netvsc) and VF device are
+  visible in the guest OS and both NICs have the same MAC address.
+
+  The VF is enslaved by the netvsc device.  The netvsc driver will transparently
+  switch the data path to the VF when it is available and up.
+  Network state (addresses, firewall, etc) should be applied only to the
+  netvsc device; the slave device should not be accessed directly in
+  most cases.  The exception is when some special queue discipline or
+  flow direction is desired; these should be applied directly to the
+  VF slave device.
+
+  Receive Buffer
+  --------------
+  Packets are received into a receive area which is created when the device
+  is probed. The receive area is broken into MTU sized chunks and each may
+  contain one or more packets. The number of receive sections may be changed
+  via ethtool Rx ring parameters.
+
+  There is a similar send buffer which is used to aggregate packets for sending.
+  The send area is broken into chunks of 6144 bytes; each section may
+  contain one or more packets. The send buffer is an optimization; the driver
+  will use a slower method to handle very large packets or if the send buffer
+  area is exhausted.
diff --git a/MAINTAINERS b/MAINTAINERS
index 207e45310620..448f2f67802f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6258,6 +6258,7 @@ M:Haiyang Zhang 
 M: Stephen Hemminger 
 L: de...@linuxdriverproject.org
 S: Maintained
+F: Documentation/networking/netvsc.txt
 F: arch/x86/include/asm/mshyperv.h
 F: arch/x86/include/uapi/asm/hyperv.h
 F: arch/x86/kernel/cpu/mshyperv.c
-- 
2.11.0



[PATCH net-next v3 3/3] netvsc: remove bonding setup script

2017-08-01 Thread Stephen Hemminger
No longer needed, now all managed by transparent VF logic.

Signed-off-by: Stephen Hemminger 
---
 tools/hv/bondvf.sh | 255 -
 1 file changed, 255 deletions(-)
 delete mode 100755 tools/hv/bondvf.sh

diff --git a/tools/hv/bondvf.sh b/tools/hv/bondvf.sh
deleted file mode 100755
index 80f102860cf8..
--- a/tools/hv/bondvf.sh
+++ /dev/null
@@ -1,255 +0,0 @@
-#!/bin/bash
-
-# This example script creates bonding network devices based on synthetic NIC
-# (the virtual network adapter usually provided by Hyper-V) and the matching
-# VF NIC (SRIOV virtual function). So the synthetic NIC and VF NIC can
-# function as one network device, and fail over to the synthetic NIC if VF is
-# down.
-#
-# Usage:
-# - After configured vSwitch and vNIC with SRIOV, start Linux virtual
-#   machine (VM)
-# - Run this scripts on the VM. It will create configuration files in
-#   distro specific directory.
-# - Reboot the VM, so that the bonding config are enabled.
-#
-# The config files are DHCP by default. You may edit them if you need to change
-# to Static IP or change other settings.
-#
-
-sysdir=/sys/class/net
-netvsc_cls={f8615163-df3e-46c5-913f-f2d2f965ed0e}
-bondcnt=0
-
-# Detect Distro
-if [ -f /etc/redhat-release ];
-then
-   cfgdir=/etc/sysconfig/network-scripts
-   distro=redhat
-elif grep -q 'Ubuntu' /etc/issue
-then
-   cfgdir=/etc/network
-   distro=ubuntu
-elif grep -q 'SUSE' /etc/issue
-then
-   cfgdir=/etc/sysconfig/network
-   distro=suse
-else
-   echo "Unsupported Distro"
-   exit 1
-fi
-
-echo Detected Distro: $distro, or compatible
-
-# Get a list of ethernet names
-list_eth=(`cd $sysdir && ls -d */ | cut -d/ -f1 | grep -v bond`)
-eth_cnt=${#list_eth[@]}
-
-echo List of net devices:
-
-# Get the MAC addresses
-for (( i=0; i < $eth_cnt; i++ ))
-do
-   list_mac[$i]=`cat $sysdir/${list_eth[$i]}/address`
-   echo ${list_eth[$i]}, ${list_mac[$i]}
-done
-
-# Find NIC with matching MAC
-for (( i=0; i < $eth_cnt-1; i++ ))
-do
-   for (( j=i+1; j < $eth_cnt; j++ ))
-   do
-   if [ "${list_mac[$i]}" = "${list_mac[$j]}" ]
-   then
-   list_match[$i]=${list_eth[$j]}
-   break
-   fi
-   done
-done
-
-function create_eth_cfg_redhat {
-   local fn=$cfgdir/ifcfg-$1
-
-   rm -f $fn
-   echo DEVICE=$1 >>$fn
-   echo TYPE=Ethernet >>$fn
-   echo BOOTPROTO=none >>$fn
-   echo UUID=`uuidgen` >>$fn
-   echo ONBOOT=yes >>$fn
-   echo PEERDNS=yes >>$fn
-   echo IPV6INIT=yes >>$fn
-   echo MASTER=$2 >>$fn
-   echo SLAVE=yes >>$fn
-}
-
-function create_eth_cfg_pri_redhat {
-   create_eth_cfg_redhat $1 $2
-}
-
-function create_bond_cfg_redhat {
-   local fn=$cfgdir/ifcfg-$1
-
-   rm -f $fn
-   echo DEVICE=$1 >>$fn
-   echo TYPE=Bond >>$fn
-   echo BOOTPROTO=dhcp >>$fn
-   echo UUID=`uuidgen` >>$fn
-   echo ONBOOT=yes >>$fn
-   echo PEERDNS=yes >>$fn
-   echo IPV6INIT=yes >>$fn
-   echo BONDING_MASTER=yes >>$fn
-   echo BONDING_OPTS=\"mode=active-backup miimon=100 primary=$2\" >>$fn
-}
-
-function del_eth_cfg_ubuntu {
-   local mainfn=$cfgdir/interfaces
-   local fnlist=( $mainfn )
-
-   local dirlist=(`awk '/^[ \t]*source/{print $2}' $mainfn`)
-
-   local i
-   for i in "${dirlist[@]}"
-   do
-   fnlist+=(`ls $i 2>/dev/null`)
-   done
-
-   local tmpfl=$(mktemp)
-
-   local nic_start='^[ \t]*(auto|iface|mapping|allow-.*)[ \t]+'$1
-   local nic_end='^[ \t]*(auto|iface|mapping|allow-.*|source)'
-
-   local fn
-   for fn in "${fnlist[@]}"
-   do
-   awk "/$nic_end/{x=0} x{next} /$nic_start/{x=1;next} 1" \
-   $fn >$tmpfl
-
-   cp $tmpfl $fn
-   done
-
-   rm $tmpfl
-}
-
-function create_eth_cfg_ubuntu {
-   local fn=$cfgdir/interfaces
-
-   del_eth_cfg_ubuntu $1
-   echo $'\n'auto $1 >>$fn
-   echo iface $1 inet manual >>$fn
-   echo bond-master $2 >>$fn
-}
-
-function create_eth_cfg_pri_ubuntu {
-   local fn=$cfgdir/interfaces
-
-   del_eth_cfg_ubuntu $1
-   echo $'\n'allow-hotplug $1 >>$fn
-   echo iface $1 inet manual >>$fn
-   echo bond-master $2 >>$fn
-   echo bond-primary $1 >>$fn
-}
-
-function create_bond_cfg_ubuntu {
-   local fn=$cfgdir/interfaces
-
-   del_eth_cfg_ubuntu $1
-
-   echo $'\n'auto $1 >>$fn
-   echo iface $1 inet dhcp >>$fn
-   echo bond-mode active-backup >>$fn
-   echo bond-miimon 100 >>$fn
-   echo bond-slaves none >>$fn
-}
-
-function create_eth_cfg_suse {
-local fn=$cfgdir/ifcfg-$1
-
-rm -f $fn
-   echo BOOTPROTO=none >>$fn
-   echo STARTMODE=auto >>$fn
-}
-
-function create_eth_cfg_pri_suse {
-   local fn=$cfgdir/ifcfg-$1
-
-   rm -f $fn
-   echo BOOTPROTO=none >>$fn
-   echo 

[PATCH net-next v3 0/3] netvsc: transparent VF support

2017-08-01 Thread Stephen Hemminger
This patch set changes how SR-IOV Virtual Function devices are managed
in the Hyper-V network driver. This version is rebased onto current net-next.

Background

In Hyper-V SR-IOV can be enabled (and disabled) by changing guest settings
on host. When SR-IOV is enabled a matching PCI device is hot plugged and
visible on guest. The VF device is an add-on to an existing netvsc
device, and has the same MAC address.

How is this different?

The original support of VF relied on using bonding driver in active
standby mode to handle the VF device.

With the new netvsc VF logic, the Linux Hyper-V network
virtual driver will directly manage the link to the SR-IOV VF device.
When a VF device is detected (hot plug) it is automatically made a
slave device of the netvsc device. The VF device state reflects
the state of the netvsc device; i.e. if netvsc is set down, then
VF is set down. If netvsc is set up, then VF is brought up.
 
Packet flow is independent of VF status; all packets are sent and
received as if they were associated with the netvsc device. If VF is
removed or link is down then the synthetic VMBUS path is used.
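
The fallback rule described above can be sketched as a small user-space
model. This is only an illustration, not the driver code: the demo_*
names and the two boolean fields are hypothetical, and the real driver
tracks VF state through its own structures.

```c
#include <stdbool.h>

/* Hypothetical model of the two data paths named in the cover letter. */
enum demo_path {
	DEMO_PATH_SYNTHETIC,	/* VMBUS path through the netvsc device */
	DEMO_PATH_VF,		/* SR-IOV virtual function */
};

/* Hypothetical state; the real driver does not use these field names. */
struct demo_netvsc {
	bool vf_present;	/* VF has been hot plugged and enslaved */
	bool vf_link_up;	/* VF reports link up */
};

/* Use the VF only when it exists and its link is up, else fall back. */
static enum demo_path demo_select_path(const struct demo_netvsc *nv)
{
	if (nv->vf_present && nv->vf_link_up)
		return DEMO_PATH_VF;
	return DEMO_PATH_SYNTHETIC;
}
```

In this model the caller never sees which path was chosen, which matches
the statement that packets always appear to belong to the netvsc device.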
 
What was wrong with using bonding script?

A lot of work went into getting the bonding script to work on all
distributions, but it was a major struggle. Linux network devices
can be configured many, many ways and there is no one solution from
userspace to make it all work. What is really hard is when
configuration is attached to the synthetic device during boot (eth0) and
then the same addresses and firewall rules need to also work later when
doing bonding. The new code gets around all of this.
 
How does VF work during initialization?

Since all packets are sent and received through the logical netvsc
device, initialization is much easier. Just configure the regular
netvsc Ethernet device; when/if SR-IOV is enabled it just
works. Provisioning and cloud init only need to worry about setting up
netvsc device (eth0). If SR-IOV is enabled (even as a later step), the
address and rules stay the same.
 
What devices show up?

Both netvsc and PCI devices are visible in the system. The netvsc
device is active and named in the usual manner (eth0). The PCI device is
visible to Linux and gets renamed by udev to a persistent name
(enP2p3s0). The PCI device name is now irrelevant.

The logic also sets the PCI VF device SLAVE flag on the network
device so network tools can see the relationship if they are smart
enough to understand how layered devices work.
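
As a rough illustration of how a tool could notice the SLAVE flag, the
sketch below parses the hexadecimal text that sysfs exposes in
/sys/class/net/<dev>/flags and tests the slave bit. The demo_* names are
hypothetical; the bit value 0x800 is the value of IFF_SLAVE in the Linux
interface flag definitions.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Same value as IFF_SLAVE; redefined here so the sketch is standalone. */
#define DEMO_IFF_SLAVE 0x800UL

/*
 * flags_text is the content of /sys/class/net/<dev>/flags, for example
 * "0x1803\n". strtoul() with base 16 accepts the leading "0x" and stops
 * at the trailing newline.
 */
static bool demo_is_slave(const char *flags_text)
{
	unsigned long flags = strtoul(flags_text, NULL, 16);

	return (flags & DEMO_IFF_SLAVE) != 0;
}
```

A tool that finds this bit set on the VF could then skip it when
applying addresses or firewall rules and configure only the netvsc
device.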
 
This is a lot like how I see Windows working.
The VF device is visible in Device Manager, but is not configured.
 
Is there any performance impact?
There is no visible change in performance. The bonding
and netvsc driver both have equivalent steps.
 
Is it compatible with old bonding script?

It turns out that if you use the old bonding script, then everything
still works but in a suboptimal manner. What happens is that bonding
is unable to steal the VF from the netvsc device, so it creates a
one-legged bond.  Packet flow then is:
bond0 <--> eth0 <--> VF (enP2p3s0).
In other words, if you get it wrong it still works, just
awkwardly and more slowly.
 
What if I add address or firewall rule onto the VF?

The same problems occur now as already occur with bonding, bridging,
and teaming on Linux if the user incorrectly applies configuration to
an underlying slave device. It will sort of work: packets will come in
and out, but the Linux kernel gets confused and things like ARP don’t
work right.  There is no way to block manipulation of the slave
device, and I am sure someone will find some special use case where
they want it.

Stephen Hemminger (3):
  netvsc: transparent VF management
  netvsc: add documentation
  netvsc: remove bonding setup script

 Documentation/networking/netvsc.txt |  63 ++
 MAINTAINERS |   1 +
 drivers/net/hyperv/hyperv_net.h |  12 ++
 drivers/net/hyperv/netvsc_drv.c | 419 
 tools/hv/bondvf.sh  | 255 --
 5 files changed, 406 insertions(+), 344 deletions(-)
 create mode 100644 Documentation/networking/netvsc.txt
 delete mode 100755 tools/hv/bondvf.sh

-- 
2.11.0



Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 07:56:31PM -0500, Timur Tabi wrote:
> On 8/1/17 6:15 PM, Andrew Lunn wrote:
> >Pause frames are something you can auto-negotiate at the PHY
> >level. Should you also be clearing some bits in the phydev, so the
> >peer knows pause frames are not supported?
> 
> When pause frame autonegotiation is enabled in the driver, that only means
> that the driver looks at what the PHY has autonegotiated, and then
> configures the MAC to match that.
> 
> The driver doesn't touch the PHY at all.  It leaves all that to phylib.
> 
> Now if autonegotiation is disabled in the driver, then it just hard-codes
> those TX/RX settings in the driver.  Are you saying I should program the PHY
> at the point to disable autonegotiation on the PHY level?  If so, then I
> don't know how to do that.  I just assumed that the MAC never tells the PHY
> what to do.

Documentation/networking/phy.txt says:

Pause frames / flow control

 The PHY does not participate directly in flow control/pause frames except by
 making sure that the SUPPORTED_Pause and SUPPORTED_AsymPause bits are set in
 MII_ADVERTISE to indicate towards the link partner that the Ethernet MAC
 controller supports such a thing. Since flow control/pause frames generation
 involves the Ethernet MAC driver, it is recommended that this driver takes care
 of properly indicating advertisement and support for such features by setting
 the SUPPORTED_Pause and SUPPORTED_AsymPause bits accordingly. This can be done
 either before or after phy_connect() and/or as a result of implementing the
 ethtool::set_pauseparam feature.

So just check if the MAC driver is setting SUPPORTED_Pause and
SUPPORTED_AsymPause.

Andrew
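
For reference, here is a minimal sketch of how a MAC driver's desired
RX/TX pause settings map onto the two pause bits of the MII
advertisement register. The demo_* names are hypothetical; the bit
values match ADVERTISE_PAUSE_CAP and ADVERTISE_PAUSE_ASYM in the kernel's
mii.h, and the mapping is the usual symmetric/asymmetric pause encoding.
How the emac driver should hook this into phylib is not shown and is
left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as ADVERTISE_PAUSE_CAP / ADVERTISE_PAUSE_ASYM in mii.h. */
#define DEMO_ADVERTISE_PAUSE_CAP  0x0400
#define DEMO_ADVERTISE_PAUSE_ASYM 0x0800

/*
 * rx only  -> Pause | AsymPause
 * tx only  -> AsymPause
 * rx + tx  -> Pause
 * neither  -> no pause bits advertised
 */
static uint16_t demo_pause_advert(bool rx_pause, bool tx_pause)
{
	uint16_t adv = 0;

	if (rx_pause)
		adv = DEMO_ADVERTISE_PAUSE_CAP | DEMO_ADVERTISE_PAUSE_ASYM;
	if (tx_pause)
		adv ^= DEMO_ADVERTISE_PAUSE_ASYM;
	return adv;
}
```

With both directions disabled the result is 0, so the link partner is
told that pause frames are not supported, which is the case raised in
the question above.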


[PATCH net-next 4/4] ulp: Documention for ULP infrastructure

2017-08-01 Thread Tom Herbert
Add a doc in Documentation/networking

Signed-off-by: Tom Herbert 
---
 Documentation/networking/ulp.txt | 82 
 1 file changed, 82 insertions(+)
 create mode 100644 Documentation/networking/ulp.txt

diff --git a/Documentation/networking/ulp.txt b/Documentation/networking/ulp.txt
new file mode 100644
index ..4d830314b0ff
--- /dev/null
+++ b/Documentation/networking/ulp.txt
@@ -0,0 +1,82 @@
+Upper Layer Protocol (ULP) Infrastructure
+=========================================
+
+The ULP kernel infrastructure provides a means to hook upper layer
+protocol support on a socket. A module may register a ULP hook
+in the kernel. ULP processing is enabled by a setsockopt on a socket
+that specifies the name of the registered ULP to be invoked. An
+initialization function is defined for each ULP that can change the
+function entry points of the socket (sendmsg, recvmsg, etc.) or change
+the socket in other fundamental ways.
+
+Note, no synchronization is enforced between the setsockopt to enable
+a ULP and ongoing asynchronous operations on the socket (such as a
+blocked read). If synchronization is required this must be handled by
+the ULP and caller.
+
+User interface
+==============
+
+The structure for the socket SO_ULP option is defined in socket.h.
+
+Example to enable "my_ulp" ULP on a socket:
+
+struct ulp_config ulpc = {
+.ulp_name = "my_ulp",
+};
+
+setsockopt(sock, SOL_SOCKET, SO_ULP, &ulpc, sizeof(ulpc))
+
+The ulp_config includes a "__u8 ulp_params[0]" field that may be used
+to refer to ULP specific parameters being set.
+
+Kernel interface
+================
+
+The interface for ULP infrastructure is defined in net/ulp_sock.h.
+
+ULP registration functions
+--------------------------
+
+int ulp_register(struct ulp_ops *type)
+
+ Called to register a ULP. The ulp_ops structure is described below.
+
+void ulp_unregister(struct ulp_ops *type);
+
+ Called to unregister a ULP.
+
+ulp_ops structure
+-----------------
+
+int (*init)(struct sock *sk, char __user *optval, int len)
+
+ Initialization function for the ULP. This is called from setsockopt
+ when the ULP name in the ulp_config argument matches the registered
+ ULP. optval is a userspace pointer to the ULP specific parameters.
+ len is the length of the ULP specific parameters.
+
+void (*release)(struct sock *sk)
+
+ Called when socket is being destroyed. The ULP implementation
+ should cancel any asynchronous operations (such as timers) and
+ release any acquired resources.
+
+int (*get_params)(struct sock *sk, char __user *optval, int *optlen)
+
+ Get the ULP specific parameters previously set in the init function
+ for the ULP. Note that optlen is a pointer to kernel memory.
+
+char name[ULP_NAME_MAX]
+
+ Name of the ULP. Must be NUL terminated.
+
+struct module *owner
+
+ Corresponding owner for ref count.
+
+Author
+======
+
+Tom Herbert (t...@quantonium.net)
+
-- 
2.11.0
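
The register, unregister and lookup-by-name flow described in ulp.txt
can be modelled in user space roughly as follows. This is a sketch under
assumptions, not the code in net/core/ulp_sock.c: all demo_* names, the
fixed-size table, the return values and the 16-byte name limit are
hypothetical (the value of ULP_NAME_MAX is not shown in this thread),
and locking and module reference counting are left out.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ULP_NAME_MAX 16	/* assumed; real ULP_NAME_MAX not shown */
#define DEMO_ULP_SLOTS    8	/* arbitrary table size for the sketch */

/* Hypothetical stand-in for struct ulp_ops; sk is opaque here. */
struct demo_ulp_ops {
	int  (*init)(void *sk);
	void (*release)(void *sk);
	char name[DEMO_ULP_NAME_MAX];
};

static struct demo_ulp_ops *demo_ulp_table[DEMO_ULP_SLOTS];

/* Look up a registered ULP by its NUL terminated name. */
static struct demo_ulp_ops *demo_ulp_find(const char *name)
{
	for (size_t i = 0; i < DEMO_ULP_SLOTS; i++)
		if (demo_ulp_table[i] &&
		    strcmp(demo_ulp_table[i]->name, name) == 0)
			return demo_ulp_table[i];
	return NULL;
}

/* Returns 0 on success, -1 if the name is taken or the table is full. */
static int demo_ulp_register(struct demo_ulp_ops *ops)
{
	if (demo_ulp_find(ops->name))
		return -1;
	for (size_t i = 0; i < DEMO_ULP_SLOTS; i++) {
		if (!demo_ulp_table[i]) {
			demo_ulp_table[i] = ops;
			return 0;
		}
	}
	return -1;
}

/* Remove a previously registered ULP; unknown pointers are ignored. */
static void demo_ulp_unregister(struct demo_ulp_ops *ops)
{
	for (size_t i = 0; i < DEMO_ULP_SLOTS; i++)
		if (demo_ulp_table[i] == ops)
			demo_ulp_table[i] = NULL;
}
```

In this model, a setsockopt handler would call demo_ulp_find() with the
name from the user's configuration and then invoke the init callback of
the entry it found.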



[PATCH net-next 3/4] tcp: Adjust TCP ULP to defer to sockets ULP

2017-08-01 Thread Tom Herbert
Fix TCP and TLS to use the newer ULP infrastructure in sockets.

Signed-off-by: Tom Herbert 
---
 Documentation/networking/tls.txt   |   6 +-
 include/net/inet_connection_sock.h |   4 --
 include/net/tcp.h  |  25 ---
 include/net/tls.h  |   4 +-
 net/ipv4/sysctl_net_ipv4.c |   9 ++-
 net/ipv4/tcp.c |  40 ++-
 net/ipv4/tcp_ipv4.c|   2 -
 net/ipv4/tcp_ulp.c | 135 -
 net/tls/Kconfig|   1 +
 net/tls/tls_main.c |  21 +++---
 10 files changed, 47 insertions(+), 200 deletions(-)
 delete mode 100644 net/ipv4/tcp_ulp.c

diff --git a/Documentation/networking/tls.txt b/Documentation/networking/tls.txt
index 77ed00631c12..b70309df4709 100644
--- a/Documentation/networking/tls.txt
+++ b/Documentation/networking/tls.txt
@@ -12,8 +12,12 @@ Creating a TLS connection
 
 First create a new TCP socket and set the TLS ULP.
 
+struct ulp_config ulpc = {
+   .ulp_name = "tls",
+};
+
   sock = socket(AF_INET, SOCK_STREAM, 0);
-  setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
+  setsockopt(sock, SOL_SOCKET, SO_ULP, &ulpc, sizeof(ulpc))
 
 Setting the TLS ULP allows us to set/get TLS socket options. Currently
 only the symmetric encryption is handled in the kernel.  After the TLS
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 13e4c89a8231..c7a577976bec 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -75,8 +75,6 @@ struct inet_connection_sock_af_ops {
  * @icsk_pmtu_cookie  Last pmtu seen by socket
  * @icsk_ca_ops   Pluggable congestion control hook
  * @icsk_af_ops   Operations which are AF_INET{4,6} specific
- * @icsk_ulp_ops  Pluggable ULP control hook
- * @icsk_ulp_data ULP private data
  * @icsk_ca_state:Congestion control state
  * @icsk_retransmits: Number of unrecovered [RTO] timeouts
  * @icsk_pending: Scheduled timer event
@@ -99,8 +97,6 @@ struct inet_connection_sock {
__u32 icsk_pmtu_cookie;
const struct tcp_congestion_ops *icsk_ca_ops;
const struct inet_connection_sock_af_ops *icsk_af_ops;
-   const struct tcp_ulp_ops  *icsk_ulp_ops;
-   void  *icsk_ulp_data;
unsigned int  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8  icsk_ca_state:6,
  icsk_ca_setsockopt:1,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index bb1881b4ce48..65c462da3740 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1968,31 +1968,6 @@ static inline void tcp_listendrop(const struct sock *sk)
 
 enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer);
 
-/*
- * Interface for adding Upper Level Protocols over TCP
- */
-
-#define TCP_ULP_NAME_MAX   16
-#define TCP_ULP_MAX128
-#define TCP_ULP_BUF_MAX(TCP_ULP_NAME_MAX*TCP_ULP_MAX)
-
-struct tcp_ulp_ops {
-   struct list_headlist;
-
-   /* initialize ulp */
-   int (*init)(struct sock *sk);
-   /* cleanup ulp */
-   void (*release)(struct sock *sk);
-
-   charname[TCP_ULP_NAME_MAX];
-   struct module   *owner;
-};
-int tcp_register_ulp(struct tcp_ulp_ops *type);
-void tcp_unregister_ulp(struct tcp_ulp_ops *type);
-int tcp_set_ulp(struct sock *sk, const char *name);
-void tcp_get_available_ulp(char *buf, size_t len);
-void tcp_cleanup_ulp(struct sock *sk);
-
 /* Call BPF_SOCK_OPS program that returns an int. If the return value
  * is < 0, then the BPF op failed (for example if the loaded BPF
  * program does not support the chosen operation or there is no BPF
diff --git a/include/net/tls.h b/include/net/tls.h
index b89d397dd62f..7d88a6e2f5a7 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -214,9 +214,7 @@ static inline void tls_fill_prepend(struct tls_context *ctx,
 
 static inline struct tls_context *tls_get_ctx(const struct sock *sk)
 {
-   struct inet_connection_sock *icsk = inet_csk(sk);
-
-   return icsk->icsk_ulp_data;
+   return sk->sk_ulp_data;
 }
 
 static inline struct tls_sw_context *tls_sw_ctx(
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0d3c038d7b04..9ab0c278b7ba 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -372,13 +373,15 @@ static int proc_tcp_available_ulp(struct ctl_table *ctl,
  void __user *buffer, size_t *lenp,
  loff_t *ppos)
 {
-   struct ctl_table tbl = { .maxlen = TCP_ULP_BUF_MAX, };
+   struct ctl_table tbl = { .maxlen = ULP_BUF_MAX, };
int ret;
 
tbl.data = kmalloc(tbl.maxlen, GFP_USER);
if 

[PATCH net-next 2/4] sock: ULP infrastructure

2017-08-01 Thread Tom Herbert
Generalize the TCP ULP infrastructure recently introduced to support
kTLS. This adds a SO_ULP socket option and creates new fields in
sock structure for ULP ops and ULP data. Also, the interface allows
additional per ULP parameters to be set so that a ULP can be pushed
and operations started in one shot.

Signed-off-by: Tom Herbert 
---
 arch/alpha/include/uapi/asm/socket.h   |   2 +
 arch/frv/include/uapi/asm/socket.h |   2 +
 arch/ia64/include/uapi/asm/socket.h|   2 +
 arch/m32r/include/uapi/asm/socket.h|   2 +
 arch/mips/include/uapi/asm/socket.h|   2 +
 arch/mn10300/include/uapi/asm/socket.h |   2 +
 arch/parisc/include/uapi/asm/socket.h  |   2 +
 arch/s390/include/uapi/asm/socket.h|   2 +
 arch/sparc/include/uapi/asm/socket.h   |   2 +
 arch/xtensa/include/uapi/asm/socket.h  |   2 +
 include/linux/socket.h |   9 ++
 include/net/sock.h |   5 +
 include/net/ulp_sock.h |  65 +++
 include/uapi/asm-generic/socket.h  |   2 +
 net/Kconfig|   4 +
 net/core/Makefile  |   1 +
 net/core/sock.c|  14 +++
 net/core/sysctl_net_core.c |  25 +
 net/core/ulp_sock.c| 194 +
 19 files changed, 339 insertions(+)
 create mode 100644 include/net/ulp_sock.h
 create mode 100644 net/core/ulp_sock.c

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 7b285dd4fe05..885e8fca79b0 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -109,4 +109,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/frv/include/uapi/asm/socket.h b/arch/frv/include/uapi/asm/socket.h
index f1e3b20dce9f..8ba71f2a3bf3 100644
--- a/arch/frv/include/uapi/asm/socket.h
+++ b/arch/frv/include/uapi/asm/socket.h
@@ -102,5 +102,7 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_SOCKET_H */
 
diff --git a/arch/ia64/include/uapi/asm/socket.h b/arch/ia64/include/uapi/asm/socket.h
index 5dd5c5d0d642..2de1c53f88b5 100644
--- a/arch/ia64/include/uapi/asm/socket.h
+++ b/arch/ia64/include/uapi/asm/socket.h
@@ -111,4 +111,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_IA64_SOCKET_H */
diff --git a/arch/m32r/include/uapi/asm/socket.h b/arch/m32r/include/uapi/asm/socket.h
index f8f7b47e247f..b2d394381787 100644
--- a/arch/m32r/include/uapi/asm/socket.h
+++ b/arch/m32r/include/uapi/asm/socket.h
@@ -102,4 +102,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_M32R_SOCKET_H */
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 882823bec153..d0bdf8c78220 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -120,4 +120,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/mn10300/include/uapi/asm/socket.h b/arch/mn10300/include/uapi/asm/socket.h
index c710db354ff2..686fbf497a13 100644
--- a/arch/mn10300/include/uapi/asm/socket.h
+++ b/arch/mn10300/include/uapi/asm/socket.h
@@ -102,4 +102,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index a0d4dc9f4eb2..d6e99deca976 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -101,4 +101,6 @@
 
 #define SO_PEERGROUPS  0x4034
 
+#define SO_ULP 0x4035
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/s390/include/uapi/asm/socket.h b/arch/s390/include/uapi/asm/socket.h
index 52a63f4175cb..6b52f162369a 100644
--- a/arch/s390/include/uapi/asm/socket.h
+++ b/arch/s390/include/uapi/asm/socket.h
@@ -108,4 +108,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 186fd8199f54..e765bf781107 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -98,6 +98,8 @@
 
 #define SO_PEERGROUPS  0x003d
 
+#define SO_ULP 0x003e
+
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define SO_SECURITY_AUTHENTICATION 0x5001
 #define SO_SECURITY_ENCRYPTION_TRANSPORT   0x5002
diff --git a/arch/xtensa/include/uapi/asm/socket.h b/arch/xtensa/include/uapi/asm/socket.h
index 3eed2761c149..8eaa2e9e27b6 100644
--- a/arch/xtensa/include/uapi/asm/socket.h
+++ b/arch/xtensa/include/uapi/asm/socket.h
@@ -113,4 +113,6 @@
 
 #define SO_PEERGROUPS  59
 
+#define SO_ULP 60
+

[PATCH net-next 1/4] inet: include net/sock.h in inet_common.h

2017-08-01 Thread Tom Herbert
inet_common.h has a dependency on sock.h so it should include that.

Signed-off-by: Tom Herbert 
---
 include/net/inet_common.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index f39ae697347f..df0119a317aa 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -1,6 +1,8 @@
 #ifndef _INET_COMMON_H
 #define _INET_COMMON_H
 
+#include <net/sock.h>
+
 extern const struct proto_ops inet_stream_ops;
 extern const struct proto_ops inet_dgram_ops;
 
-- 
2.11.0



[PATCH net-next 0/4] ulp: Generalize ULP infrastructure

2017-08-01 Thread Tom Herbert
Generalize the ULP infrastructure that was recently introduced to
support kTLS. This adds a SO_ULP socket option and creates new fields in
sock structure for ULP ops and ULP data. Also, the interface allows
additional per ULP parameters to be set so that a ULP can be pushed
and operations started in one shot.
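
To illustrate the "name plus per-ULP parameters in one shot" idea, the
sketch below builds a single buffer holding a ULP name followed by
opaque parameter bytes. It is only a user-space model: struct
demo_ulp_config, DEMO_ULP_NAME_MAX and demo_ulp_config_new() are
hypothetical, and the real struct ulp_config layout from the patch is
not reproduced in this thread beyond its ulp_name and ulp_params
members.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ULP_NAME_MAX 16	/* assumed; real ULP_NAME_MAX not shown */

/* Hypothetical layout modelled on the ulp_name / ulp_params description. */
struct demo_ulp_config {
	char    ulp_name[DEMO_ULP_NAME_MAX];
	uint8_t ulp_params[];	/* flexible array member for ULP data */
};

/*
 * Allocate a config holding name and a copy of params. The total size
 * is stored in *len_out. Returns NULL if the name does not fit or the
 * allocation fails. The caller frees the result with free().
 */
static struct demo_ulp_config *demo_ulp_config_new(const char *name,
						   const void *params,
						   size_t params_len,
						   size_t *len_out)
{
	struct demo_ulp_config *cfg;
	size_t total = sizeof(*cfg) + params_len;

	if (strlen(name) >= DEMO_ULP_NAME_MAX)
		return NULL;

	cfg = calloc(1, total);
	if (!cfg)
		return NULL;

	strcpy(cfg->ulp_name, name);	/* length checked above */
	if (params_len)
		memcpy(cfg->ulp_params, params, params_len);
	*len_out = total;
	return cfg;
}
```

The buffer and its length would then be passed as the option value and
option length of the setsockopt() call. That call is not shown because
SO_ULP is only proposed by this series.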

This patch set:
  - Minor dependency fix in inet_common.h
  - Implement ULP infrastructure as a socket mechanism
  - Fixes TCP and TLS to use the new method (maintaining backwards
API compatibility)
  - Adds a ulp.txt document

Tested: Ran simple ULP.

Tom Herbert (4):
  inet: include net/sock.h in inet_common.h
  sock: ULP infrastructure
  tcp: Adjust TCP ULP to defer to sockets ULP
  ulp: Documention for ULP infrastructure

 Documentation/networking/tls.txt   |   6 +-
 Documentation/networking/ulp.txt   |  82 ++
 arch/alpha/include/uapi/asm/socket.h   |   2 +
 arch/frv/include/uapi/asm/socket.h |   2 +
 arch/ia64/include/uapi/asm/socket.h|   2 +
 arch/m32r/include/uapi/asm/socket.h|   2 +
 arch/mips/include/uapi/asm/socket.h|   2 +
 arch/mn10300/include/uapi/asm/socket.h |   2 +
 arch/parisc/include/uapi/asm/socket.h  |   2 +
 arch/s390/include/uapi/asm/socket.h|   2 +
 arch/sparc/include/uapi/asm/socket.h   |   2 +
 arch/xtensa/include/uapi/asm/socket.h  |   2 +
 include/linux/socket.h |   9 ++
 include/net/inet_common.h  |   2 +
 include/net/inet_connection_sock.h |   4 -
 include/net/sock.h |   5 +
 include/net/tcp.h  |  25 -
 include/net/tls.h  |   4 +-
 include/net/ulp_sock.h |  65 +++
 include/uapi/asm-generic/socket.h  |   2 +
 net/Kconfig|   4 +
 net/core/Makefile  |   1 +
 net/core/sock.c|  14 +++
 net/core/sysctl_net_core.c |  25 +
 net/core/ulp_sock.c| 194 +
 net/ipv4/sysctl_net_ipv4.c |   9 +-
 net/ipv4/tcp.c |  40 ---
 net/ipv4/tcp_ipv4.c|   2 -
 net/ipv4/tcp_ulp.c | 135 ---
 net/tls/Kconfig|   1 +
 net/tls/tls_main.c |  21 ++--
 31 files changed, 470 insertions(+), 200 deletions(-)
 create mode 100644 Documentation/networking/ulp.txt
 create mode 100644 include/net/ulp_sock.h
 create mode 100644 net/core/ulp_sock.c
 delete mode 100644 net/ipv4/tcp_ulp.c

-- 
2.11.0



[RFC 1/1] constify tcp congestion

2017-08-01 Thread Stephen Hemminger
Split the TCP congestion ops structure into const and mutable portions.
Put the list pointers, key and a copy of the flags in new tcp_congestion_entry
structure.

Signed-off-by: Stephen Hemminger 

---
 include/net/tcp.h|  10 ++-
 net/ipv4/tcp.c   |   2 -
 net/ipv4/tcp_bbr.c   |   2 +-
 net/ipv4/tcp_bic.c   |   2 +-
 net/ipv4/tcp_cdg.c   |   2 +-
 net/ipv4/tcp_cong.c  | 162 +--
 net/ipv4/tcp_cubic.c |   2 +-
 net/ipv4/tcp_dctcp.c |   6 +-
 net/ipv4/tcp_highspeed.c |   2 +-
 net/ipv4/tcp_htcp.c  |   2 +-
 net/ipv4/tcp_hybla.c |   2 +-
 net/ipv4/tcp_illinois.c  |   2 +-
 net/ipv4/tcp_lp.c|   2 +-
 net/ipv4/tcp_nv.c|   2 +-
 net/ipv4/tcp_scalable.c  |   2 +-
 net/ipv4/tcp_vegas.c |   2 +-
 net/ipv4/tcp_veno.c  |   2 +-
 net/ipv4/tcp_westwood.c  |   2 +-
 net/ipv4/tcp_yeah.c  |   2 +-
 19 files changed, 124 insertions(+), 86 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index bb1881b4ce48..725395a7c6d1 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -931,8 +931,6 @@ struct rate_sample {
 };
 
 struct tcp_congestion_ops {
-   struct list_headlist;
-   u32 key;
u32 flags;
 
/* initialize private data (optional) */
@@ -970,8 +968,8 @@ struct tcp_congestion_ops {
struct module   *owner;
 };
 
-int tcp_register_congestion_control(struct tcp_congestion_ops *type);
-void tcp_unregister_congestion_control(struct tcp_congestion_ops *type);
+int tcp_register_congestion_control(const struct tcp_congestion_ops *type);
+void tcp_unregister_congestion_control(const struct tcp_congestion_ops *type);
 
 void tcp_assign_congestion_control(struct sock *sk);
 void tcp_init_congestion_control(struct sock *sk);
@@ -990,9 +988,9 @@ void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked);
 u32 tcp_reno_ssthresh(struct sock *sk);
 u32 tcp_reno_undo_cwnd(struct sock *sk);
 void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked);
-extern struct tcp_congestion_ops tcp_reno;
+extern const struct tcp_congestion_ops tcp_reno;
 
-struct tcp_congestion_ops *tcp_ca_find_key(u32 key);
+const struct tcp_congestion_ops *tcp_ca_find_key(u32 key);
 u32 tcp_ca_get_key_by_name(const char *name, bool *ecn_ca);
 #ifdef CONFIG_INET
 char *tcp_ca_get_name_by_key(u32 key, char *buffer);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9dd6f4dba9b1..b26eeff90cff 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3344,8 +3344,6 @@ int tcp_abort(struct sock *sk, int err)
 }
 EXPORT_SYMBOL_GPL(tcp_abort);
 
-extern struct tcp_congestion_ops tcp_reno;
-
 static __initdata unsigned long thash_entries;
 static int __init set_thash_entries(char *str)
 {
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 69ee877574d0..353574dd1eb8 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -917,7 +917,7 @@ static void bbr_set_state(struct sock *sk, u8 new_state)
}
 }
 
-static struct tcp_congestion_ops tcp_bbr_cong_ops __read_mostly = {
+static const struct tcp_congestion_ops tcp_bbr_cong_ops = {
.flags  = TCP_CONG_NON_RESTRICTED,
.name   = "bbr",
.owner  = THIS_MODULE,
diff --git a/net/ipv4/tcp_bic.c b/net/ipv4/tcp_bic.c
index 609965f0e298..8a8a3a6bde1a 100644
--- a/net/ipv4/tcp_bic.c
+++ b/net/ipv4/tcp_bic.c
@@ -209,7 +209,7 @@ static void bictcp_acked(struct sock *sk, const struct ack_sample *sample)
}
 }
 
-static struct tcp_congestion_ops bictcp __read_mostly = {
+static const struct tcp_congestion_ops bictcp = {
.init   = bictcp_init,
.ssthresh   = bictcp_recalc_ssthresh,
.cong_avoid = bictcp_cong_avoid,
diff --git a/net/ipv4/tcp_cdg.c b/net/ipv4/tcp_cdg.c
index 50a0f3e51d5b..32e8631dd128 100644
--- a/net/ipv4/tcp_cdg.c
+++ b/net/ipv4/tcp_cdg.c
@@ -399,7 +399,7 @@ static void tcp_cdg_release(struct sock *sk)
kfree(ca->gradients);
 }
 
-struct tcp_congestion_ops tcp_cdg __read_mostly = {
+static const struct tcp_congestion_ops tcp_cdg = {
.cong_avoid = tcp_cdg_cong_avoid,
.cwnd_event = tcp_cdg_cwnd_event,
.pkts_acked = tcp_cdg_acked,
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index fde983f6376b..0d43eef045f4 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -19,13 +19,21 @@
 static DEFINE_SPINLOCK(tcp_cong_list_lock);
 static LIST_HEAD(tcp_cong_list);
 
+struct tcp_congestion_entry {
+   struct list_headlist;
+   u32 key;
+   u32 flags;
+   const struct tcp_congestion_ops *ops;
+   struct rcu_head rcu;
+};
+
 /* Simple linear search, don't expect many entries! */
-static struct tcp_congestion_ops *tcp_ca_find(const char *name)
+static struct tcp_congestion_entry *tcp_ca_find(const char *name)
 {
-   struct tcp_congestion_ops *e;
+   struct tcp_congestion_entry *e;
 
list_for_each_entry_rcu(e, &tcp_cong_list, list) {
-  

[RFC 0/1] tcp: constify congestion_ops

2017-08-01 Thread Stephen Hemminger
This is a proposed method of making TCP congestion_ops structure const.

I wonder if restricting congestion control choices is still necessary?
It seems like being overly paranoid, and better enforced by having a more
limited kernel config, seccomp or other mechanism.

Stephen Hemminger (1):
  constify tcp congestion

 include/net/tcp.h|  10 ++-
 net/ipv4/tcp.c   |   2 -
 net/ipv4/tcp_bbr.c   |   2 +-
 net/ipv4/tcp_bic.c   |   2 +-
 net/ipv4/tcp_cdg.c   |   2 +-
 net/ipv4/tcp_cong.c  | 162 +--
 net/ipv4/tcp_cubic.c |   2 +-
 net/ipv4/tcp_dctcp.c |   6 +-
 net/ipv4/tcp_highspeed.c |   2 +-
 net/ipv4/tcp_htcp.c  |   2 +-
 net/ipv4/tcp_hybla.c |   2 +-
 net/ipv4/tcp_illinois.c  |   2 +-
 net/ipv4/tcp_lp.c|   2 +-
 net/ipv4/tcp_nv.c|   2 +-
 net/ipv4/tcp_scalable.c  |   2 +-
 net/ipv4/tcp_vegas.c |   2 +-
 net/ipv4/tcp_veno.c  |   2 +-
 net/ipv4/tcp_westwood.c  |   2 +-
 net/ipv4/tcp_yeah.c  |   2 +-
 19 files changed, 124 insertions(+), 86 deletions(-)

-- 
2.11.0
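
The split proposed in this RFC can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions, not the code from the patch: `cong_ops`, `cong_entry`, `cong_register` and `cong_find` are hypothetical names, a singly linked list stands in for the kernel's RCU-protected list, and the FNV-1a hash is a placeholder for whatever key function the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Immutable description of an algorithm; can be placed in read-only data. */
struct cong_ops {
	uint32_t flags;
	const char *name;
};

/* Mutable bookkeeping owned by the registry, kept apart from the ops. */
struct cong_entry {
	struct cong_entry *next;
	uint32_t key;
	uint32_t flags;			/* private copy; may change at run time */
	const struct cong_ops *ops;
};

static struct cong_entry *cong_list;

/* Placeholder string hash (FNV-1a); hypothetical, chosen for brevity. */
static uint32_t cong_hash(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Simple linear search by name. */
static struct cong_entry *cong_find(const char *name)
{
	struct cong_entry *e;

	for (e = cong_list; e; e = e->next)
		if (strcmp(e->ops->name, name) == 0)
			return e;
	return NULL;
}

/* Register a const ops structure; returns 0 on success, -1 otherwise. */
static int cong_register(const struct cong_ops *ops)
{
	struct cong_entry *e;

	assert(ops && ops->name);
	if (cong_find(ops->name))
		return -1;		/* already registered */

	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->key = cong_hash(ops->name);
	e->flags = ops->flags;		/* copy, so the ops can stay const */
	e->ops = ops;
	e->next = cong_list;
	cong_list = e;
	return 0;
}

static const struct cong_ops demo_reno = { .flags = 1, .name = "reno" };
```

The point of the design is that everything the registry needs to modify lives in the entry, so each algorithm's ops table never has to be written after build time.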



Re: [PATCH net 6/7] netvsc: Initialize 64-bit stats seqcount

2017-08-01 Thread Stephen Hemminger
On Tue,  1 Aug 2017 12:11:12 -0700
Florian Fainelli  wrote:

> On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a
> lockdep splat indicating this seqcount is not correctly initialized, fix
> that. In commit 6c80f3fc2398 ("netvsc: report per-channel stats in
> ethtool statistics") netdev_alloc_pcpu_stats() was removed in favor of
> open-coding the 64-bits statistics, except that u64_stats_init() was
> missed.
> 
> Fixes: 6c80f3fc2398 ("netvsc: report per-channel stats in ethtool statistics")
> Signed-off-by: Florian Fainelli 
> ---
>  drivers/net/hyperv/netvsc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> index 0a9167dd72fb..96f90c75d1b7 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -1302,6 +1302,8 @@ int netvsc_device_add(struct hv_device *device,
>   struct netvsc_channel *nvchan = &net_device->chan_table[i];
>  
>   nvchan->channel = device->channel;
> + u64_stats_init(&nvchan->tx_stats.syncp);
> + u64_stats_init(&nvchan->rx_stats.syncp);
>   }
>  
>   /* Enable NAPI handler before init callbacks */


Looks good, thanks. 32 bit guests are still supported but rarely tested.

Signed-off-by: Stephen Hemminger 
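
For readers unfamiliar with why 64-bit statistics need a sequence counter on 32-bit hosts, here is a minimal user-space sketch of the idea. It is not the kernel's u64_stats_sync implementation: `stats64`, `stats64_init`, `stats64_add` and `stats64_read` are hypothetical names, a single writer is assumed, and the plain accesses to the two halves would need proper atomics or barriers in a real concurrent program.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct stats64 {
	atomic_uint seq;	/* even: stable, odd: update in progress */
	uint32_t lo, hi;	/* one 64-bit counter stored as two halves */
};

/* The counter must be initialized before first use. */
static void stats64_init(struct stats64 *s)
{
	assert(s);
	atomic_init(&s->seq, 0);
	s->lo = 0;
	s->hi = 0;
}

/* Writer: bump the sequence before and after touching the halves. */
static void stats64_add(struct stats64 *s, uint64_t n)
{
	uint64_t v = (((uint64_t)s->hi) << 32) | s->lo;

	v += n;
	atomic_fetch_add_explicit(&s->seq, 1, memory_order_acq_rel);
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
	atomic_fetch_add_explicit(&s->seq, 1, memory_order_release);
}

/* Reader: retry if an update was in progress or completed meanwhile. */
static uint64_t stats64_read(struct stats64 *s)
{
	unsigned int start;
	uint64_t v;

	do {
		start = atomic_load_explicit(&s->seq, memory_order_acquire);
		v = (((uint64_t)s->hi) << 32) | s->lo;
	} while ((start & 1) ||
		 start != atomic_load_explicit(&s->seq, memory_order_acquire));
	return v;
}
```

A reader that overlaps a writer sees the sequence change and retries, so it never returns a value built from halves of two different updates.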


Re: sysctl, argument parsing, possible bug

2017-08-01 Thread Stephen Hemminger
On Tue, 1 Aug 2017 14:27:37 -0700
Cong Wang  wrote:

> On Tue, Aug 1, 2017 at 1:47 PM, Massimo Sala wrote:
> > cat /proc/sys/net/ipv4/conf/eth0.100/forwarding
> > 0
> >
> > sysctl net.ipv4.conf.eth0.100.forwarding
> > error: "net.ipv4.conf.eth0.100.forwarding" is an unknown key
> >  
> 
> Use echo instead, sysctl doesn't understand eth0.100
> is a netdev name, sigh.

sysctl happily accepts '/' as a separator; see man sysctl(8):

PARAMETERS
   variable
  The name of a key to read from.  An example is kernel.ostype.
  The '/' separator is also accepted in place of a '.'.
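
To make the separator rule concrete, here is a small sketch of how a key could be mapped to a /proc/sys path. It is an illustration only, not the procps implementation: `sysctl_key_to_path` is a hypothetical helper, and it assumes that a key containing any '/' uses '/' as its separator, so dots in it (such as the one in eth0.100) are left alone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a /proc/sys path from a sysctl key.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int sysctl_key_to_path(const char *key, char *out, size_t outlen)
{
	static const char prefix[] = "/proc/sys/";
	int slash_mode;
	int n;
	char *p;

	assert(key && out);

	/* A '/' anywhere in the key means dots are part of a name. */
	slash_mode = strchr(key, '/') != NULL;

	n = snprintf(out, outlen, "%s%s", prefix, key);
	if (n < 0 || (size_t)n >= outlen)
		return -1;

	if (!slash_mode)
		for (p = out + strlen(prefix); *p; p++)
			if (*p == '.')
				*p = '/';
	return 0;
}
```

With this rule the dotted key from the report becomes .../eth0/100/forwarding, which does not exist, while net/ipv4/conf/eth0.100/forwarding maps to the intended file.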



Re: [PATCH 1/6] [net-next]net: sched: act_mirred: Extend redirect action to accept a traffic class

2017-08-01 Thread Nambiar, Amritha

On 8/1/2017 4:12 AM, Jiri Pirko wrote:
> Tue, Aug 01, 2017 at 02:37:37AM CEST, amritha.namb...@intel.com wrote:
>> The Mirred/redirect action is extended to forward to a traffic
>> class on the device. The traffic class index needs to be
>> provided in addition to the device's ifindex.
>>
>> Example:
>> # tc filter add dev eth0 protocol ip parent : prio 1 flower\
>>  dst_ip 192.168.1.1/32 ip_proto udp dst_port 22\
>>  skip_sw indev eth0 action mirred ingress redirect dev eth0 tc 1
> 
> You need to make sure that the current offloaders will forbid adding
> this rule, not just silently ignore the tc value.

I will fix this in the next version, probably using the 'flags' field
I've defined.

> 
> 
>>
>> Signed-off-by: Amritha Nambiar 
>> ---
>> include/net/tc_act/tc_mirred.h|7 +++
>> include/uapi/linux/tc_act/tc_mirred.h |5 +
>> net/sched/act_mirred.c|   17 +
>> 3 files changed, 29 insertions(+)
>>
>> diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
>> index 604bc31..60058c4 100644
>> --- a/include/net/tc_act/tc_mirred.h
>> +++ b/include/net/tc_act/tc_mirred.h
>> @@ -9,6 +9,8 @@ struct tcf_mirred {
>>  int tcfm_eaction;
>>  int tcfm_ifindex;
>>  booltcfm_mac_header_xmit;
>> +u8  tcfm_tc;
>> +u32 flags;
>>  struct net_device __rcu *tcfm_dev;
>>  struct list_headtcfm_list;
>> };
>> @@ -37,4 +39,9 @@ static inline int tcf_mirred_ifindex(const struct 
>> tc_action *a)
>>  return to_mirred(a)->tcfm_ifindex;
>> }
>>
>> +static inline int tcf_mirred_tc(const struct tc_action *a)
>> +{
>> +return to_mirred(a)->tcfm_tc;
>> +}
>> +
>> #endif /* __NET_TC_MIR_H */
>> diff --git a/include/uapi/linux/tc_act/tc_mirred.h 
>> b/include/uapi/linux/tc_act/tc_mirred.h
>> index 3d7a2b3..8ff4d76 100644
>> --- a/include/uapi/linux/tc_act/tc_mirred.h
>> +++ b/include/uapi/linux/tc_act/tc_mirred.h
>> @@ -9,6 +9,10 @@
>> #define TCA_EGRESS_MIRROR 2 /* mirror packet to EGRESS */
>> #define TCA_INGRESS_REDIR 3  /* packet redirect to INGRESS*/
>> #define TCA_INGRESS_MIRROR 4 /* mirror packet to INGRESS */
>> +
>> +#define MIRRED_F_TC_MAP 0x1
>> +#define MIRRED_TC_MAP_MAX   0x10
>> +#define MIRRED_TC_MAP_MASK  0xF
> 
> I'm completely lost. Why do you have these values here? and in fact one
> twice?

I'll fix this to remove the defines for the TC max range and its bitmap
here and reuse the existing ones defined in linux/netdevice.h

> 
> 
>>  
>>
>> struct tc_mirred {
>>  tc_gen;
>> @@ -21,6 +25,7 @@ enum {
>>  TCA_MIRRED_TM,
>>  TCA_MIRRED_PARMS,
>>  TCA_MIRRED_PAD,
>> +TCA_MIRRED_TC_MAP,
>>  __TCA_MIRRED_MAX
>> };
>> #define TCA_MIRRED_MAX (__TCA_MIRRED_MAX - 1)
>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>> index 1b5549a..f9801de 100644
>> --- a/net/sched/act_mirred.c
>> +++ b/net/sched/act_mirred.c
>> @@ -67,6 +67,7 @@ static void tcf_mirred_release(struct tc_action *a, int 
>> bind)
>>
>> static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
>>  [TCA_MIRRED_PARMS]  = { .len = sizeof(struct tc_mirred) },
>> +[TCA_MIRRED_TC_MAP] = { .type = NLA_U8 },
>> };
>>
>> static unsigned int mirred_net_id;
>> @@ -83,6 +84,8 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
>> *nla,
>>  struct tcf_mirred *m;
>>  struct net_device *dev;
>>  bool exists = false;
>> +u8 *tc_map = NULL;
>> +u32 flags = 0;
>>  int ret;
>>
>>  if (nla == NULL)
>> @@ -92,6 +95,14 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
>> *nla,
>>  return ret;
>>  if (tb[TCA_MIRRED_PARMS] == NULL)
>>  return -EINVAL;
>> +
>> +if (tb[TCA_MIRRED_TC_MAP]) {
>> +tc_map = nla_data(tb[TCA_MIRRED_TC_MAP]);
>> +if (*tc_map >= MIRRED_TC_MAP_MAX)
>> +return -EINVAL;
>> +flags |= MIRRED_F_TC_MAP;
>> +}
>> +
>>  parm = nla_data(tb[TCA_MIRRED_PARMS]);
>>
>>  exists = tcf_hash_check(tn, parm->index, a, bind);
>> @@ -139,6 +150,7 @@ static int tcf_mirred_init(struct net *net, struct 
>> nlattr *nla,
>>  ASSERT_RTNL();
>>  m->tcf_action = parm->action;
>>  m->tcfm_eaction = parm->eaction;
>> +m->flags = flags;
>>  if (dev != NULL) {
>>  m->tcfm_ifindex = parm->ifindex;
>>  if (ret != ACT_P_CREATED)
>> @@ -146,6 +158,8 @@ static int tcf_mirred_init(struct net *net, struct 
>> nlattr *nla,
>>  dev_hold(dev);
>>  rcu_assign_pointer(m->tcfm_dev, dev);
>>  m->tcfm_mac_header_xmit = mac_header_xmit;
>> +if (flags & MIRRED_F_TC_MAP)
>> +m->tcfm_tc = *tc_map & MIRRED_TC_MAP_MASK;
>>  }
>>
>>  if (ret == 

Re: [PATCH 6/6] [net-next]net: i40e: Enable cloud filters in i40e via tc/flower classifier

2017-08-01 Thread Nambiar, Amritha

On 8/1/2017 3:56 AM, Jamal Hadi Salim wrote:
> On 17-07-31 08:38 PM, Amritha Nambiar wrote:
>> This patch enables tc-flower based hardware offloads. tc/flower
>> filter provided by the kernel is configured as driver specific
>> cloud filter. The patch implements functions and admin queue
>> commands needed to support cloud filters in the driver and
>> adds cloud filters to configure these tc-flower filters.
>>
>> The only action supported is to redirect packets to a traffic class
>> on the same device.
>>
>> # tc qdisc add dev eth0 ingress
>> # ethtool -K eth0 hw-tc-offload on
>>
>> # tc filter add dev eth0 protocol ip parent :\
>>prio 1 flower dst_mac 3c:fd:fe:a0:d6:70 skip_sw indev eth0\
>>action mirred ingress redirect dev eth0 tc 0
>>
> 
> Out of curiosity - did you need to say "indev eth0" there?

It looks like I don't need to specify "indev eth0". I will need to look
up how this part is offloaded and probably validate in the driver when
this is specified.

> Also: Is it possible to add an skbmark? Example something like
> these that directs two flows to the same queue but different
> skb marks:
> 
> # tc filter add dev eth0 protocol ip parent : \
>prio 2 flower dst_ip 192.168.3.5/32 \
>ip_proto udp dst_port 2a skip_sw \
>action skbedit mark 11 \
>action mirred ingress redirect dev eth0 tcqueue 1
> 
> # tc filter add dev eth0 protocol ip parent : \
>  prio 1 flower dst_mac 3c:fd:fe:a0:d6:70 skip_sw \
>  action skbedit mark 12 \
>  action mirred ingress redirect dev eth0 tcqueue 1
> 

It is possible to support the skbedit mark action for the first rule
here (L3 and L4) which I can take up in a subsequent patch, but this
cannot be supported on our device for L2 based match in the second rule.

> cheers,
> jamal
> 


Re: [PATCH 1/6] [net-next]net: sched: act_mirred: Extend redirect action to accept a traffic class

2017-08-01 Thread Nambiar, Amritha

On 8/1/2017 3:44 AM, Jamal Hadi Salim wrote:
> On 17-07-31 08:37 PM, Amritha Nambiar wrote:
>> The Mirred/redirect action is extended to forward to a traffic
>> class on the device. The traffic class index needs to be
>> provided in addition to the device's ifindex.
>>
>> Example:
>> # tc filter add dev eth0 protocol ip parent : prio 1 flower\
>>dst_ip 192.168.1.1/32 ip_proto udp dst_port 22\
>>skip_sw indev eth0 action mirred ingress redirect dev eth0 tc 1
>>
>> Signed-off-by: Amritha Nambiar 
>> ---
>>   include/net/tc_act/tc_mirred.h|7 +++
>>   include/uapi/linux/tc_act/tc_mirred.h |5 +
>>   net/sched/act_mirred.c|   17 +
>>   3 files changed, 29 insertions(+)
>>
>> diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
>> index 604bc31..60058c4 100644
>> --- a/include/net/tc_act/tc_mirred.h
>> +++ b/include/net/tc_act/tc_mirred.h
>> @@ -9,6 +9,8 @@ struct tcf_mirred {
>>  int tcfm_eaction;
>>  int tcfm_ifindex;
>>  booltcfm_mac_header_xmit;
>> +u8  tcfm_tc;
>> +u32 flags;
>>  struct net_device __rcu *tcfm_dev;
>>  struct list_headtcfm_list;
>>   };
>> @@ -37,4 +39,9 @@ static inline int tcf_mirred_ifindex(const struct 
>> tc_action *a)
>>  return to_mirred(a)->tcfm_ifindex;
>>   }
>>   
>> +static inline int tcf_mirred_tc(const struct tc_action *a)
>> +{
>> +return to_mirred(a)->tcfm_tc;
>> +}
>> +
>>   #endif /* __NET_TC_MIR_H */
>> diff --git a/include/uapi/linux/tc_act/tc_mirred.h 
>> b/include/uapi/linux/tc_act/tc_mirred.h
>> index 3d7a2b3..8ff4d76 100644
>> --- a/include/uapi/linux/tc_act/tc_mirred.h
>> +++ b/include/uapi/linux/tc_act/tc_mirred.h
>> @@ -9,6 +9,10 @@
>>   #define TCA_EGRESS_MIRROR 2 /* mirror packet to EGRESS */
>>   #define TCA_INGRESS_REDIR 3  /* packet redirect to INGRESS*/
>>   #define TCA_INGRESS_MIRROR 4 /* mirror packet to INGRESS */
>> +
>> +#define MIRRED_F_TC_MAP 0x1
>> +#define MIRRED_TC_MAP_MAX   0x10
>> +#define MIRRED_TC_MAP_MASK  0xF
>>  
>>  
>>   struct tc_mirred {
>>  tc_gen;
>> @@ -21,6 +25,7 @@ enum {
>>  TCA_MIRRED_TM,
>>  TCA_MIRRED_PARMS,
>>  TCA_MIRRED_PAD,
>> +TCA_MIRRED_TC_MAP,
>>  __TCA_MIRRED_MAX
>>   };
>>   #define TCA_MIRRED_MAX (__TCA_MIRRED_MAX - 1)
>> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
>> index 1b5549a..f9801de 100644
>> --- a/net/sched/act_mirred.c
>> +++ b/net/sched/act_mirred.c
>> @@ -67,6 +67,7 @@ static void tcf_mirred_release(struct tc_action *a, int 
>> bind)
>>   
>>   static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
>>  [TCA_MIRRED_PARMS]  = { .len = sizeof(struct tc_mirred) },
>> +[TCA_MIRRED_TC_MAP] = { .type = NLA_U8 },
>>   };
>>   
>>   static unsigned int mirred_net_id;
>> @@ -83,6 +84,8 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
>> *nla,
>>  struct tcf_mirred *m;
>>  struct net_device *dev;
>>  bool exists = false;
>> +u8 *tc_map = NULL;
>> +u32 flags = 0;
>>  int ret;
>>   
>>  if (nla == NULL)
>> @@ -92,6 +95,14 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
>> *nla,
>>  return ret;
>>  if (tb[TCA_MIRRED_PARMS] == NULL)
>>  return -EINVAL;
>> +
>> +if (tb[TCA_MIRRED_TC_MAP]) {
>> +tc_map = nla_data(tb[TCA_MIRRED_TC_MAP]);
>> +if (*tc_map >= MIRRED_TC_MAP_MAX)
>> +return -EINVAL;
>> +flags |= MIRRED_F_TC_MAP;
> 
> 
>> +}
>> +
>>  parm = nla_data(tb[TCA_MIRRED_PARMS]);
>>   
>>  exists = tcf_hash_check(tn, parm->index, a, bind);
>> @@ -139,6 +150,7 @@ static int tcf_mirred_init(struct net *net, struct 
>> nlattr *nla,
>>  ASSERT_RTNL();
>>  m->tcf_action = parm->action;
>>  m->tcfm_eaction = parm->eaction;
>> +m->flags = flags;
>>  if (dev != NULL) {
>>  m->tcfm_ifindex = parm->ifindex;
>>  if (ret != ACT_P_CREATED)
>> @@ -146,6 +158,8 @@ static int tcf_mirred_init(struct net *net, struct 
>> nlattr *nla,
>>  dev_hold(dev);
>>  rcu_assign_pointer(m->tcfm_dev, dev);
>>  m->tcfm_mac_header_xmit = mac_header_xmit;
>> +if (flags & MIRRED_F_TC_MAP)
>> +m->tcfm_tc = *tc_map & MIRRED_TC_MAP_MASK;
>>  }
>>   
> Is the mask a hardware limit? I don't know how these queues are
> allocated - I am assuming each of these "tc queues" maps to a rx
> DMA ring?

This is the bitmask for TC max range again defined in linux/netdevice.h.
I'll fix this to remove the new definition I have (MIRRED_TC_MAP_MASK)
here and replace with the existing TC_BITMASK. These are just the
traffic class indices that are offloaded to the device. I had submitted
another 

Re: [PATCH 1/6] [net-next]net: sched: act_mirred: Extend redirect action to accept a traffic class

2017-08-01 Thread Nambiar, Amritha

On 8/1/2017 3:22 AM, Jamal Hadi Salim wrote:
> On 17-07-31 08:37 PM, Amritha Nambiar wrote:
>> The Mirred/redirect action is extended to forward to a traffic
>> class on the device. The traffic class index needs to be
>> provided in addition to the device's ifindex.
>>
>> Example:
>> # tc filter add dev eth0 protocol ip parent : prio 1 flower\
>>dst_ip 192.168.1.1/32 ip_proto udp dst_port 22\
>>skip_sw indev eth0 action mirred ingress redirect dev eth0 tc 1
>>
>> Signed-off-by: Amritha Nambiar 
>> ---
>>   include/net/tc_act/tc_mirred.h|7 +++
>>   include/uapi/linux/tc_act/tc_mirred.h |5 +
>>   net/sched/act_mirred.c|   17 +
>>   3 files changed, 29 insertions(+)
>>
>> diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
>> index 604bc31..60058c4 100644
>> --- a/include/net/tc_act/tc_mirred.h
>> +++ b/include/net/tc_act/tc_mirred.h
>> @@ -9,6 +9,8 @@ struct tcf_mirred {
>>  int tcfm_eaction;
>>  int tcfm_ifindex;
>>  booltcfm_mac_header_xmit;
>> +u8  tcfm_tc;
>> +u32 flags;
>>  struct net_device __rcu *tcfm_dev;
>>  struct list_headtcfm_list;
>>   };
>> @@ -37,4 +39,9 @@ static inline int tcf_mirred_ifindex(const struct 
>> tc_action *a)
>>  return to_mirred(a)->tcfm_ifindex;
>>   }
>>   
>> +static inline int tcf_mirred_tc(const struct tc_action *a)
>> +{
>> +return to_mirred(a)->tcfm_tc;
>> +}
>> +
>>   #endif /* __NET_TC_MIR_H */
>> diff --git a/include/uapi/linux/tc_act/tc_mirred.h 
>> b/include/uapi/linux/tc_act/tc_mirred.h
>> index 3d7a2b3..8ff4d76 100644
>> --- a/include/uapi/linux/tc_act/tc_mirred.h
>> +++ b/include/uapi/linux/tc_act/tc_mirred.h
>> @@ -9,6 +9,10 @@
>>   #define TCA_EGRESS_MIRROR 2 /* mirror packet to EGRESS */
>>   #define TCA_INGRESS_REDIR 3  /* packet redirect to INGRESS*/
>>   #define TCA_INGRESS_MIRROR 4 /* mirror packet to INGRESS */
>> +
>> +#define MIRRED_F_TC_MAP 0x1
>> +#define MIRRED_TC_MAP_MAX   0x10
> 
> Assuming this is the max number of queues?
> Where does this upper bound come from? Is it a spec
> or an intel thing? If spec - mentioning which
> spec and section would be useful.

This is the max number of TCs. The Linux upper bound for this is defined
in linux/netdevice.h. I will fix this part to remove the definition here
and reuse the existing one.
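
A rough user-space sketch of the bounds check being discussed, using the existing limits instead of new MIRRED_* defines. The two constants below copy the values of TC_MAX_QUEUE and TC_BITMASK from linux/netdevice.h as I understand them (16 and 15); they are redefined under DEMO_ names only so the snippet builds outside the kernel, and `demo_parse_tc` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to match TC_MAX_QUEUE and TC_BITMASK in linux/netdevice.h. */
#define DEMO_TC_MAX_QUEUE	16
#define DEMO_TC_BITMASK		15

/*
 * Validate a traffic class index taken from a netlink attribute.
 * On success store it in *tc_out and return 0; otherwise return -EINVAL
 * and leave *tc_out untouched.
 */
static int demo_parse_tc(uint8_t requested, uint8_t *tc_out)
{
	assert(tc_out);

	if (requested >= DEMO_TC_MAX_QUEUE)
		return -EINVAL;

	*tc_out = requested & DEMO_TC_BITMASK;
	return 0;
}
```

Once the range check has passed, the mask is redundant, which is part of why a single shared definition is enough.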

> 
> cheers,
> jamal
> 


Re: [PATCH net-next RFC 0/6] Configure cloud filters in i40e via tc/flower classifier

2017-08-01 Thread Nambiar, Amritha

On 8/1/2017 3:15 AM, Jamal Hadi Salim wrote:
> On 17-07-31 08:36 PM, Amritha Nambiar wrote:
>> This patch series enables configuring cloud filters in i40e
>> using the tc/flower classifier. The only tc-filter action
>> supported is to redirect packets to a traffic class on the
>> same device. The tc/mirred:redirect action is extended to
>> accept a traffic class to achieve this.
>>
>> The cloud filters are added for a VSI and are cleaned up when
>> the VSI is deleted. The filters that match on L4 ports need
>> enhanced admin queue functions with big buffer support for
>> extended general fields in Add/Remove Cloud filters command.
>>
>> Example:
>> # tc qdisc add dev eth0 ingress
>>
>> # ethtool -K eth0 hw-tc-offload on
>>
>> # tc filter add dev eth0 protocol ip parent : prio 1 flower\
>>dst_ip 192.168.1.1/32 ip_proto udp dst_port 22\
>>skip_sw indev eth0 action mirred ingress redirect dev eth0 tc 1
>>
> 
> I think "queue 1" sounds better than "tc 1".
> "tc" is  already a keyword in a few places (even within that declaration
> above).

The idea is to redirect to a traffic class that has queues assigned to
it and not a single queue i.e. these are actually queue groups and not a
single queue. So maybe "qgroup 1" or "tcqgroup 1" fits better.

> 
> cheers,
> jamal
> 


Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Timur Tabi

On 8/1/17 6:15 PM, Andrew Lunn wrote:

> Pause frames are something you can auto-negotiate at the PHY
> level. Should you also be clearing some bits in the phydev, so the
> peer knows pause frames are not supported?


When pause frame autonegotiation is enabled in the driver, that only 
means that the driver looks at what the PHY has autonegotiated, and then 
configures the MAC to match that.


The driver doesn't touch the PHY at all.  It leaves all that to phylib.

Now if autonegotiation is disabled in the driver, then it just 
hard-codes those TX/RX settings in the driver.  Are you saying I should 
program the PHY at that point to disable autonegotiation on the PHY
level?  If so, then I don't know how to do that.  I just assumed that 
the MAC never tells the PHY what to do.
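
The behaviour described above can be sketched as a small decision function. This is a simplified model, not the emac driver code: the field names follow the quoted patch (automatic, tx_flow_control, rx_flow_control), while `demo_adapter`, `demo_pause` and `demo_resolve_pause` are hypothetical names, and the negotiated result is simply passed in rather than read from phylib.

```c
#include <assert.h>
#include <stdbool.h>

/* Driver-side flow control configuration, as in the quoted patch. */
struct demo_adapter {
	bool automatic;		/* follow what the PHY negotiated */
	bool tx_flow_control;	/* used only when automatic is false */
	bool rx_flow_control;	/* used only when automatic is false */
};

/* Pause settings to program into the MAC. */
struct demo_pause {
	bool tx;
	bool rx;
};

/*
 * Pick the MAC pause settings: mirror the negotiated result when
 * automatic, otherwise use the values hard-coded in the adapter.
 */
static struct demo_pause demo_resolve_pause(const struct demo_adapter *adpt,
					    struct demo_pause negotiated)
{
	struct demo_pause mac;

	assert(adpt);
	if (adpt->automatic) {
		mac = negotiated;
	} else {
		mac.tx = adpt->tx_flow_control;
		mac.rx = adpt->rx_flow_control;
	}
	return mac;
}
```

Note that nothing in this model changes what the PHY advertises to the link partner, which is the gap Andrew's question points at.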


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH v3 0/2] ravb: add wake-on-lan support via magic packet

2017-08-01 Thread David Miller
From: Niklas Söderlund 
Date: Tue,  1 Aug 2017 12:14:35 +0200

> WoL is enabled in the suspend callback by setting MagicPacket detection
> and disabling all interrupts except MagicPacket. In the resume path the
> driver needs to reset the hardware to rearm the WoL logic, this prevents
> the driver from simply restoring the registers and to take advantage of
> that ravb was not suspended to reduce resume time. To reset the
> hardware the driver closes the device, sets it in reset mode and reopens
> the device just like it would do in a normal suspend/resume scenario
> without WoL enabled, but it both closes and opens the device in the
> resume callback since the device needs to be reset for WoL to work.
> 
> One quirk needed for WoL is that the module clock needs to be prevented
> from being switched off by Runtime PM. To keep the clock alive the
> suspend callback needs to call clk_enable() directly to increase the
> usage count of the clock. Then when Runtime PM decreases the clock usage
> count it won't reach 0 and be switched off.
> 
> Changes since v2
> - Only do the clock dance to workaround PSCI sleep when resuming if WoL 
>   is enabled. This was a bug in v2 which resulted in a WARN if resuming 
>   from PSCI sleep with WoL disabled, thanks Sergei for pointing this 
>   out!
> - Break out clock dance workaround in separate patch to make it easier 
>   to revert once a fix is upstream for the clock driver as suggested by 
>   Sergei.
> 
> Changes since v1
> - Fix issue where device would fail to resume from PSCI suspend if WoL
>   was enabled, reported by Geert. The fault was that the clock driver
>   thinks the clock is on, but PSCI have disabled it, added workaround
>   for this in ravb driver which can be removed once the clock driver is
>   aware of the PSCI behavior.
> - Only try to restore from wol wake up if netif is running, since this
>   is a condition to enable wol in the first place this was a bug in v1.

Series applied, thanks.
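
The clock quirk in the cover letter comes down to reference counting. Here is a toy model of it, assuming a simple enable count; `demo_clk`, `demo_clk_enable` and `demo_clk_disable` are hypothetical stand-ins and not the kernel clk API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy clock: runs while at least one user holds an enable reference. */
struct demo_clk {
	int enable_count;
	bool running;
};

static void demo_clk_enable(struct demo_clk *c)
{
	assert(c);
	if (c->enable_count++ == 0)
		c->running = true;
}

static void demo_clk_disable(struct demo_clk *c)
{
	assert(c && c->enable_count > 0);
	if (--c->enable_count == 0)
		c->running = false;
}
```

If the suspend callback takes one extra reference, the later drop by Runtime PM leaves the count at 1, so the clock keeps running and the MagicPacket logic stays powered.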


Re: pull request: bluetooth-next 2017-08-01

2017-08-01 Thread David Miller
From: Johan Hedberg 
Date: Tue, 1 Aug 2017 12:41:56 +0300

> Here's our first batch of Bluetooth patches for the 4.14 kernel:
> 
>  - Several new USB IDs for the btusb driver
>  - Memory leak fix in btusb driver
>  - Cleanups & fixes to hci_nokia, hci_serdev and hci_bcm drivers
>  - Fixed cleanup path in mrf24j40 (802.15.4) driver probe function
>  - A few other smaller cleanups & fixes to drivers
> 
> Please let me know if there are any issues pulling. Thanks.

Pulled, thank you.


Re: [PATCH net-next 00/10] net: l3mdev: Support for sockets bound to enslaved device

2017-08-01 Thread David Miller
From: David Ahern 
Date: Mon, 31 Jul 2017 20:13:16 -0700

> Existing code for socket lookups already pass in 6+ arguments. Rather
> than add another for the enslaved device index, the existing lookups
> are converted to use a new sk_lookup struct. From there, the enslaved
> device index becomes another element of the struct.

Sorry, not gonna happen :-)

I know it's difficult, but maybe we should think about why we're
passing so much crap into each lookup.

And perhaps, why it can't (for example) be constituted in the lookup
function itself given sufficient (relevant) context.

I think passing a big struct into the lookups, by reference, is a big
step backwards.

For one thing, if you pass it by pointer then the compiler can't
potentially pass parts in registers even if it could.  However
if you pass it by value, that's actually a possibility.
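
As a concrete illustration of the by-pointer versus by-value point, consider the two hypothetical lookup helpers below. Nothing here is from the patch set: `demo_lookup`, `demo_mix` and both hash functions are made up for the example. Whether the by-value version really travels in registers depends on the ABI and the struct size; a 12-byte struct like this one can be passed in registers on x86-64 System V, for instance, whereas a large lookup struct would not be.

```c
#include <assert.h>
#include <stdint.h>

/* Small lookup key: 12 bytes on common ABIs. */
struct demo_lookup {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Arbitrary mixing function, used only to give both variants a body. */
static uint32_t demo_mix(uint32_t saddr, uint32_t daddr,
			 uint16_t sport, uint16_t dport)
{
	return (saddr ^ daddr) * 2654435761u +
	       ((((uint32_t)sport) << 16) | dport);
}

/* By pointer: the caller must materialize the struct in memory. */
static uint32_t demo_hash_by_ptr(const struct demo_lookup *k)
{
	assert(k);
	return demo_mix(k->saddr, k->daddr, k->sport, k->dport);
}

/* By value: the compiler is free to pass the fields in registers. */
static uint32_t demo_hash_by_value(struct demo_lookup k)
{
	return demo_mix(k.saddr, k.daddr, k.sport, k.dport);
}
```

Both variants compute the same result; the difference is only in what the calling convention allows the compiler to do.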

But I'd like to avoid this on-stack blob altogether if possible.

Thanks.


Re: [PATCH] drivers/net/wan/z85230.c: Use designated initializers

2017-08-01 Thread Kees Cook
On Tue, Aug 1, 2017 at 3:29 PM, David Miller  wrote:
> From: Kees Cook 
> Date: Sun, 30 Jul 2017 18:31:17 -0700
>
>> In preparation for the randstruct gcc plugin performing randomization of
>> structures that are entirely function pointers, use designated initializers
>> so the compiler doesn't get angry.
>>
>> Reported-by: kbuild test robot 
>> Signed-off-by: Kees Cook 
>> ---
>> This is a prerequisite for the future randstruct fptr randomization. I'd
>> prefer to carry this in my gcc-plugin tree for v4.14 with an Ack from
>> someone on net-dev, or if possible, have it applied to v4.13 via net-dev.
>
> Please queue this up into the gcc-plugin tree then:
>
> Acked-by: David S. Miller 
>
> It isn't a bug fix so I would only have put it into 'net-next' rather
> than 'net'.

Okay, sounds good. I'll carry it. Thanks!

-Kees

-- 
Kees Cook
Pixel Security
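
For context, here is a short sketch of why designated initializers matter once the layout of a function-pointer struct may be reordered. The names (`demo_ops` and friends) are hypothetical and unrelated to z85230.c.

```c
#include <assert.h>

/* A structure made up entirely of function pointers. */
struct demo_ops {
	int (*open)(int unit);
	int (*close)(int unit);
	int (*status)(int unit);
};

static int demo_open(int unit)   { return unit + 100; }
static int demo_close(int unit)  { return unit + 200; }
static int demo_status(int unit) { return unit + 300; }

/* Positional: only correct while the field order stays as declared. */
static const struct demo_ops positional_ops = {
	demo_open, demo_close, demo_status
};

/* Designated: each pointer is bound by name, whatever the field order. */
static const struct demo_ops designated_ops = {
	.status	= demo_status,
	.open	= demo_open,
	.close	= demo_close,
};

/* Call through the table, checking that the hook is populated. */
static int demo_call_open(const struct demo_ops *ops, int unit)
{
	assert(ops && ops->open);
	return ops->open(unit);
}
```

If the field order were shuffled at build time, the positional form would silently bind the wrong functions, while the designated form would still be correct.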


[PATCH 1/2] atm: adummy: constify attribute_group structure

2017-08-01 Thread Amitoj Kaur Chawla
Functions working with attribute_groups provided by 
work with const attribute_group. These attribute_group structures do not
change at runtime so mark them as const.

File size before:
   text    data     bss     dec     hex filename
   2033    1448       0    3481     d99 drivers/atm/adummy.o

File size after:
   text    data     bss     dec     hex filename
   2129    1352       0    3481     d99 drivers/atm/adummy.o

This change was made with the help of Coccinelle.

Signed-off-by: Amitoj Kaur Chawla 
---
 drivers/atm/adummy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/atm/adummy.c b/drivers/atm/adummy.c
index 1fd25e8..da27ddf 100644
--- a/drivers/atm/adummy.c
+++ b/drivers/atm/adummy.c
@@ -71,7 +71,7 @@ static struct attribute *adummy_attrs[] = {
NULL
 };
 
-static struct attribute_group adummy_group_attrs = {
+static const struct attribute_group adummy_group_attrs = {
.name = NULL, /* We want them in dev's root folder */
.attrs = adummy_attrs
 };
-- 
2.7.4



[PATCH 2/2] atm: solos-pci: constify attribute_group structures

2017-08-01 Thread Amitoj Kaur Chawla
Functions working with attribute_groups provided by 
work with const attribute_group. These attribute_group structures do not
change at runtime so mark them as const.

File size before:
   text    data     bss     dec     hex filename
  35740   28424     832   64996    fde4 drivers/atm/solos-pci.o

File size after:
   text    data     bss     dec     hex filename
  35932   28232     832   64996    fde4 drivers/atm/solos-pci.o

This change was made with the help of Coccinelle.

Signed-off-by: Amitoj Kaur Chawla 
---
 drivers/atm/solos-pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c
index c8f2ca6..3f9c37d 100644
--- a/drivers/atm/solos-pci.c
+++ b/drivers/atm/solos-pci.c
@@ -611,7 +611,7 @@ static struct attribute *solos_attrs[] = {
NULL
 };
 
-static struct attribute_group solos_attr_group = {
+static const struct attribute_group solos_attr_group = {
.attrs = solos_attrs,
.name = "parameters",
 };
@@ -628,7 +628,7 @@ static struct attribute *gpio_attrs[] = {
NULL
 };
 
-static struct attribute_group gpio_attr_group = {
+static const struct attribute_group gpio_attr_group = {
.attrs = gpio_attrs,
.name = "gpio",
 };
-- 
2.7.4



Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:37:39PM -0500, Timur Tabi wrote:
> The EMAC has a curious quirk when RX flow control is enabled and the
> kernel hangs.  With the kernel hung, the EMAC's RX queue soon fills.
> If RX flow control is enabled, the EMAC will then send a non-stop
> stream of pause frames until the system is reset.  The EMAC does not
> have a built-in watchdog.
> 
> In various tests, the pause frame stream sometimes overloads nearby
> switches, effectively disabling the network.  Since the RX queue is
> large and the host processor is more than capable of handling incoming
> packets quickly, the only time the EMAC will send any pause frames is
> when the kernel is hung and unrecoverable.
> 
> To avoid all these problems, we disable flow control autonegotiation
> by default, and only enable receiving pause frames.
> 
> Cc: sta...@vger.kernel.org # 4.11.x
> Signed-off-by: Timur Tabi 
> ---
>  drivers/net/ethernet/qualcomm/emac/emac.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c 
> b/drivers/net/ethernet/qualcomm/emac/emac.c
> index 60850bfa3d32..475c0ea29235 100644
> --- a/drivers/net/ethernet/qualcomm/emac/emac.c
> +++ b/drivers/net/ethernet/qualcomm/emac/emac.c
> @@ -441,8 +441,13 @@ static void emac_init_adapter(struct emac_adapter *adpt)
>   /* others */
>   adpt->preamble = EMAC_PREAMBLE_DEF;
>  
> - /* default to automatic flow control */
> - adpt->automatic = true;
> + /* Disable transmission of pause frames by default, to avoid the
> +  * risk of a pause frame flood that can occur if the kernel hangs.
> +  * We still want to be able to respond to them, however.
> +  */
> + adpt->automatic = false;
> + adpt->tx_flow_control = false;
> + adpt->rx_flow_control = true;

Hi Timur

Pause frames are something you can auto-negotiate at the PHY
level. Should you also be clearing some bits in the phydev, so the
peer knows pause frames are not supported?

 Andrew
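
Editorial note: the point about the PHY advertisement can be illustrated with a small userspace sketch. It is not emac driver code. The DEMO_ bit positions and the function name are assumptions made for this example; the mapping from rx/tx pause settings to the two advertisement bits follows the usual symmetric/asymmetric pause convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; a real driver would use the
 * kernel's own advertisement definitions.
 */
#define DEMO_ADV_PAUSE      (UINT32_C(1) << 13)
#define DEMO_ADV_ASYM_PAUSE (UINT32_C(1) << 14)

/* Recompute the pause bits of an advertisement mask.
 * rx_pause: we honour pause frames sent by the peer.
 * tx_pause: we may send pause frames ourselves.
 * All other bits in adv are left untouched.
 */
uint32_t demo_pause_advertisement(uint32_t adv, bool rx_pause, bool tx_pause)
{
	adv &= ~(DEMO_ADV_PAUSE | DEMO_ADV_ASYM_PAUSE);

	if (rx_pause)
		adv |= DEMO_ADV_PAUSE | DEMO_ADV_ASYM_PAUSE;
	if (tx_pause)
		adv ^= DEMO_ADV_ASYM_PAUSE;

	return adv;
}
```

With rx_pause set and tx_pause cleared, which matches the defaults chosen in the patch, both bits end up advertised, so the peer can tell that pause frames are accepted but will not be generated.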


Re: [PATCH net-next v2 0/3] netvsc: transparent SR-IOV VF support

2017-08-01 Thread David Miller
From: Stephen Hemminger 
Date: Mon, 31 Jul 2017 16:45:21 -0700

> This patch set changes how SR-IOV Virtual Function devices are managed
> in the Hyper-V network driver. It was part of earlier bundle, but
> is now updated.

I think you need to do a rebase.  I just merged net into net-next and
this created some conflicts.


Re: [PATCH net 2/2] gue: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP

2017-08-01 Thread David Miller
From: Koichiro Den 
Date: Tue,  1 Aug 2017 01:05:39 +0900

> In the case that GRO is turned on and the original received packet is
> CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last
> csum-unnecessary point, which for instance could occur if the packet
> comes from another Linux guest on the same Linux host, we have to do
> either remcsum_adjust or set up CHECKSUM_PARTIAL again with its
> csum_start properly reset considering RCO.
> 
> However, since b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags
> in GRO") that barrier in such case could be skipped if GRO turned on,
> hence we pass over it and the inner L4 validation mistakenly reckons
> it as a bad csum.
> 
> This patch makes remcsum_offload being reset at the same time of GRO
> remcsum cleanup, so as to make it work in such case as before.
> 
> Fixes: b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags in GRO")
> Signed-off-by: Koichiro Den 

Applied.


Re: [PATCH net 1/2] vxlan: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP

2017-08-01 Thread David Miller
From: Koichiro Den 
Date: Tue,  1 Aug 2017 01:05:20 +0900

> In the case that GRO is turned on and the original received packet is
> CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last
> csum-unnecessary point, which for instance could occur if the packet
> comes from another Linux guest on the same Linux host, we have to do
> either remcsum_adjust or set up CHECKSUM_PARTIAL again with its
> csum_start properly reset considering RCO.
> 
> However, since b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags
> in GRO") that barrier in such case could be skipped if GRO turned on,
> hence we pass over it and the inner L4 validation mistakenly reckons
> it as a bad csum.
> 
> This patch makes remcsum_offload being reset at the same time of GRO
> remcsum cleanup, so as to make it work in such case as before.
> 
> Fixes: b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags in GRO")
> Signed-off-by: Koichiro Den 

Applied.


Re: [PATCH-net-next] net: add skb_frag_foreach_page and use with kmap_atomic

2017-08-01 Thread David Miller
From: Willem de Bruijn 
Date: Mon, 31 Jul 2017 08:15:47 -0400

> From: Willem de Bruijn 
> 
> Skb frags may contain compound pages. Various operations map frags
> temporarily using kmap_atomic, but this function works on single
> pages, not whole compound pages. The distinction is only relevant
> for high mem pages that require temporary mappings.
> 
> Introduce a looping mechanism that for compound highmem pages maps
> one page at a time, does not change behavior on other pages.
> Use the loop in the kmap_atomic callers in net/core/skbuff.c.
> 
> Verified by triggering skb_copy_bits with
> 
> tcpdump -n -c 100 -i ${DEV} -w /dev/null &
> netperf -t TCP_STREAM -H ${HOST}
> 
>   and by triggering __skb_checksum with
> 
> ethtool -K ${DEV} tx off
> 
>   repeated the tests with looping on a non-highmem platform
>   (x86_64) by making skb_frag_must_loop always return true.
> 
> Signed-off-by: Willem de Bruijn 

Ok, this looks good.

Thanks for following up on this.

Applied.
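
Editorial note: the "one page at a time" idea from the commit message can be sketched in plain userspace C. This is not the skb_frag_foreach_page macro itself; DEMO_PAGE_SIZE, struct demo_chunk and demo_split_by_page are hypothetical names, and a 4096-byte page is assumed.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for the example */

struct demo_chunk {
	size_t page_index;	/* which page inside the compound page */
	size_t page_offset;	/* start offset inside that page */
	size_t len;		/* bytes to process within that page */
};

/* Split the byte range [offset, offset + len) of a compound page into
 * chunks that never cross a page boundary, so that each chunk could be
 * mapped and processed on its own. Returns the number of chunks written,
 * at most max.
 */
size_t demo_split_by_page(size_t offset, size_t len,
			  struct demo_chunk *out, size_t max)
{
	size_t n = 0;

	while (len > 0 && n < max) {
		size_t in_page = offset % DEMO_PAGE_SIZE;
		size_t room = DEMO_PAGE_SIZE - in_page;
		size_t take = len < room ? len : room;

		out[n].page_index = offset / DEMO_PAGE_SIZE;
		out[n].page_offset = in_page;
		out[n].len = take;
		n++;

		offset += take;
		len -= take;
	}
	return n;
}
```

A range that fits inside a single page yields exactly one chunk, which corresponds to the "does not change behavior on other pages" case.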


Re: [PATCH] iwlwifi: Demote messages about fw flags size to info

2017-08-01 Thread João Paulo Rechi Vita
Hello Luca,

On Mon, Jul 24, 2017 at 4:01 AM, Coelho, Luciano
 wrote:
> On Fri, 2017-07-21 at 07:51 -0700, João Paulo Rechi Vita wrote:

(...)

>> Currently these messages are presented to the user during boot if there
>> is no bootsplash covering the console, sometimes even if the boot splash
>> is enabled but has not started yet by the time this message is shown.
>>

I should have provided another piece of information here: all of this
happens even when booting with the 'quiet' kernel parameter.

> This specific case is harmless, but I'd rather keep this message as an
> error, because in other situations it could lead to unexpected
> behavior, so I prefer to keep it very visible.
>
>

I see your point, and I understand the purpose of these messages. I'm
wondering if perhaps having them at the warning level would give them
enough visibility, while still keeping a clean boot process to the end
user. If so, I can send an updated patch.

Thanks for your reply and for pointing to the fix for the root cause
of that message.

Cheers,

..
João Paulo Rechi Vita  |  +1.415.851.5778  |  Endless


Re: [PATCH net-next v3 0/4] net-next: mediatek: add support for ethernet on MT7622 SoC

2017-08-01 Thread David Miller
From: 
Date: Mon, 31 Jul 2017 18:05:07 +0800

> From: Sean Wang 
> 
> Changes since v2:
> - update John's mail
> 
> Changes since v1:
> - add refinement for ethernet clock management
> - take out the code block for ESW, add it until ESW driver is actually 
> introduced
> 
> The series adds the driver for the ethernet controller found on the MT7622 SoC.
> Compared with the previous MT7623 SoC there are additions, such as a shared
> SGMII for the dual GMACs and a built-in 5-port 10/100 embedded switch (ESW).
> Thus more clock consumers and the SGMII hardware setup for the extra features
> are introduced here. Support for the ESW is planned for a separate patch that
> integrates with the DSA infrastructure in the future.
> 
> Testing has been done successfully with these patches for conditions such as
> GMAC2 with IP1001 PHY via RGMII and GMAC1/2 with RTL8211F PHY via SGMII.

Series applied, thank you.


Re: [PATCH net-next] net: dsa: Add support for 64-bit statistics

2017-08-01 Thread Florian Fainelli
On 08/01/2017 03:00 PM, Florian Fainelli wrote:
> DSA slave network devices maintain a pair of bytes and packets counters
> for each direction, but these are not 64-bit capable. Re-use
> pcpu_sw_netstats which contains exactly what we need for that purpose
> and update the code path to report 64-bit capable statistics.

Humm, I do see a long time for ifconfig/ethtool -S to complete after
having transferred over 250GiB of data, the locking looks correct to me
in that statistics for the RX path are updated from softIRQ context
(napi_gro_receive/netif_receive_skb) but maybe I did misunderstand
whether the softirq/IRQ context applies to the producer and not the
reader...

> 
> Signed-off-by: Florian Fainelli 
> ---
>  net/dsa/dsa.c  |  8 ++--
>  net/dsa/dsa_priv.h |  2 ++
>  net/dsa/slave.c| 38 +++---
>  3 files changed, 39 insertions(+), 9 deletions(-)
> 
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index a55e2e4087a4..0ba842c08dd3 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -190,6 +190,7 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct 
> net_device *dev,
>  {
>   struct dsa_switch_tree *dst = dev->dsa_ptr;
>   struct sk_buff *nskb = NULL;
> + struct dsa_slave_priv *p;
>  
>   if (unlikely(dst == NULL)) {
>   kfree_skb(skb);
> @@ -207,12 +208,15 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct 
> net_device *dev,
>   }
>  
>   skb = nskb;
> + p = netdev_priv(skb->dev);
>   skb_push(skb, ETH_HLEN);
>   skb->pkt_type = PACKET_HOST;
>   skb->protocol = eth_type_trans(skb, skb->dev);
>  
> - skb->dev->stats.rx_packets++;
> - skb->dev->stats.rx_bytes += skb->len;
> + u64_stats_update_begin(&p->stats64.syncp);
> + p->stats64.rx_packets++;
> + p->stats64.rx_bytes += skb->len;
> + u64_stats_update_end(&p->stats64.syncp);
>  
>   netif_receive_skb(skb);
>  
> diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
> index 55982cc39b24..7aa0656296c2 100644
> --- a/net/dsa/dsa_priv.h
> +++ b/net/dsa/dsa_priv.h
> @@ -77,6 +77,8 @@ struct dsa_slave_priv {
>   struct sk_buff *(*xmit)(struct sk_buff *skb,
>   struct net_device *dev);
>  
> + struct pcpu_sw_netstats stats64;
> +
>   /* DSA port data, such as switch, port index, etc. */
>   struct dsa_port *dp;
>  
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 9507bd38cf04..65f3cef85976 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -354,8 +354,10 @@ static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, 
> struct net_device *dev)
>   struct dsa_slave_priv *p = netdev_priv(dev);
>   struct sk_buff *nskb;
>  
> - dev->stats.tx_packets++;
> - dev->stats.tx_bytes += skb->len;
> + u64_stats_update_begin(&p->stats64.syncp);
> + p->stats64.tx_packets++;
> + p->stats64.tx_bytes += skb->len;
> + u64_stats_update_end(&p->stats64.syncp);
>  
>   /* Transmit function may have to reallocate the original SKB,
>* in which case it must have freed it. Only free it here on error.
> @@ -594,11 +596,15 @@ static void dsa_slave_get_ethtool_stats(struct 
> net_device *dev,
>  {
>   struct dsa_slave_priv *p = netdev_priv(dev);
>   struct dsa_switch *ds = p->dp->ds;
> -
> - data[0] = dev->stats.tx_packets;
> - data[1] = dev->stats.tx_bytes;
> - data[2] = dev->stats.rx_packets;
> - data[3] = dev->stats.rx_bytes;
> + unsigned int start;
> +
> + do {
> + start = u64_stats_fetch_begin_irq(&p->stats64.syncp);
> + data[0] = p->stats64.tx_packets;
> + data[1] = p->stats64.tx_bytes;
> + data[2] = p->stats64.rx_packets;
> + data[3] = p->stats64.rx_bytes;
> + } while (u64_stats_fetch_retry_irq(&p->stats64.syncp, start));
>   if (ds->ops->get_ethtool_stats)
>   ds->ops->get_ethtool_stats(ds, p->dp->index, data + 4);
>  }
> @@ -861,6 +867,22 @@ static int dsa_slave_setup_tc(struct net_device *dev, 
> u32 handle,
>   }
>  }
>  
> +static void dsa_slave_get_stats64(struct net_device *dev,
> +   struct rtnl_link_stats64 *stats)
> +{
> + struct dsa_slave_priv *p = netdev_priv(dev);
> + unsigned int start;
> +
> + netdev_stats_to_stats64(stats, &dev->stats);
> + do {
> + start = u64_stats_fetch_begin_irq(&p->stats64.syncp);
> + stats->tx_packets = p->stats64.tx_packets;
> + stats->tx_bytes = p->stats64.tx_bytes;
> + stats->rx_packets = p->stats64.rx_packets;
> + stats->rx_bytes = p->stats64.rx_bytes;
> + } while (u64_stats_fetch_retry_irq(&p->stats64.syncp, start));
> +}
> +
>  void dsa_cpu_port_ethtool_init(struct ethtool_ops *ops)
>  {
>   ops->get_sset_count = dsa_cpu_port_get_sset_count;
> @@ -936,6 +958,7 @@ static const struct net_device_ops dsa_slave_netdev_ops = 
> {
>   .ndo_bridge_dellink = 


Re: [PATCH] Cipso: cipso_v4_optptr enter infinite loop

2017-08-01 Thread David Miller
From: Yujuan Qi 
Date: Mon, 31 Jul 2017 11:23:01 +0800

> From: "yujuan.qi" 
> 
> In the for() loop, if ((optlen > 0) && (optptr[1] == 0)), the code enters
> an infinite loop.
> 
> Test: receive a packet whose IP length is > 20 and whose first IP option
> byte is 0; this reproduces the issue.
> 
> Signed-off-by: yujuan.qi 

Applied, thanks.
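
Editorial note: the failure mode described in the commit message, an option walk that stops advancing when a length byte is 0, can be shown with a userspace sketch. This is not the applied patch. The DEMO_ constants and demo_find_cipso are names made up for the example; the values mirror the usual IPv4 option numbers for end-of-list, no-op and CIPSO.

```c
#include <stddef.h>

#define DEMO_OPT_END   0	/* end of option list */
#define DEMO_OPT_NOOP  1	/* single-byte padding option */
#define DEMO_OPT_CIPSO 134	/* option type being searched for */

/* Walk a buffer of IPv4 options and return a pointer to the CIPSO option,
 * or NULL if there is none. Every iteration either returns or advances by
 * at least one byte, so the loop cannot spin on a zero length byte.
 */
const unsigned char *demo_find_cipso(const unsigned char *opt, size_t optlen)
{
	while (optlen > 0) {
		size_t taglen;

		switch (opt[0]) {
		case DEMO_OPT_CIPSO:
			return opt;
		case DEMO_OPT_END:
			return NULL;
		case DEMO_OPT_NOOP:
			taglen = 1;
			break;
		default:
			/* The length byte must exist, must cover the type
			 * and length bytes, and must fit in what is left.
			 */
			if (optlen < 2 || opt[1] < 2 || opt[1] > optlen)
				return NULL;
			taglen = opt[1];
			break;
		}
		opt += taglen;
		optlen -= taglen;
	}
	return NULL;
}
```

Without the checks in the default branch, an option whose length byte is 0 would leave both opt and optlen unchanged, which is the infinite loop reported above.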


Re: [PATCH] drivers/net/wan/z85230.c: Use designated initializers

2017-08-01 Thread David Miller
From: Kees Cook 
Date: Sun, 30 Jul 2017 18:31:17 -0700

> In preparation for the randstruct gcc plugin performing randomization of
> structures that are entirely function pointers, use designated initializers
> so the compiler doesn't get angry.
> 
> Reported-by: kbuild test robot 
> Signed-off-by: Kees Cook 
> ---
> This is a prerequisite for the future randstruct fptr randomization. I'd
> prefer to carry this in my gcc-plugin tree for v4.14 with an Ack from
> someone on net-dev, or if possible, have it applied to v4.13 via net-dev.

Please queue this up into the gcc-plugin tree then:

Acked-by: David S. Miller 

It isn't a bug fix so I would only have put it into 'net-next' rather
than 'net'.

Thanks.
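
Editorial note: for readers unfamiliar with the term, here is a minimal sketch of designated initializers on a structure made only of function pointers. struct demo_ops and its members are hypothetical and are not taken from z85230.c.

```c
#include <stddef.h>

/* A structure consisting only of function pointers, the kind of layout
 * the commit message says randstruct may reorder.
 */
struct demo_ops {
	int (*open)(void);
	int (*close)(void);
	int (*reset)(void);
};

static int demo_open(void)  { return 1; }
static int demo_close(void) { return 2; }

/* A positional initializer such as { demo_open, demo_close } depends on
 * the declared member order. Naming each member removes that dependency:
 * the entries below are deliberately listed out of order.
 */
const struct demo_ops demo_table = {
	.close = demo_close,
	.open  = demo_open,
	/* .reset is not named and is therefore NULL */
};
```

Because each member is named, the table stays correct even if the order of the members in the structure definition changes.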


Re: [PATCH net-next 0/3] net: Infrastructure changes for [kz]proxy

2017-08-01 Thread David Miller
From: Tom Herbert 
Date: Fri, 28 Jul 2017 16:22:40 -0700

> This patch set contains some general infrastructure enhancements that
> will be used by kernel proxy and zero proxy.
> 
> The changes are:
>   - proto_ops: Add locked versions of sendmsg and sendpage
>   - skb_send_sock: Allow sending an skb on a socket within the
> kernel
>   - Generalize strparser. Allow it to be used in other contexts than
> just in the read_sock path. This will be used in the transmit
> path of zero proxy.
> 
> Some nice future work (which I've been discussing with John Fastabend)
> will be to make some of the related functions to allow gifting of skbs.
> We should be able to do that with skb_send_sock and strp_process. I'd
> also like this feature in the read_sock callback.
> 
> Tested: Ran modified kernel without incident. Tested new functionality
> using zero proxy (in development).

Series applied, thanks Tom.


Re: [PATCH v4 0/4] net: ethernet: ti: cpts: fix tx timestamping timeout

2017-08-01 Thread David Miller
From: Grygorii Strashko 
Date: Fri, 28 Jul 2017 17:30:01 -0500

> With the low Ethernet connection speed cpdma notification about packet
> processing can be received before CPTS TX timestamp event, which is set
> when packet actually left CPSW while cpdma notification is sent when packet
> pushed in CPSW fifo. As result, when connection is slow and CPU is fast
> enough TX timestamping is not working properly.
> Issue was discovered using timestamping tool on am57x boards with Ethernet
> link speed forced to 100M and on am335x-evm with Ethernet link speed forced
> to 10M.
> 
> Patch3 - This series fixes it by introducing TX SKB queue to store PTP SKBs
> for which Ethernet Transmit Event hasn't been received yet and then re-check
> this queue with new Ethernet Transmit Events by scheduling CPTS overflow
> work more often as long as the TX SKB queue is not empty.
> 
> Patch 1,2 - As CPTS overflow work is a time critical task, it is important to
> ensure that its scheduling is not delayed. Unfortunately, there could be
> significant delay in CPTS work schedule under high system load and on -RT,
> which could cause CPTS misbehavior due to internal counter overflow, and there
> is no way to tune CPTS overflow work execution policy and priority manually.
> The kthread_worker can be used instead of workqueues, as it creates a separate
> named kthread for each worker and its execution policy and priority can be
> configured using the chrt tool. Instead of modifying the CPTS driver itself, it
> was proposed to add a PTP auxiliary worker to the PHC subsystem [1], so other
> drivers can benefit from this feature also.
> 
> [1] https://www.spinics.net/lists/netdev/msg445392.html

Series applied to 'net', thanks.
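
Editorial note: the queue of packets still waiting for their transmit timestamp event can be pictured with a small userspace sketch. It is not the cpts driver code. struct demo_txq, the sequence-id matching and the fixed capacity are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PENDING 8	/* arbitrary capacity for the example */

/* Packets whose transmit timestamp event has not arrived yet,
 * identified here only by a sequence id.
 */
struct demo_txq {
	uint16_t seqid[DEMO_MAX_PENDING];
	size_t count;
};

/* Remember a packet that completed before its timestamp event showed up.
 * Returns false if the queue is full.
 */
bool demo_txq_defer(struct demo_txq *q, uint16_t seqid)
{
	if (q->count == DEMO_MAX_PENDING)
		return false;
	q->seqid[q->count++] = seqid;
	return true;
}

/* Called when a timestamp event arrives: if a deferred packet matches,
 * remove it from the queue and report success.
 */
bool demo_txq_match(struct demo_txq *q, uint16_t seqid)
{
	for (size_t i = 0; i < q->count; i++) {
		if (q->seqid[i] == seqid) {
			/* Fill the hole with the last entry. */
			q->seqid[i] = q->seqid[--q->count];
			return true;
		}
	}
	return false;
}
```

In this picture the periodic work would keep running at a shorter interval while count is non-zero and fall back to its normal period once the queue drains.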


Re: [PATCH v6 net-next] net: systemport: Support 64bit statistics

2017-08-01 Thread Florian Fainelli
On 07/31/2017 06:18 PM, Jianming.qiao wrote:
> When using the Broadcom Systemport device on a 32-bit platform, ifconfig can
> only report tx/rx statistics up to 4G; the counters wrap to 0 when the
> number of incoming or outgoing packets exceeds 4G, which takes only
> around 2 hours in a busy network environment (such as streaming).
> This makes it hard for network diagnostic tools to get reliable
> statistics, so this patch adds 64-bit statistics support for the
> Broadcom Systemport device on 32-bit platforms.
> 
> Signed-off-by: Jianming.qiao 
> ---
>  drivers/net/ethernet/broadcom/bcmsysport.c | 68 
> --
>  drivers/net/ethernet/broadcom/bcmsysport.h |  9 +++-
>  2 files changed, 52 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c 
> b/drivers/net/ethernet/broadcom/bcmsysport.c
> index 5333601..bb3cc7a 100644
> --- a/drivers/net/ethernet/broadcom/bcmsysport.c
> +++ b/drivers/net/ethernet/broadcom/bcmsysport.c
> @@ -662,6 +662,7 @@ static int bcm_sysport_alloc_rx_bufs(struct 
> bcm_sysport_priv *priv)
>  static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
>   unsigned int budget)
>  {
> + struct bcm_sysport_stats *stats64 = &priv->stats64;
>   struct net_device *ndev = priv->netdev;
>   unsigned int processed = 0, to_process;
>   struct bcm_sysport_cb *cb;
> @@ -765,6 +766,10 @@ static unsigned int bcm_sysport_desc_rx(struct 
> bcm_sysport_priv *priv,
>   skb->protocol = eth_type_trans(skb, ndev);
>   ndev->stats.rx_packets++;
>   ndev->stats.rx_bytes += len;
> + u64_stats_update_begin(&stats64->syncp);
> + stats64->rx_packets++;
> + stats64->rx_bytes += len;
> + u64_stats_update_end(&stats64->syncp);
>  
>   napi_gro_receive(&priv->napi, skb);
>  next:
> @@ -787,17 +792,15 @@ static void bcm_sysport_tx_reclaim_one(struct 
> bcm_sysport_tx_ring *ring,
>   struct device *kdev = &priv->pdev->dev;
>  
>   if (cb->skb) {
> - ring->bytes += cb->skb->len;
>   *bytes_compl += cb->skb->len;
>   dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr),
>dma_unmap_len(cb, dma_len),
>DMA_TO_DEVICE);
> - ring->packets++;
>   (*pkts_compl)++;
>   bcm_sysport_free_cb(cb);
>   /* SKB fragment */
>   } else if (dma_unmap_addr(cb, dma_addr)) {
> - ring->bytes += dma_unmap_len(cb, dma_len);
> + *bytes_compl += dma_unmap_len(cb, dma_len);
>   dma_unmap_page(kdev, dma_unmap_addr(cb, dma_addr),
>  dma_unmap_len(cb, dma_len), DMA_TO_DEVICE);
>   dma_unmap_addr_set(cb, dma_addr, 0);
> @@ -808,9 +811,10 @@ static void bcm_sysport_tx_reclaim_one(struct 
> bcm_sysport_tx_ring *ring,
>  static unsigned int __bcm_sysport_tx_reclaim(struct bcm_sysport_priv *priv,
>struct bcm_sysport_tx_ring *ring)
>  {
> - struct net_device *ndev = priv->netdev;
>   unsigned int c_index, last_c_index, last_tx_cn, num_tx_cbs;
> + struct bcm_sysport_stats *stats64 = &priv->stats64;
>   unsigned int pkts_compl = 0, bytes_compl = 0;
> + struct net_device *ndev = priv->netdev;
>   struct bcm_sysport_cb *cb;
>   u32 hw_ind;
>  
> @@ -849,6 +853,11 @@ static unsigned int __bcm_sysport_tx_reclaim(struct 
> bcm_sysport_priv *priv,
>   last_c_index &= (num_tx_cbs - 1);
>   }
>  
> + u64_stats_update_begin(&stats64->syncp);
> + ring->packets += pkts_compl;
> + ring->bytes += bytes_compl;
> + u64_stats_update_end(&stats64->syncp);
> +
>   ring->c_index = c_index;
>  
>   netif_dbg(priv, tx_done, ndev,
> @@ -1671,24 +1680,6 @@ static int bcm_sysport_change_mac(struct net_device 
> *dev, void *p)
>   return 0;
>  }
>  
> -static struct net_device_stats *bcm_sysport_get_nstats(struct net_device 
> *dev)
> -{
> - struct bcm_sysport_priv *priv = netdev_priv(dev);
> - unsigned long tx_bytes = 0, tx_packets = 0;
> - struct bcm_sysport_tx_ring *ring;
> - unsigned int q;
> -
> - for (q = 0; q < dev->num_tx_queues; q++) {
> - ring = &priv->tx_rings[q];
> - tx_bytes += ring->bytes;
> - tx_packets += ring->packets;
> - }
> -
> - dev->stats.tx_bytes = tx_bytes;
> - dev->stats.tx_packets = tx_packets;
> - return &dev->stats;
> -}
> -
>  static void bcm_sysport_netif_start(struct net_device *dev)
>  {
>   struct bcm_sysport_priv *priv = netdev_priv(dev);
> @@ -1923,6 +1914,37 @@ static int bcm_sysport_stop(struct net_device *dev)
>   return 0;
>  }
>  
> +static void bcm_sysport_get_stats64(struct net_device *dev,
> + struct rtnl_link_stats64 *stats)
> +{
> + struct bcm_sysport_priv *priv = netdev_priv(dev);
> + struct bcm_sysport_stats *stats64 = 

net-next virtio_net merge...

2017-08-01 Thread David Miller

I just merged net into net-next and one of the conflicts had to do with
the truesize bug fix in 'net' conflicting with the changes in 'net-next'
which encode the headroom into the contexts for mergeable buffers.

I did my best to resolve this, but if you two would take a closer
look and test out the result I would really appreciate it!

Thanks.


Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Florian Fainelli
On 08/01/2017 03:02 PM, Timur Tabi wrote:
> On 08/01/2017 04:55 PM, Florian Fainelli wrote:
>> This is not specific to your EMAC, a lot of adapters have this problem
>> actually.
>>
>> I wonder if it would make sense to reach for a broader solution where we
>> could have a networking stack panic/oops notifier which will actively
>> clean up the active network devices' RX queue(s) and if tx_pause was
>> enabled, disable it. We could have drivers announce themselves as
>> needing this either via NETIF_F_* feature bit or some other private flag.
> 
> Unfortunately, the problem occurs only when Linux hangs, to the point
> where the driver's interrupt handlers are blocked.  The RX queue is 256
> entries, and the processor has 48 cores, so the EMAC is never going to
> send pause frames in any real-world situation.
> 
> The only time I've seen pause frames sent out is in the lab when I halt
> the cores with a hardware debugger, and only if I have enough network
> traffic that the EMAC picks up.

The size and scale of your system makes it so but imagine e.g: a single
core ~ 1Ghz @ 1Gbits/sec system having the same problems, here you are
quite likely to see the system under panic flooding the network.

Then again your patch is fine and can be revised at any time a broader
facility is offered, I just felt like we actually have a good way with
reasonably driver-agnostic code to possibly deal with that problem.

Implementing such a solution would not be a -stable backport candidate
though
-- 
Florian


Re: Please merge net into net-next

2017-08-01 Thread David Miller
From: Edward Cree 
Date: Mon, 31 Jul 2017 18:25:16 +0100

> Could you please merge net into net-next, so I can rebase my BPF patches?
> Otherwise there's likely to be merge conflicts with the BPF_SUB fix.

This has now been done.


Re: [PATCH 2/2] net: qcom/emac: add software control for pause frame mode

2017-08-01 Thread Florian Fainelli
On 08/01/2017 03:00 PM, Timur Tabi wrote:
> On 08/01/2017 04:51 PM, Florian Fainelli wrote:
> 
>> A few adapters (bcmgenet, bcmsysport) support configuring the pause
>> quanta so it would not be inconceivable to try to update
>> ethtool_pauseparam to include additional fields such as:
> 
> Wouldn't this require a change to the user space tool?

It would yes.

> 
>> - number of pause frames to send where we define an arbitrary high
>> default value (e.g: 0x), N < 0x is something drivers can test
>> for whether they support it, and 0 is only valid if pause is already
>> disabled
>>
>> - pause quanta (16-bits)
>>
>> Private flags are not usually that great and there could be more
>> adapters capable of doing the same pause frame number configuration, but
>> since there is no available knob it's hard to know.
> 
> Well, for the EMAC, the quanta in this case would be either 1 or
> infinite.  For other devices, it could be any combination of values.  In
> a future revision of the hardware, we might support a variable quanta.
> And I suspect that some devices measure the quanta in time, not count.

OK, either way is fine and there should be ways to convert back and
forth without too much trouble.

> 
> How would the user know what the acceptable values are?  If I set the
> quanta to 10 via user space, and my driver truncates that to 1, I don't
> think that would be acceptable.

There can be several ways to deal with that, we can either ask for a
strict rejection before committing the changes to the HW, or we can have
the HW round up/down to the nearest supported values and a user would
call get_pauseparam() to see which value ended up being used.

Anyway food for thought, David decides what gets merged ;)
-- 
Florian
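
Editorial note: the two policies mentioned above, strict rejection versus rounding to the nearest supported value and reading the result back, can be compared in a short userspace sketch. struct demo_dev and the demo_* functions are hypothetical and do not correspond to an existing ethtool interface.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model for the example. */
struct demo_dev {
	const uint16_t *supported;	/* ascending, at least one entry */
	size_t n_supported;
	uint16_t quanta;		/* value currently programmed */
};

/* Policy 1: refuse anything the hardware cannot do exactly. */
int demo_set_quanta_strict(struct demo_dev *d, uint16_t want)
{
	for (size_t i = 0; i < d->n_supported; i++) {
		if (d->supported[i] == want) {
			d->quanta = want;
			return 0;
		}
	}
	return -EINVAL;
}

/* Policy 2: pick the nearest supported value (ties go to the lower one)
 * and let the caller read back what was actually programmed.
 */
void demo_set_quanta_nearest(struct demo_dev *d, uint16_t want)
{
	uint16_t best = d->supported[0];

	for (size_t i = 1; i < d->n_supported; i++) {
		uint16_t cand = d->supported[i];
		unsigned int dist_cand = cand > want ? cand - want : want - cand;
		unsigned int dist_best = best > want ? best - want : want - best;

		if (dist_cand < dist_best)
			best = cand;
	}
	d->quanta = best;
}

/* Read-back step, analogous to querying the setting after changing it. */
uint16_t demo_get_quanta(const struct demo_dev *d)
{
	return d->quanta;
}
```

With supported values {1, 8, 64}, a request for 10 is rejected under the strict policy, while the nearest policy programs 8 and the read-back reports that value.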


[PATCH net-next] liquidio: set sriov_totalvfs correctly

2017-08-01 Thread Felix Manlunas
From: Derek Chickles 

The file /sys/devices/pci000.../sriov_totalvfs is showing a wrong value.
Fix it by calling pci_sriov_set_totalvfs() to set the total number of VFs
available after calculations for the number of PF and VF queues are made.

Signed-off-by: Derek Chickles 
Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 1d8fefa..39a8dca 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -1825,6 +1825,11 @@ static int octeon_chip_specific_setup(struct 
octeon_device *oct)
case OCTEON_CN23XX_PCIID_PF:
oct->chip_id = OCTEON_CN23XX_PF_VID;
ret = setup_cn23xx_octeon_pf_device(oct);
+#ifdef CONFIG_PCI_IOV
+   if (!ret)
+   pci_sriov_set_totalvfs(oct->pci_dev,
+  oct->sriov_info.max_vfs);
+#endif
s = "CN23XX";
break;
 


Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Timur Tabi

On 08/01/2017 04:55 PM, Florian Fainelli wrote:
> This is not specific to your EMAC, a lot of adapters have this problem
> actually.
>
> I wonder if it would make sense to reach for a broader solution where we
> could have a networking stack panic/oops notifier which will actively
> clean up the active network devices' RX queue(s) and if tx_pause was
> enabled, disable it. We could have drivers announce themselves as
> needing this either via NETIF_F_* feature bit or some other private flag.

Unfortunately, the problem occurs only when Linux hangs, to the point
where the driver's interrupt handlers are blocked.  The RX queue is 256
entries, and the processor has 48 cores, so the EMAC is never going to
send pause frames in any real-world situation.

The only time I've seen pause frames sent out is in the lab when I halt
the cores with a hardware debugger, and only if I have enough network
traffic that the EMAC picks up.


--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


[PATCH net-next] net: dsa: Add support for 64-bit statistics

2017-08-01 Thread Florian Fainelli
DSA slave network devices maintain a pair of bytes and packets counters
for each direction, but these are not 64-bit capable. Re-use
pcpu_sw_netstats which contains exactly what we need for that purpose
and update the code path to report 64-bit capable statistics.

Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa.c  |  8 ++--
 net/dsa/dsa_priv.h |  2 ++
 net/dsa/slave.c| 38 +++---
 3 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index a55e2e4087a4..0ba842c08dd3 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -190,6 +190,7 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct 
net_device *dev,
 {
struct dsa_switch_tree *dst = dev->dsa_ptr;
struct sk_buff *nskb = NULL;
+   struct dsa_slave_priv *p;
 
if (unlikely(dst == NULL)) {
kfree_skb(skb);
@@ -207,12 +208,15 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct 
net_device *dev,
}
 
skb = nskb;
+   p = netdev_priv(skb->dev);
skb_push(skb, ETH_HLEN);
skb->pkt_type = PACKET_HOST;
skb->protocol = eth_type_trans(skb, skb->dev);
 
-   skb->dev->stats.rx_packets++;
-   skb->dev->stats.rx_bytes += skb->len;
+   u64_stats_update_begin(&p->stats64.syncp);
+   p->stats64.rx_packets++;
+   p->stats64.rx_bytes += skb->len;
+   u64_stats_update_end(&p->stats64.syncp);
 
netif_receive_skb(skb);
 
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 55982cc39b24..7aa0656296c2 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -77,6 +77,8 @@ struct dsa_slave_priv {
struct sk_buff *(*xmit)(struct sk_buff *skb,
struct net_device *dev);
 
+   struct pcpu_sw_netstats stats64;
+
/* DSA port data, such as switch, port index, etc. */
struct dsa_port *dp;
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 9507bd38cf04..65f3cef85976 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -354,8 +354,10 @@ static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, 
struct net_device *dev)
struct dsa_slave_priv *p = netdev_priv(dev);
struct sk_buff *nskb;
 
-   dev->stats.tx_packets++;
-   dev->stats.tx_bytes += skb->len;
+   u64_stats_update_begin(&p->stats64.syncp);
+   p->stats64.tx_packets++;
+   p->stats64.tx_bytes += skb->len;
+   u64_stats_update_end(&p->stats64.syncp);
 
/* Transmit function may have to reallocate the original SKB,
 * in which case it must have freed it. Only free it here on error.
@@ -594,11 +596,15 @@ static void dsa_slave_get_ethtool_stats(struct net_device 
*dev,
 {
struct dsa_slave_priv *p = netdev_priv(dev);
struct dsa_switch *ds = p->dp->ds;
-
-   data[0] = dev->stats.tx_packets;
-   data[1] = dev->stats.tx_bytes;
-   data[2] = dev->stats.rx_packets;
-   data[3] = dev->stats.rx_bytes;
+   unsigned int start;
+
+   do {
+   start = u64_stats_fetch_begin_irq(&p->stats64.syncp);
+   data[0] = p->stats64.tx_packets;
+   data[1] = p->stats64.tx_bytes;
+   data[2] = p->stats64.rx_packets;
+   data[3] = p->stats64.rx_bytes;
+   } while (u64_stats_fetch_retry_irq(&p->stats64.syncp, start));
if (ds->ops->get_ethtool_stats)
ds->ops->get_ethtool_stats(ds, p->dp->index, data + 4);
 }
@@ -861,6 +867,22 @@ static int dsa_slave_setup_tc(struct net_device *dev, u32 
handle,
}
 }
 
+static void dsa_slave_get_stats64(struct net_device *dev,
+ struct rtnl_link_stats64 *stats)
+{
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   unsigned int start;
+
+   netdev_stats_to_stats64(stats, &dev->stats);
+   do {
+   start = u64_stats_fetch_begin_irq(&p->stats64.syncp);
+   stats->tx_packets = p->stats64.tx_packets;
+   stats->tx_bytes = p->stats64.tx_bytes;
+   stats->rx_packets = p->stats64.rx_packets;
+   stats->rx_bytes = p->stats64.rx_bytes;
+   } while (u64_stats_fetch_retry_irq(&p->stats64.syncp, start));
+}
+
 void dsa_cpu_port_ethtool_init(struct ethtool_ops *ops)
 {
ops->get_sset_count = dsa_cpu_port_get_sset_count;
@@ -936,6 +958,7 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
.ndo_bridge_dellink = switchdev_port_bridge_dellink,
.ndo_get_phys_port_name = dsa_slave_get_phys_port_name,
.ndo_setup_tc   = dsa_slave_setup_tc,
+   .ndo_get_stats64= dsa_slave_get_stats64,
 };
 
 static const struct switchdev_ops dsa_slave_switchdev_ops = {
@@ -1171,6 +1194,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct device 
*parent,
slave_dev->vlan_features = master->vlan_features;
 
p = netdev_priv(slave_dev);
+   u64_stats_init(&p->stats64.syncp);
p->dp = &ds->ports[port];
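
Editorial note: the begin/end and fetch/retry pairs used throughout this patch follow a sequence-counter pattern. The sketch below shows that pattern in userspace C11. It is an illustration under simplifying assumptions, not the kernel's u64_stats implementation: all demo_* names are made up, a single writer is assumed, every access uses sequentially consistent atomics for simplicity, and the reader reports an odd sequence as "retry" instead of waiting for it to become even.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_stats {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic uint64_t packets;
	_Atomic uint64_t bytes;
};

/* Writer side: bump the sequence before and after touching the counters. */
void demo_update_begin(struct demo_stats *s) { atomic_fetch_add(&s->seq, 1); }
void demo_update_end(struct demo_stats *s)   { atomic_fetch_add(&s->seq, 1); }

/* Reader side: remember the sequence seen before reading the counters. */
unsigned int demo_fetch_begin(struct demo_stats *s)
{
	return atomic_load(&s->seq);
}

/* The snapshot must be retried if an update was in progress when the read
 * started (odd sequence) or if the sequence moved while reading.
 */
bool demo_fetch_retry(struct demo_stats *s, unsigned int start)
{
	return (start & 1u) || atomic_load(&s->seq) != start;
}

/* Account one packet of len bytes (single writer assumed). */
void demo_count(struct demo_stats *s, uint64_t len)
{
	demo_update_begin(s);
	atomic_fetch_add(&s->packets, 1);
	atomic_fetch_add(&s->bytes, len);
	demo_update_end(s);
}

/* Take a consistent snapshot of both counters. */
void demo_read(struct demo_stats *s, uint64_t *packets, uint64_t *bytes)
{
	unsigned int start;

	do {
		start = demo_fetch_begin(s);
		*packets = atomic_load(&s->packets);
		*bytes = atomic_load(&s->bytes);
	} while (demo_fetch_retry(s, start));
}
```

The reader never blocks the writer; it only repeats its copy when the sequence shows that an update overlapped with the read.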
  

Re: [PATCH 2/2] net: qcom/emac: add software control for pause frame mode

2017-08-01 Thread Timur Tabi

On 08/01/2017 04:51 PM, Florian Fainelli wrote:


> A few adapters (bcmgenet, bcmsysport) support configuring the pause
> quanta so it would not be inconceivable to try to update
> ethtool_pauseparam to include additional fields such as:

Wouldn't this require a change to the user space tool?

> - number of pause frames to send where we define an arbitrary high
> default value (e.g: 0x), N < 0x is something drivers can test
> for whether they support it, and 0 is only valid if pause is already
> disabled
>
> - pause quanta (16-bits)
>
> Private flags are not usually that great and there could be more
> adapters capable of doing the same pause frame number configuration, but
> since there is no available knob it's hard to know.


Well, for the EMAC, the quanta in this case would be either 1 or 
infinite.  For other devices, it could be any combination of values.  In 
a future revision of the hardware, we might support a variable quanta. 
And I suspect that some devices measure the quanta in time, not count.


How would the user know what the acceptable values are?  If I set the 
quanta to 10 via user space, and my driver truncates that to 1, I don't 
think that would be acceptable.
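
As a point of reference for the units being discussed: in IEEE 802.3 flow control the 16-bit pause value is counted in quanta of 512 bit times, so the wall-clock pause depends on link speed. The sketch below works that out; pause_duration_ns() is a hypothetical helper written for this note, not an existing kernel or ethtool function. Assuming a 1 Gb/s link, the maximum value 0xffff comes to roughly 33.5 ms, which is consistent with the "33ms" figure in the patch comment.

```c
#include <stdint.h>

/* One pause quantum is 512 bit times (IEEE 802.3 Annex 31B). */
#define PAUSE_QUANTUM_BIT_TIMES 512u

/* Hypothetical helper: how long a pause frame carrying 'quanta' asks
 * the link partner to stay quiet, in nanoseconds, at 'speed_mbps'.
 * Returns 0 for an unknown (zero) link speed. */
static uint64_t pause_duration_ns(uint16_t quanta, uint32_t speed_mbps)
{
	if (speed_mbps == 0)
		return 0;

	/* bit time in ns = 1000 / speed_mbps */
	return (uint64_t)quanta * PAUSE_QUANTUM_BIT_TIMES * 1000u / speed_mbps;
}
```

So a driver that silently rounds a requested value to whatever the hardware supports changes the pause time by orders of magnitude, which is the concern raised above.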


--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


Re: [PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Florian Fainelli
On 08/01/2017 02:37 PM, Timur Tabi wrote:
> The EMAC has a curious quirk when RX flow control is enabled and the
> kernel hangs.  With the kernel hung, the EMAC's RX queue soon fills.
> If RX flow control is enabled, the EMAC will then send a non-stop
> stream of pause frames until the system is reset.  The EMAC does not
> have a built-in watchdog.
> 
> In various tests, the pause frame stream sometimes overloads nearby
> switches, effectively disabling the network.  Since the RX queue is
> large and the host processor is more than capable of handling incoming
> packets quickly, the only time the EMAC will send any pause frames is
> when the kernel is hung and unrecoverable.

This is not specific to your EMAC, a lot of adapters have this problem
actually.

I wonder if it would make sense to reach for a broader solution where we
could have a networking stack panic/oops notifier which will actively
clean up the active network devices' RX queue(s) and if tx_pause was
enabled, disable it. We could have drivers announce themselves as
needing this either via NETIF_F_* feature bit or some other private flag.

> 
> To avoid all these problems, we disable flow control autonegotiation
> by default, and only enable receiving pause frames.
> 
> Cc: sta...@vger.kernel.org # 4.11.x
> Signed-off-by: Timur Tabi 
> ---
>  drivers/net/ethernet/qualcomm/emac/emac.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c 
> b/drivers/net/ethernet/qualcomm/emac/emac.c
> index 60850bfa3d32..475c0ea29235 100644
> --- a/drivers/net/ethernet/qualcomm/emac/emac.c
> +++ b/drivers/net/ethernet/qualcomm/emac/emac.c
> @@ -441,8 +441,13 @@ static void emac_init_adapter(struct emac_adapter *adpt)
>   /* others */
>   adpt->preamble = EMAC_PREAMBLE_DEF;
>  
> - /* default to automatic flow control */
> - adpt->automatic = true;
> + /* Disable transmission of pause frames by default, to avoid the
> +  * risk of a pause frame flood that can occur if the kernel hangs.
> +  * We still want to be able to respond to them, however.
> +  */
> + adpt->automatic = false;
> + adpt->tx_flow_control = false;
> + adpt->rx_flow_control = true;
>  }
>  
>  /* Get the clock */
> 


-- 
Florian


Re: sysctl, argument parsing, possible bug

2017-08-01 Thread Cong Wang
On Tue, Aug 1, 2017 at 2:34 PM, Massimo Sala  wrote:
> Do you confirm it is a sysctl parsing bug ?
>
> BusyBox handles these cases, so I think standalone sysctl has to as well.
>
> Or at least someone must update sysctl docs / MAN about this.
>

Maybe. sysctl seems (I have never looked into it) to simply interpret
the dot as a separator; unless it reads the /proc/sys/ directory, it
cannot know that eth0.100 is actually a netdev name.
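
To make the ambiguity concrete, here is a small sketch of a key-to-path translation. It assumes the separator convention that, to my knowledge, the procps-ng sysctl man page describes: whichever of '.' or '/' appears first in the key is the separator, and the other character is swapped. sysctl_key_to_path() is a hypothetical function written for this note and is not taken from the sysctl sources.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PROC_SYS_PREFIX "/proc/sys/"

/* Hypothetical helper: translate a sysctl key into a /proc/sys path.
 * If the first separator seen is '/', the key is used as-is and dots
 * stay literal. Otherwise '.' becomes '/' and '/' becomes '.'.
 * Returns 0 on success, -1 if 'out' is too small. */
static int sysctl_key_to_path(const char *key, char *out, size_t outlen)
{
	const char *first = strpbrk(key, "./");
	int swap = !(first && *first == '/');
	int n = snprintf(out, outlen, PROC_SYS_PREFIX "%s", key);
	char *p;

	if (n < 0 || (size_t)n >= outlen)
		return -1;

	if (swap) {
		for (p = out + strlen(PROC_SYS_PREFIX); *p; p++) {
			if (*p == '.')
				*p = '/';
			else if (*p == '/')
				*p = '.';
		}
	}
	return 0;
}
```

Under this rule the dotted key from the report maps to .../conf/eth0/100/forwarding, which does not exist, while net/ipv4/conf/eth0.100/forwarding or net.ipv4.conf.eth0/100.forwarding would map to the intended file.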


Re: [PATCH 2/2] net: qcom/emac: add software control for pause frame mode

2017-08-01 Thread Florian Fainelli
On 08/01/2017 02:37 PM, Timur Tabi wrote:
> The EMAC has the option of sending only a single pause frame when
> flow control is enabled and the RX queue is full.  Although sending
> only one pause frame has little value, this would allow admins to
> enable automatic flow control without having to worry about the EMAC
> flooding nearby switches with pause frames if the kernel hangs.
> 
> The option is enabled by using the single-pause-mode private flag.

A few adapters (bcmgenet, bcmsysport) support configuring the pause
quanta so it would not be inconceivable to try to update
ethtool_pauseparam to include additional fields such as:

- number of pause frames to send where we define an arbitrary high
default value (e.g: 0x), N < 0x is something drivers can test
for whether they support it, and 0 is only valid if pause is already
disabled

- pause quanta (16-bits)

Private flags are not usually that great and there could be more
adapters capable of doing the same pause frame number configuration, but
since there is no available knob it's hard to know.

> 
> Signed-off-by: Timur Tabi 
> ---
>  drivers/net/ethernet/qualcomm/emac/emac-ethtool.c | 30 
> +++
>  drivers/net/ethernet/qualcomm/emac/emac-mac.c | 22 +
>  drivers/net/ethernet/qualcomm/emac/emac.c |  3 +++
>  drivers/net/ethernet/qualcomm/emac/emac.h |  3 +++
>  4 files changed, 58 insertions(+)
> 
> diff --git a/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c 
> b/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c
> index bbe24639aa5a..c8c6231b87f3 100644
> --- a/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c
> +++ b/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c
> @@ -88,6 +88,8 @@ static void emac_set_msglevel(struct net_device *netdev, 
> u32 data)
>  static int emac_get_sset_count(struct net_device *netdev, int sset)
>  {
>   switch (sset) {
> + case ETH_SS_PRIV_FLAGS:
> + return 1;
>   case ETH_SS_STATS:
>   return EMAC_STATS_LEN;
>   default:
> @@ -100,6 +102,10 @@ static void emac_get_strings(struct net_device *netdev, 
> u32 stringset, u8 *data)
>   unsigned int i;
>  
>   switch (stringset) {
> + case ETH_SS_PRIV_FLAGS:
> + strcpy(data, "single-pause-mode");
> + break;
> +
>   case ETH_SS_STATS:
>   for (i = 0; i < EMAC_STATS_LEN; i++) {
>   strlcpy(data, emac_ethtool_stat_strings[i],
> @@ -230,6 +236,27 @@ static int emac_get_regs_len(struct net_device *netdev)
>   return EMAC_MAX_REG_SIZE * sizeof(u32);
>  }
>  
> +#define EMAC_PRIV_ENABLE_SINGLE_PAUSE  BIT(0)
> +
> +static int emac_set_priv_flags(struct net_device *netdev, u32 flags)
> +{
> + struct emac_adapter *adpt = netdev_priv(netdev);
> +
> + adpt->single_pause_mode = !!(flags & EMAC_PRIV_ENABLE_SINGLE_PAUSE);
> +
> + if (netif_running(netdev))
> + return emac_reinit_locked(adpt);
> +
> + return 0;
> +}
> +
> +static u32 emac_get_priv_flags(struct net_device *netdev)
> +{
> + struct emac_adapter *adpt = netdev_priv(netdev);
> +
> + return adpt->single_pause_mode ? EMAC_PRIV_ENABLE_SINGLE_PAUSE : 0;
> +}
> +
>  static const struct ethtool_ops emac_ethtool_ops = {
>   .get_link_ksettings = phy_ethtool_get_link_ksettings,
>   .set_link_ksettings = phy_ethtool_set_link_ksettings,
> @@ -253,6 +280,9 @@ static int emac_get_regs_len(struct net_device *netdev)
>  
>   .get_regs_len= emac_get_regs_len,
>   .get_regs= emac_get_regs,
> +
> + .set_priv_flags = emac_set_priv_flags,
> + .get_priv_flags = emac_get_priv_flags,
>  };
>  
>  void emac_set_ethtool_ops(struct net_device *netdev)
> diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
> b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
> index bcd4708b3745..0ea3ca09c689 100644
> --- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
> +++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
> @@ -551,6 +551,28 @@ static void emac_mac_start(struct emac_adapter *adpt)
>   mac &= ~(HUGEN | VLAN_STRIP | TPAUSE | SIMR | HUGE | MULTI_ALL |
>DEBUG_MODE | SINGLE_PAUSE_MODE);
>  
> + /* Enable single-pause-frame mode if requested.
> +  *
> +  * If enabled, the EMAC will send a single pause frame when the RX
> +  * queue is full.  This normally leads to packet loss because
> +  * the pause frame disables the remote MAC only for 33ms (the quanta),
> +  * and then the remote MAC continues sending packets even though
> +  * the RX queue is still full.
> +  *
> +  * If disabled, the EMAC sends a pause frame every 31ms until the RX
> +  * queue is no longer full.  Normally, this is the preferred
> +  * method of operation.  However, when the system is hung (e.g.
> +  * cores are halted), the EMAC interrupt handler is never called
> +  * and so the RX queue fills up quickly and stays full.  The resulting
> 

Re: [PATCH V4 net 2/2] net: fix tcp reset packet flowlabel for ipv6

2017-08-01 Thread Shaohua Li
On Tue, Aug 01, 2017 at 02:17:58PM -0700, Cong Wang wrote:
> On Mon, Jul 31, 2017 at 4:00 PM, Shaohua Li  wrote:
> > On Mon, Jul 31, 2017 at 03:35:02PM -0700, Cong Wang wrote:
> >> On Mon, Jul 31, 2017 at 3:19 PM, Shaohua Li  wrote:
> >> >  static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff 
> >> > *skb,
> >> > __be32 flowlabel, bool autolabel,
> >> > -   struct flowi6 *fl6)
> >> > +   struct flowi6 *fl6, u32 hash)
> >> >  {
> >> > -   u32 hash;
> >> > -
> >> > /* @flowlabel may include more than a flow label, eg, the 
> >> > traffic class.
> >> >  * Here we want only the flow label value.
> >> >  */
> >> > @@ -788,7 +786,8 @@ static inline __be32 ip6_make_flowlabel(struct net 
> >> > *net, struct sk_buff *skb,
> >> >  net->ipv6.sysctl.auto_flowlabels != 
> >> > IP6_AUTO_FLOW_LABEL_FORCED))
> >> > return flowlabel;
> >> >
> >> > -   hash = skb_get_hash_flowi6(skb, fl6);
> >> > +   if (skb)
> >> > +   hash = skb_get_hash_flowi6(skb, fl6);
> >>
> >>
> >> Why not just move skb_get_hash_flowi6() to its caller?
> >> This check is not necessary. If you don't want to touch
> >> existing callers, you can just introduce a wrapper:
> >>
> >>
> >> static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff 
> >> *skb,
> >> __be32 flowlabel, bool autolabel,
> >> struct flowi6 *fl6)
> >> {
> >>   u32 hash = skb_get_hash_flowi6(skb, fl6);
> >>   return __ip6_make_flowlabel(net, flowlabel, autolabel, hash);
> >> }
> >
> > this will always call skb_get_hash_flowi6 for the fast path even if auto
> > flowlabel is disabled. I thought we should avoid this.
> 
> Yeah, but you can move the check out too,
> something like:

Is this really better? I don't see the point. I'd rather use my original
patch than this one. There are just a few lines of code here; forcibly
'abstracting' them into functions doesn't make the code better.
 
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index 6eac5cf8f1e6..18ffa824c00a 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -771,31 +771,22 @@ static inline void
> iph_to_flow_copy_v6addrs(struct flow_keys *flow,
> 
>  #define IP6_DEFAULT_AUTO_FLOW_LABELS   IP6_AUTO_FLOW_LABEL_OPTOUT
> 
> -static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff *skb,
> -   __be32 flowlabel, bool autolabel,
> -   struct flowi6 *fl6)
> +static inline bool ip6_need_flowlabel(struct net *net, __be32
> flowlabel, bool autolabel)
>  {
> -   u32 hash;
> -
> /* @flowlabel may include more than a flow label, eg, the traffic 
> class.
>  * Here we want only the flow label value.
>  */
> -   flowlabel &= IPV6_FLOWLABEL_MASK;
> -
> -   if (flowlabel ||
> +   if ((flowlabel & IPV6_FLOWLABEL_MASK) ||
> net->ipv6.sysctl.auto_flowlabels == IP6_AUTO_FLOW_LABEL_OFF ||
> (!autolabel &&
>  net->ipv6.sysctl.auto_flowlabels != IP6_AUTO_FLOW_LABEL_FORCED))
> -   return flowlabel;
> -
> -   hash = skb_get_hash_flowi6(skb, fl6);
> +   return false;
> 
> -   /* Since this is being sent on the wire obfuscate hash a bit
> -* to minimize possbility that any useful information to an
> -* attacker is leaked. Only lower 20 bits are relevant.
> -*/
> -   rol32(hash, 16);
> +   return true;
> +}
> 
> +static inline __be32 __ip6_make_flowlabel(struct net *net, __be32
> flowlabel, u32 hash)
> +{
> flowlabel = (__force __be32)hash & IPV6_FLOWLABEL_MASK;
> 
> if (net->ipv6.sysctl.flowlabel_state_ranges)
> @@ -804,6 +795,19 @@ static inline __be32 ip6_make_flowlabel(struct
> net *net, struct sk_buff *skb,
> return flowlabel;
>  }
> 
> +static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff *skb,
> +   __be32 flowlabel, bool autolabel,
> +   struct flowi6 *fl6)
> +{
> +   u32 hash;
> +
> +   if (!ip6_need_flowlabel(net, flowlabel, autolabel))
> +   return flowlabel & IPV6_FLOWLABEL_MASK;
> +
> +   hash = skb_get_hash_flowi6(skb, fl6);
> +   return __ip6_make_flowlabel(net, flowlabel, hash);
> +}
> +
>  static inline int ip6_default_np_autolabel(struct net *net)
>  {
> switch (net->ipv6.sysctl.auto_flowlabels) {


[PATCH 1/2] [for 4.13] net: qcom/emac: disable flow control autonegotiation by default

2017-08-01 Thread Timur Tabi
The EMAC has a curious quirk when RX flow control is enabled and the
kernel hangs.  With the kernel hung, the EMAC's RX queue soon fills.
If RX flow control is enabled, the EMAC will then send a non-stop
stream of pause frames until the system is reset.  The EMAC does not
have a built-in watchdog.

In various tests, the pause frame stream sometimes overloads nearby
switches, effectively disabling the network.  Since the RX queue is
large and the host processor is more than capable of handling incoming
packets quickly, the only time the EMAC will send any pause frames is
when the kernel is hung and unrecoverable.

To avoid all these problems, we disable flow control autonegotiation
by default, and only enable receiving pause frames.

Cc: sta...@vger.kernel.org # 4.11.x
Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c 
b/drivers/net/ethernet/qualcomm/emac/emac.c
index 60850bfa3d32..475c0ea29235 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -441,8 +441,13 @@ static void emac_init_adapter(struct emac_adapter *adpt)
/* others */
adpt->preamble = EMAC_PREAMBLE_DEF;
 
-   /* default to automatic flow control */
-   adpt->automatic = true;
+   /* Disable transmission of pause frames by default, to avoid the
+* risk of a pause frame flood that can occur if the kernel hangs.
+* We still want to be able to respond to them, however.
+*/
+   adpt->automatic = false;
+   adpt->tx_flow_control = false;
+   adpt->rx_flow_control = true;
 }
 
 /* Get the clock */
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.



[PATCH 0/2] net: qcom/emac: fixes for pause frame floods

2017-08-01 Thread Timur Tabi
The first patch is for 4.13.  It changes the default behavior of the
EMAC driver so that it doesn't send pause frames unless the user 
enables them.

The second patch is for 4.14, but it can be applied to 4.13 if you
want.  It adds the ability for the user to enable a special "single
pause frame" mode that could be useful in some situations.

Timur Tabi (2):
  [for 4.13] net: qcom/emac: disable flow control autonegotiation by
default
  net: qcom/emac: add software control for pause frame mode

 drivers/net/ethernet/qualcomm/emac/emac-ethtool.c | 30 +++
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 22 +
 drivers/net/ethernet/qualcomm/emac/emac.c | 12 +++--
 drivers/net/ethernet/qualcomm/emac/emac.h |  3 +++
 4 files changed, 65 insertions(+), 2 deletions(-)

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.



[PATCH 2/2] net: qcom/emac: add software control for pause frame mode

2017-08-01 Thread Timur Tabi
The EMAC has the option of sending only a single pause frame when
flow control is enabled and the RX queue is full.  Although sending
only one pause frame has little value, this would allow admins to
enable automatic flow control without having to worry about the EMAC
flooding nearby switches with pause frames if the kernel hangs.

The option is enabled by using the single-pause-mode private flag.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-ethtool.c | 30 +++
 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 22 +
 drivers/net/ethernet/qualcomm/emac/emac.c |  3 +++
 drivers/net/ethernet/qualcomm/emac/emac.h |  3 +++
 4 files changed, 58 insertions(+)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c 
b/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c
index bbe24639aa5a..c8c6231b87f3 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-ethtool.c
@@ -88,6 +88,8 @@ static void emac_set_msglevel(struct net_device *netdev, u32 
data)
 static int emac_get_sset_count(struct net_device *netdev, int sset)
 {
switch (sset) {
+   case ETH_SS_PRIV_FLAGS:
+   return 1;
case ETH_SS_STATS:
return EMAC_STATS_LEN;
default:
@@ -100,6 +102,10 @@ static void emac_get_strings(struct net_device *netdev, 
u32 stringset, u8 *data)
unsigned int i;
 
switch (stringset) {
+   case ETH_SS_PRIV_FLAGS:
+   strcpy(data, "single-pause-mode");
+   break;
+
case ETH_SS_STATS:
for (i = 0; i < EMAC_STATS_LEN; i++) {
strlcpy(data, emac_ethtool_stat_strings[i],
@@ -230,6 +236,27 @@ static int emac_get_regs_len(struct net_device *netdev)
return EMAC_MAX_REG_SIZE * sizeof(u32);
 }
 
+#define EMAC_PRIV_ENABLE_SINGLE_PAUSE  BIT(0)
+
+static int emac_set_priv_flags(struct net_device *netdev, u32 flags)
+{
+   struct emac_adapter *adpt = netdev_priv(netdev);
+
+   adpt->single_pause_mode = !!(flags & EMAC_PRIV_ENABLE_SINGLE_PAUSE);
+
+   if (netif_running(netdev))
+   return emac_reinit_locked(adpt);
+
+   return 0;
+}
+
+static u32 emac_get_priv_flags(struct net_device *netdev)
+{
+   struct emac_adapter *adpt = netdev_priv(netdev);
+
+   return adpt->single_pause_mode ? EMAC_PRIV_ENABLE_SINGLE_PAUSE : 0;
+}
+
 static const struct ethtool_ops emac_ethtool_ops = {
.get_link_ksettings = phy_ethtool_get_link_ksettings,
.set_link_ksettings = phy_ethtool_set_link_ksettings,
@@ -253,6 +280,9 @@ static int emac_get_regs_len(struct net_device *netdev)
 
.get_regs_len= emac_get_regs_len,
.get_regs= emac_get_regs,
+
+   .set_priv_flags = emac_set_priv_flags,
+   .get_priv_flags = emac_get_priv_flags,
 };
 
 void emac_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index bcd4708b3745..0ea3ca09c689 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -551,6 +551,28 @@ static void emac_mac_start(struct emac_adapter *adpt)
mac &= ~(HUGEN | VLAN_STRIP | TPAUSE | SIMR | HUGE | MULTI_ALL |
 DEBUG_MODE | SINGLE_PAUSE_MODE);
 
+   /* Enable single-pause-frame mode if requested.
+*
+* If enabled, the EMAC will send a single pause frame when the RX
+* queue is full.  This normally leads to packet loss because
+* the pause frame disables the remote MAC only for 33ms (the quanta),
+* and then the remote MAC continues sending packets even though
+* the RX queue is still full.
+*
+* If disabled, the EMAC sends a pause frame every 31ms until the RX
+* queue is no longer full.  Normally, this is the preferred
+* method of operation.  However, when the system is hung (e.g.
+* cores are halted), the EMAC interrupt handler is never called
+* and so the RX queue fills up quickly and stays full.  The resulting
+* non-stop "flood" of pause frames sometimes has the effect of
+* disabling nearby switches.  In some cases, other nearby switches
+* are also affected, shutting down the entire network.
+*
+* The user can enable or disable single-pause-frame mode
+* via ethtool.
+*/
+   mac |= adpt->single_pause_mode ? SINGLE_PAUSE_MODE : 0;
+
writel_relaxed(csr1, adpt->csr + EMAC_EMAC_WRAPPER_CSR1);
 
writel_relaxed(mac, adpt->base + EMAC_MAC_CTRL);
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c 
b/drivers/net/ethernet/qualcomm/emac/emac.c
index 475c0ea29235..6f5e858ffbf3 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -448,6 +448,9 @@ static void 

Re: sysctl, argument parsing, possible bug

2017-08-01 Thread Massimo Sala
Do you confirm it is a sysctl parsing bug ?

BusyBox handles these cases, so I think standalone sysctl has to as well.

Or at least someone must update sysctl docs / MAN about this.

Best regards, Sala


On 01/08/2017, Cong Wang  wrote:
> On Tue, Aug 1, 2017 at 1:47 PM, Massimo Sala 
> wrote:
>> cat /proc/sys/net/ipv4/conf/eth0.100/forwarding
>> 0
>>
>> sysctl net.ipv4.conf.eth0.100.forwarding
>> error: "net.ipv4.conf.eth0.100.forwarding" is an unknown key
>>
>
> Use echo instead, sysctl doesn't understand eth0.100
> is a netdev name, sigh.
>


Re: [PATCH net-next v2 01/11] net: dsa: PHY device is mandatory for EEE

2017-08-01 Thread Florian Fainelli
On 08/01/2017 01:32 PM, Vivien Didelot wrote:
> The port's PHY and MAC are both implied in EEE. The current code does
> not call the PHY operations if the related device is NULL. Change that
> by returning -ENODEV if there's no PHY device attached to the interface.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: [PATCH net-next v2 06/11] net: dsa: bcm_sf2: remove unneeded supported flags

2017-08-01 Thread Florian Fainelli
On 08/01/2017 01:32 PM, Vivien Didelot wrote:
> The SF2 driver is masking the supported bitfield of its private copy of
> the ports' ethtool_eee structures. It is used nowhere, thus remove it.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Florian Fainelli 
-- 
Florian


Re: sysctl, argument parsing, possible bug

2017-08-01 Thread Cong Wang
On Tue, Aug 1, 2017 at 1:47 PM, Massimo Sala  wrote:
> cat /proc/sys/net/ipv4/conf/eth0.100/forwarding
> 0
>
> sysctl net.ipv4.conf.eth0.100.forwarding
> error: "net.ipv4.conf.eth0.100.forwarding" is an unknown key
>

Use echo instead, sysctl doesn't understand eth0.100
is a netdev name, sigh.


Re: [PATCH V5 2/2] brcmfmac: don't warn user about NVRAM if fallback to the platform one succeeds

2017-08-01 Thread Arend van Spriel
On 31-07-17 17:09, Rafał Miłecki wrote:
> From: Rafał Miłecki 
> 
> Failing to load NVRAM *file* isn't critical if we manage to get platform
> NVRAM in the fallback path. It means warnings like:
> [   10.801506] brcmfmac :01:00.0: Direct firmware load for 
> brcm/brcmfmac43602-pcie.txt failed with error -2
> are unnecessary & disturbing for people with *platform* NVRAM as they
> are not expected to have NVRAM file. This is a very common case for
> Broadcom home routers.
> 
> Instead of printing warning immediately within the firmware subsystem
> let's try our fallback code first. If that fails as well, then it's a
> right moment to print an error.
> 
> This should reduce amount of false reports from users seeing this
> warning while having wireless working perfectly fine with the platform
> NVRAM.

Reviewed-by: Arend van Spriel 
> Signed-off-by: Rafał Miłecki 
> ---
> V2: Update commit message as it wasn't clear enough (thanks Andy) & add extra
> messages to the firmware.c.
> V3: Set FW_OPT_UEVENT to don't change behavior
> V4: Switch to the new request_firmware_async syntax
> V5: Rebase, update commit message, resend after drvdata discussion
> ---
>  .../wireless/broadcom/brcm80211/brcmfmac/firmware.c| 18 
> +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c 
> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c
> index d231042f19d6..524442b3870f 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c
> @@ -462,8 +462,14 @@ static void brcmf_fw_request_nvram_done(const struct 
> firmware *fw, void *ctx)
>   raw_nvram = false;
>   } else {
>   data = bcm47xx_nvram_get_contents(&data_len);
> - if (!data && !(fwctx->flags & BRCMF_FW_REQ_NV_OPTIONAL))
> - goto fail;
> + if (!data) {
> + brcmf_dbg(TRACE, "Failed to get platform NVRAM\n");

Better make this INFO level instead of TRACE. The intent of TRACE level
is for entry/exit points in functions.

Regards,
Arend


Re: [PATCH V4 net 2/2] net: fix tcp reset packet flowlabel for ipv6

2017-08-01 Thread Cong Wang
On Mon, Jul 31, 2017 at 4:00 PM, Shaohua Li  wrote:
> On Mon, Jul 31, 2017 at 03:35:02PM -0700, Cong Wang wrote:
>> On Mon, Jul 31, 2017 at 3:19 PM, Shaohua Li  wrote:
>> >  static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff 
>> > *skb,
>> > __be32 flowlabel, bool autolabel,
>> > -   struct flowi6 *fl6)
>> > +   struct flowi6 *fl6, u32 hash)
>> >  {
>> > -   u32 hash;
>> > -
>> > /* @flowlabel may include more than a flow label, eg, the traffic 
>> > class.
>> >  * Here we want only the flow label value.
>> >  */
>> > @@ -788,7 +786,8 @@ static inline __be32 ip6_make_flowlabel(struct net 
>> > *net, struct sk_buff *skb,
>> >  net->ipv6.sysctl.auto_flowlabels != 
>> > IP6_AUTO_FLOW_LABEL_FORCED))
>> > return flowlabel;
>> >
>> > -   hash = skb_get_hash_flowi6(skb, fl6);
>> > +   if (skb)
>> > +   hash = skb_get_hash_flowi6(skb, fl6);
>>
>>
>> Why not just move skb_get_hash_flowi6() to its caller?
>> This check is not necessary. If you don't want to touch
>> existing callers, you can just introduce a wrapper:
>>
>>
>> static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff *skb,
>> __be32 flowlabel, bool autolabel,
>> struct flowi6 *fl6)
>> {
>>   u32 hash = skb_get_hash_flowi6(skb, fl6);
>>   return __ip6_make_flowlabel(net, flowlabel, autolabel, hash);
>> }
>
> this will always call skb_get_hash_flowi6 for the fast path even if auto
> flowlabel is disabled. I thought we should avoid this.

Yeah, but you can move the check out too,
something like:

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 6eac5cf8f1e6..18ffa824c00a 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -771,31 +771,22 @@ static inline void
iph_to_flow_copy_v6addrs(struct flow_keys *flow,

 #define IP6_DEFAULT_AUTO_FLOW_LABELS   IP6_AUTO_FLOW_LABEL_OPTOUT

-static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff *skb,
-   __be32 flowlabel, bool autolabel,
-   struct flowi6 *fl6)
+static inline bool ip6_need_flowlabel(struct net *net, __be32
flowlabel, bool autolabel)
 {
-   u32 hash;
-
/* @flowlabel may include more than a flow label, eg, the traffic class.
 * Here we want only the flow label value.
 */
-   flowlabel &= IPV6_FLOWLABEL_MASK;
-
-   if (flowlabel ||
+   if ((flowlabel & IPV6_FLOWLABEL_MASK) ||
net->ipv6.sysctl.auto_flowlabels == IP6_AUTO_FLOW_LABEL_OFF ||
(!autolabel &&
 net->ipv6.sysctl.auto_flowlabels != IP6_AUTO_FLOW_LABEL_FORCED))
-   return flowlabel;
-
-   hash = skb_get_hash_flowi6(skb, fl6);
+   return false;

-   /* Since this is being sent on the wire obfuscate hash a bit
-* to minimize possbility that any useful information to an
-* attacker is leaked. Only lower 20 bits are relevant.
-*/
-   rol32(hash, 16);
+   return true;
+}

+static inline __be32 __ip6_make_flowlabel(struct net *net, __be32
flowlabel, u32 hash)
+{
flowlabel = (__force __be32)hash & IPV6_FLOWLABEL_MASK;

if (net->ipv6.sysctl.flowlabel_state_ranges)
@@ -804,6 +795,19 @@ static inline __be32 ip6_make_flowlabel(struct
net *net, struct sk_buff *skb,
return flowlabel;
 }

+static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff *skb,
+   __be32 flowlabel, bool autolabel,
+   struct flowi6 *fl6)
+{
+   u32 hash;
+
+   if (!ip6_need_flowlabel(net, flowlabel, autolabel))
+   return flowlabel & IPV6_FLOWLABEL_MASK;
+
+   hash = skb_get_hash_flowi6(skb, fl6);
+   return __ip6_make_flowlabel(net, flowlabel, hash);
+}
+
 static inline int ip6_default_np_autolabel(struct net *net)
 {
switch (net->ipv6.sysctl.auto_flowlabels) {
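
The shape of the split proposed in the diff above can be sketched in plain C: a cheap predicate decides whether an automatic label is needed at all, and the hash is only computed when it is. This is a userspace illustration, not the kernel code. FLOWLABEL_MASK is a host-order stand-in for IPV6_FLOWLABEL_MASK, the enum values are stand-ins for the IP6_AUTO_FLOW_LABEL_* settings, and need_auto_flowlabel(), make_flowlabel() and counting_hash() are hypothetical names. The hash source is passed in as a callback, so a caller without an skb could supply its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-order stand-in for IPV6_FLOWLABEL_MASK (lower 20 bits). */
#define FLOWLABEL_MASK 0x000FFFFFu

/* Stand-ins for the auto_flowlabels sysctl settings. */
enum auto_flowlabel_mode { AUTO_OFF, AUTO_OPTOUT, AUTO_OPTIN, AUTO_FORCED };

typedef uint32_t (*hash_fn)(void *ctx);

/* Cheap check: is an automatically generated label wanted here? */
static bool need_auto_flowlabel(enum auto_flowlabel_mode mode,
				uint32_t flowlabel, bool autolabel)
{
	if (flowlabel & FLOWLABEL_MASK)
		return false;	/* caller already chose a label */
	if (mode == AUTO_OFF)
		return false;
	if (!autolabel && mode != AUTO_FORCED)
		return false;
	return true;
}

/* Only call the (possibly expensive) hash when a label is needed. */
static uint32_t make_flowlabel(enum auto_flowlabel_mode mode,
			       uint32_t flowlabel, bool autolabel,
			       hash_fn get_hash, void *ctx)
{
	if (!need_auto_flowlabel(mode, flowlabel, autolabel))
		return flowlabel & FLOWLABEL_MASK;

	return get_hash(ctx) & FLOWLABEL_MASK;
}

/* Example hash source that counts how often it is invoked. */
static uint32_t counting_hash(void *ctx)
{
	unsigned int *calls = ctx;

	(*calls)++;
	return 0xABCDE123u;
}
```

With counting_hash() as the callback, one can check that the fast path (label already set, or auto labels off) never touches the hash, which was the objection to the simpler wrapper.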


Re: [PATCH net-next v2 11/11] net: dsa: rename switch EEE ops

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:32:41PM -0400, Vivien Didelot wrote:
> To avoid confusion with the PHY EEE settings, rename the .set_eee and
> .get_eee ops to respectively .set_mac_eee and .get_mac_eee.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH net-next v2 10/11] net: dsa: mv88e6xxx: remove EEE support

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:32:40PM -0400, Vivien Didelot wrote:
> The PHY's EEE settings are already accessed by the DSA layer through the
> Marvell PHY driver and there is nothing to be done for switch's MACs.
> 
> Remove all EEE support from the mv88e6xxx driver and simply return 0
> from the EEE ops.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH net-next v2 09/11] net: dsa: remove PHY device argument from .set_eee

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:32:39PM -0400, Vivien Didelot wrote:
> The DSA switch operations for EEE are only meant to configure a port's
> MAC EEE settings. The port's PHY EEE settings are accessed by the DSA
> layer and must be made available via a proper PHY driver.
> 
> In order to reduce this confusion, remove the phy_device argument from
> the .set_eee operation.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH net-next v2 08/11] net: dsa: call phy_init_eee in DSA layer

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:32:38PM -0400, Vivien Didelot wrote:
> All DSA drivers are calling phy_init_eee if eee_enabled is true.
> 
> Move up this statement in the DSA layer to simplify the DSA drivers.
> qca8k does not require to cache the ethtool_eee structures from now on.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH net-next v2 07/11] net: dsa: mv88e6xxx: call phy_init_eee

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:32:37PM -0400, Vivien Didelot wrote:
> It is safer to init the EEE before the DSA layer call
> phy_ethtool_set_eee, as sf2 and qca8k are doing.

I can understand making all the drivers do the same thing, but the
next patch deletes this change, making it pointless.

My preference would be to drop this. But it is not a strong preference.

  Andrew


Re: [PATCH net-next v2 01/11] net: dsa: PHY device is mandatory for EEE

2017-08-01 Thread Andrew Lunn
On Tue, Aug 01, 2017 at 04:32:31PM -0400, Vivien Didelot wrote:
> The port's PHY and MAC are both implied in EEE. The current code does
> not call the PHY operations if the related device is NULL. Change that
> by returning -ENODEV if there's no PHY device attached to the interface.
> 
> Signed-off-by: Vivien Didelot 

Reviewed-by: Andrew Lunn 

Andrew


Re: [PATCH net 1/7] b44: Initialize 64-bit stats seqcount

2017-08-01 Thread Michael Chan
On Tue, Aug 1, 2017 at 12:11 PM, Florian Fainelli  wrote:
> On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a
> lockdep splat indicating this seqcount is not correctly initialized, fix
> that.
>
> Fixes: eeda8585522b ("b44: add 64 bit stats")
> Signed-off-by: Florian Fainelli 

Acked-by: Michael Chan 

Thanks.


Re: [PATCH net] ppp: Fix a scheduling-while-atomic bug in del_chan

2017-08-01 Thread Cong Wang
On Mon, Jul 31, 2017 at 3:07 AM,   wrote:
> From: Gao Feng 
>
> PPTP sets pptp_sock_destruct as the sock's sk_destruct, which triggers
> this bug when __sk_free is invoked in atomic context, because of the
> call path pptp_sock_destruct->del_chan->synchronize_rcu.
>
> Move the synchronize_rcu from del_chan to pptp_release. This is the
> only case that frees the sock and needs the synchronize_rcu.

I don't understand the last part.

From my understanding, this RCU is supposed to protect the pppox_sock
pointers in 'callid_sock' which could be NULL'ed in del_chan(). And the
pppox_sock is freed when the last refcnt is gone, that is, when the sock
destructor is called. pptp_release() is ONLY called when the fd in user-space
is gone, not necessarily the last refcnt.


Re: [PATCH net 2/3] tcp: enable xmit timer fix by having TLP use time when RTO should fire

2017-08-01 Thread Eric Dumazet
On Tue, 2017-08-01 at 10:35 -0400, Neal Cardwell wrote:
> On Tue, Aug 1, 2017 at 3:22 AM, Eric Dumazet  wrote:
> > On Mon, 2017-07-31 at 22:58 -0400, Neal Cardwell wrote:
> >> @@ -2418,13 +2418,9 @@ bool tcp_schedule_loss_probe(struct sock *sk)
> >>   timeout = max_t(u32, timeout, msecs_to_jiffies(10));
> >>
> >>   /* If RTO is shorter, just schedule TLP in its place. */
> >
> > I have a hard time reading this comment.
> >
> > We are here trying to arm a timer based on TLP.
> >
> > If RTO is shorter, we'll arm the timer based on RTO instead of TLP.
> >
> > Is "If RTO is shorter, just schedule TLP in its place." really correct ?
> >
> > I suggest we reword the comment or simply get rid of it now that the
> > code is more obvious.
> 
> OK, how about:
> 
>   /* If the RTO formula yields an earlier time, then use that time. */
> 

Sounds better :)

> We can also add a reference to the RACK/TLP Internet Draft at the top
> of tcp_schedule_loss_probe().
> 
> Whatever wording we decide on, I am happy to send a patch for net-next
> once this fix is merged into net-next.

Sure.
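
The reworded comment can be illustrated with a small userspace sketch.
This is not the kernel code: the helper name probe_timeout() and its
parameters are hypothetical, times are in arbitrary ticks, and ignoring a
non-positive RTO delta is an assumption of the sketch.

```c
#include <stdint.h>

/* Hypothetical helper, not the kernel's tcp_schedule_loss_probe().
 * tlp_timeout: delay computed by the TLP formula, in ticks.
 * rto_delta:   how far in the future the RTO would fire, in ticks;
 *              a value <= 0 is ignored (assumption of this sketch).
 */
uint32_t probe_timeout(uint32_t tlp_timeout, int64_t rto_delta)
{
	/* If the RTO formula yields an earlier time, then use that time. */
	if (rto_delta > 0 && (uint64_t)rto_delta < tlp_timeout)
		return (uint32_t)rto_delta;

	return tlp_timeout;
}
```

In other words the timer is armed for whichever of the two deadlines comes
first, which is what the new wording says directly.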




[PATCH net-next v2 06/11] net: dsa: bcm_sf2: remove unneeded supported flags

2017-08-01 Thread Vivien Didelot
The SF2 driver is masking the supported bitfield of its private copy of
the ports' ethtool_eee structures. It is used nowhere, thus remove it.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/bcm_sf2.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 648f91b58d1e..aef475f1ce06 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -327,12 +327,8 @@ static void bcm_sf2_port_disable(struct dsa_switch *ds, int port,
 static int bcm_sf2_eee_init(struct dsa_switch *ds, int port,
struct phy_device *phy)
 {
-   struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
-   struct ethtool_eee *p = &priv->port_sts[port].eee;
int ret;
 
-   p->supported = (SUPPORTED_1000baseT_Full | SUPPORTED_100baseT_Full);
-
ret = phy_init_eee(phy, 0);
if (ret)
return 0;
-- 
2.13.3



[PATCH net-next v2 02/11] net: dsa: qca8k: fix EEE init

2017-08-01 Thread Vivien Didelot
The qca8k driver evidently copied its EEE setup code from the sf2 driver:

if (e->eee_enabled) {
p->eee_enabled = qca8k_eee_init(ds, port, phydev);
if (!p->eee_enabled)
ret = -EOPNOTSUPP;
}

But it did not use the same logic for the EEE init routine, which is
"Returns 0 if EEE was not enabled, or 1 otherwise". This results in
returning -EOPNOTSUPP on success and caching EEE enabled on failure.

This patch fixes the returned value of qca8k_eee_init.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/qca8k.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index b3bee7eab45f..e076ab23d4df 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -666,11 +666,11 @@ qca8k_eee_init(struct dsa_switch *ds, int port,
 
ret = phy_init_eee(phy, 0);
if (ret)
-   return ret;
+   return 0;
 
qca8k_eee_enable_set(ds, port, true);
 
-   return 0;
+   return 1;
 }
 
 static int
-- 
2.13.3
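
The inverted return convention described in the commit message above can
be reproduced in a userspace sketch. Everything here is a stand-in:
fake_phy_init_eee(), eee_init_old(), eee_init_fixed(), set_eee() and
struct eee_result are hypothetical names, and the PHY is reduced to a
single "supports EEE" flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for phy_init_eee(): 0 on success, negative errno on failure. */
static int fake_phy_init_eee(bool phy_supports_eee)
{
	return phy_supports_eee ? 0 : -EOPNOTSUPP;
}

/* Pre-patch style init: passes the errno-style result through. */
int eee_init_old(bool phy_supports_eee)
{
	int ret = fake_phy_init_eee(phy_supports_eee);

	if (ret)
		return ret;
	return 0;
}

/* Fixed init: returns 0 if EEE was not enabled, or 1 otherwise. */
int eee_init_fixed(bool phy_supports_eee)
{
	if (fake_phy_init_eee(phy_supports_eee))
		return 0;
	return 1;
}

struct eee_result {
	int status;	/* value handed back to the caller */
	int cached;	/* value stored as eee_enabled */
};

/* Caller logic from the commit message, parameterised on the init routine. */
struct eee_result set_eee(int (*init)(bool), bool phy_supports_eee)
{
	struct eee_result r = { 0, 0 };

	r.cached = init(phy_supports_eee);
	if (!r.cached)
		r.status = -EOPNOTSUPP;
	return r;
}
```

With eee_init_old() a capable PHY yields -EOPNOTSUPP and an incapable one
yields success with a non-zero cached flag; with eee_init_fixed() both
cases behave as the caller expects.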



[PATCH net-next v2 04/11] net: dsa: qca8k: do not cache unneeded EEE fields

2017-08-01 Thread Vivien Didelot
The qca8k driver is currently caching a bitfield of the supported member
of an ethtool_eee private structure, which is unused.

Only the eee_enabled field of the private ethtool_eee copy is updated,
thus using p->advertised and p->lp_advertised is also erroneous.

Remove the usage of these private ethtool_eee members and only rely on
phy_ethtool_get_eee to assign the eee_active member.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/qca8k.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 9d6b5d2f7a4a..c316c55aabc6 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -658,12 +658,8 @@ static int
 qca8k_eee_init(struct dsa_switch *ds, int port,
   struct phy_device *phy)
 {
-   struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-   struct ethtool_eee *p = &priv->port_sts[port].eee;
int ret;
 
-   p->supported = (SUPPORTED_1000baseT_Full | SUPPORTED_100baseT_Full);
-
ret = phy_init_eee(phy, 0);
if (ret)
return 0;
@@ -705,12 +701,7 @@ qca8k_get_eee(struct dsa_switch *ds, int port,
int ret;
 
ret = phy_ethtool_get_eee(netdev->phydev, p);
-   if (!ret)
-   e->eee_active =
-   !!(p->supported & p->advertised & p->lp_advertised);
-   else
-   e->eee_active = 0;
-
+   e->eee_active = p->eee_active;
e->eee_enabled = p->eee_enabled;
 
return ret;
-- 
2.13.3



[PATCH net-next v2 11/11] net: dsa: rename switch EEE ops

2017-08-01 Thread Vivien Didelot
To avoid confusion with the PHY EEE settings, rename the .set_eee and
.get_eee ops to respectively .set_mac_eee and .get_mac_eee.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/bcm_sf2.c| 12 ++--
 drivers/net/dsa/mv88e6xxx/chip.c | 12 ++--
 drivers/net/dsa/qca8k.c  |  9 -
 include/net/dsa.h| 10 +-
 net/dsa/slave.c  |  8 
 5 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index ce886345d8d2..6bbfa6ea1efb 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -338,8 +338,8 @@ static int bcm_sf2_eee_init(struct dsa_switch *ds, int port,
return 1;
 }
 
-static int bcm_sf2_sw_get_eee(struct dsa_switch *ds, int port,
- struct ethtool_eee *e)
+static int bcm_sf2_sw_get_mac_eee(struct dsa_switch *ds, int port,
+ struct ethtool_eee *e)
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
struct ethtool_eee *p = &priv->port_sts[port].eee;
@@ -352,8 +352,8 @@ static int bcm_sf2_sw_get_eee(struct dsa_switch *ds, int port,
return 0;
 }
 
-static int bcm_sf2_sw_set_eee(struct dsa_switch *ds, int port,
- struct ethtool_eee *e)
+static int bcm_sf2_sw_set_mac_eee(struct dsa_switch *ds, int port,
+ struct ethtool_eee *e)
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
struct ethtool_eee *p = &priv->port_sts[port].eee;
@@ -1011,8 +1011,8 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.set_wol= bcm_sf2_sw_set_wol,
.port_enable= bcm_sf2_port_setup,
.port_disable   = bcm_sf2_port_disable,
-   .get_eee= bcm_sf2_sw_get_eee,
-   .set_eee= bcm_sf2_sw_set_eee,
+   .get_mac_eee= bcm_sf2_sw_get_mac_eee,
+   .set_mac_eee= bcm_sf2_sw_set_mac_eee,
.port_bridge_join   = b53_br_join,
.port_bridge_leave  = b53_br_leave,
.port_stp_state_set = b53_br_set_stp_state,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index aa0c5493fb9d..521738c4cd17 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -810,15 +810,15 @@ static void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
mutex_unlock(&chip->reg_lock);
 }
 
-static int mv88e6xxx_get_eee(struct dsa_switch *ds, int port,
-struct ethtool_eee *e)
+static int mv88e6xxx_get_mac_eee(struct dsa_switch *ds, int port,
+struct ethtool_eee *e)
 {
/* Nothing to do on the port's MAC */
return 0;
 }
 
-static int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
-struct ethtool_eee *e)
+static int mv88e6xxx_set_mac_eee(struct dsa_switch *ds, int port,
+struct ethtool_eee *e)
 {
/* Nothing to do on the port's MAC */
return 0;
@@ -3890,8 +3890,8 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
.get_sset_count = mv88e6xxx_get_sset_count,
.port_enable= mv88e6xxx_port_enable,
.port_disable   = mv88e6xxx_port_disable,
-   .set_eee= mv88e6xxx_set_eee,
-   .get_eee= mv88e6xxx_get_eee,
+   .get_mac_eee= mv88e6xxx_get_mac_eee,
+   .set_mac_eee= mv88e6xxx_set_mac_eee,
.get_eeprom_len = mv88e6xxx_get_eeprom_len,
.get_eeprom = mv88e6xxx_get_eeprom,
.set_eeprom = mv88e6xxx_set_eeprom,
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index e209e229ed4c..36c169b0c705 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -638,7 +638,7 @@ qca8k_get_sset_count(struct dsa_switch *ds)
 }
 
 static int
-qca8k_set_eee(struct dsa_switch *ds, int port, struct ethtool_eee *eee)
+qca8k_set_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *eee)
 {
struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
u32 lpi_en = QCA8K_REG_EEE_CTRL_LPI_EN(port);
@@ -657,8 +657,7 @@ qca8k_set_eee(struct dsa_switch *ds, int port, struct ethtool_eee *eee)
 }
 
 static int
-qca8k_get_eee(struct dsa_switch *ds, int port,
- struct ethtool_eee *e)
+qca8k_get_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e)
 {
/* Nothing to do on the port's MAC */
return 0;
@@ -863,8 +862,8 @@ static const struct dsa_switch_ops qca8k_switch_ops = {
.phy_write  = qca8k_phy_write,
.get_ethtool_stats  = qca8k_get_ethtool_stats,
.get_sset_count = qca8k_get_sset_count,
-   .get_eee= qca8k_get_eee,
-   .set_eee= qca8k_set_eee,
+ 

[PATCH net-next v2 03/11] net: dsa: qca8k: enable EEE once

2017-08-01 Thread Vivien Didelot
If EEE is requested to be enabled, qca8k_set_eee calls qca8k_eee_enable_set
twice (because it is already called in qca8k_eee_init). Fix that.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/qca8k.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index e076ab23d4df..9d6b5d2f7a4a 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -684,12 +684,13 @@ qca8k_set_eee(struct dsa_switch *ds, int port,
 
p->eee_enabled = e->eee_enabled;
 
-   if (e->eee_enabled) {
+   if (!p->eee_enabled) {
+   qca8k_eee_enable_set(ds, port, false);
+   } else {
p->eee_enabled = qca8k_eee_init(ds, port, phydev);
if (!p->eee_enabled)
ret = -EOPNOTSUPP;
}
-   qca8k_eee_enable_set(ds, port, p->eee_enabled);
 
return ret;
 }
-- 
2.13.3
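
The double call can be made visible with a counter in a userspace sketch.
The names below (hw_writes, eee_enable_set(), eee_init(), set_eee_before(),
set_eee_after()) are hypothetical stand-ins for the qca8k functions, and
the PHY init is assumed to always succeed.

```c
#include <stdbool.h>

static int hw_writes;	/* counts calls to the stand-in register write */

/* Stand-in for qca8k_eee_enable_set(): only counts invocations. */
static void eee_enable_set(bool enable)
{
	(void)enable;
	hw_writes++;
}

/* Stand-in for qca8k_eee_init(): enables EEE itself and reports success. */
static int eee_init(void)
{
	eee_enable_set(true);
	return 1;
}

/* Flow before the patch: unconditional enable_set after the init. */
int set_eee_before(bool want_enabled)
{
	int enabled = want_enabled;

	hw_writes = 0;
	if (want_enabled)
		enabled = eee_init();
	eee_enable_set(enabled);
	return hw_writes;
}

/* Flow after the patch: enable_set is called directly only to disable. */
int set_eee_after(bool want_enabled)
{
	hw_writes = 0;
	if (!want_enabled)
		eee_enable_set(false);
	else
		eee_init();
	return hw_writes;
}
```

Each function returns the number of register writes it performed, so the
enable path drops from two writes to one while the disable path is
unchanged.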



[PATCH net-next v2 10/11] net: dsa: mv88e6xxx: remove EEE support

2017-08-01 Thread Vivien Didelot
The PHY's EEE settings are already accessed by the DSA layer through the
Marvell PHY driver and there is nothing to be done for the switch's MACs.

Remove all EEE support from the mv88e6xxx driver and simply return 0
from the EEE ops.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 74 ++-
 drivers/net/dsa/mv88e6xxx/chip.h |  6 ---
 drivers/net/dsa/mv88e6xxx/phy.c  | 96 
 drivers/net/dsa/mv88e6xxx/phy.h  | 22 -
 drivers/net/dsa/mv88e6xxx/port.c | 17 ---
 drivers/net/dsa/mv88e6xxx/port.h |  3 --
 6 files changed, 4 insertions(+), 214 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index aaa96487f21f..aa0c5493fb9d 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -810,56 +810,18 @@ static void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
mutex_unlock(&chip->reg_lock);
 }
 
-static int mv88e6xxx_energy_detect_read(struct mv88e6xxx_chip *chip, int port,
-   struct ethtool_eee *eee)
-{
-   int err;
-
-   if (!chip->info->ops->phy_energy_detect_read)
-   return -EOPNOTSUPP;
-
-   /* assign eee->eee_enabled and eee->tx_lpi_enabled */
-   err = chip->info->ops->phy_energy_detect_read(chip, port, eee);
-   if (err)
-   return err;
-
-   /* assign eee->eee_active */
-   return mv88e6xxx_port_status_eee(chip, port, eee);
-}
-
-static int mv88e6xxx_energy_detect_write(struct mv88e6xxx_chip *chip, int port,
-struct ethtool_eee *eee)
-{
-   if (!chip->info->ops->phy_energy_detect_write)
-   return -EOPNOTSUPP;
-
-   return chip->info->ops->phy_energy_detect_write(chip, port, eee);
-}
-
 static int mv88e6xxx_get_eee(struct dsa_switch *ds, int port,
 struct ethtool_eee *e)
 {
-   struct mv88e6xxx_chip *chip = ds->priv;
-   int err;
-
-   mutex_lock(&chip->reg_lock);
-   err = mv88e6xxx_energy_detect_read(chip, port, e);
-   mutex_unlock(&chip->reg_lock);
-
-   return err;
+   /* Nothing to do on the port's MAC */
+   return 0;
 }
 
 static int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
 struct ethtool_eee *e)
 {
-   struct mv88e6xxx_chip *chip = ds->priv;
-   int err;
-
-   mutex_lock(&chip->reg_lock);
-   err = mv88e6xxx_energy_detect_write(chip, port, e);
-   mutex_unlock(&chip->reg_lock);
-
-   return err;
+   /* Nothing to do on the port's MAC */
+   return 0;
 }
 
 static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
@@ -2521,8 +2483,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
.phy_read = mv88e6xxx_g2_smi_phy_read,
.phy_write = mv88e6xxx_g2_smi_phy_write,
-   .phy_energy_detect_read = mv88e6352_phy_energy_detect_read,
-   .phy_energy_detect_write = mv88e6352_phy_energy_detect_write,
.port_set_link = mv88e6xxx_port_set_link,
.port_set_duplex = mv88e6xxx_port_set_duplex,
.port_set_rgmii_delay = mv88e6390_port_set_rgmii_delay,
@@ -2648,8 +2608,6 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
.phy_read = mv88e6xxx_g2_smi_phy_read,
.phy_write = mv88e6xxx_g2_smi_phy_write,
-   .phy_energy_detect_read = mv88e6352_phy_energy_detect_read,
-   .phy_energy_detect_write = mv88e6352_phy_energy_detect_write,
.port_set_link = mv88e6xxx_port_set_link,
.port_set_duplex = mv88e6xxx_port_set_duplex,
.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
@@ -2719,8 +2677,6 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
.phy_read = mv88e6xxx_g2_smi_phy_read,
.phy_write = mv88e6xxx_g2_smi_phy_write,
-   .phy_energy_detect_read = mv88e6352_phy_energy_detect_read,
-   .phy_energy_detect_write = mv88e6352_phy_energy_detect_write,
.port_set_link = mv88e6xxx_port_set_link,
.port_set_duplex = mv88e6xxx_port_set_duplex,
.port_set_rgmii_delay = mv88e6352_port_set_rgmii_delay,
@@ -2784,8 +2740,6 @@ static const struct mv88e6xxx_ops mv88e6190_ops = {
.set_switch_mac = mv88e6xxx_g2_set_switch_mac,
.phy_read = mv88e6xxx_g2_smi_phy_read,
.phy_write = mv88e6xxx_g2_smi_phy_write,
-   .phy_energy_detect_read = mv88e6390_phy_energy_detect_read,
-   .phy_energy_detect_write = mv88e6390_phy_energy_detect_write,
.port_set_link = mv88e6xxx_port_set_link,
.port_set_duplex = mv88e6xxx_port_set_duplex,
.port_set_rgmii_delay = mv88e6390_port_set_rgmii_delay,
@@ -2821,8 +2775,6 @@ static const struct mv88e6xxx_ops mv88e6190x_ops = {
.set_switch_mac = 

[PATCH net-next v2 09/11] net: dsa: remove PHY device argument from .set_eee

2017-08-01 Thread Vivien Didelot
The DSA switch operations for EEE are only meant to configure a port's
MAC EEE settings. The port's PHY EEE settings are accessed by the DSA
layer and must be made available via a proper PHY driver.

In order to reduce this confusion, remove the phy_device argument from
the .set_eee operation.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/bcm_sf2.c|  1 -
 drivers/net/dsa/mv88e6xxx/chip.c |  2 +-
 drivers/net/dsa/qca8k.c  | 14 +++---
 include/net/dsa.h|  1 -
 net/dsa/slave.c  |  2 +-
 5 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 9d10aac8f241..ce886345d8d2 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -353,7 +353,6 @@ static int bcm_sf2_sw_get_eee(struct dsa_switch *ds, int port,
 }
 
 static int bcm_sf2_sw_set_eee(struct dsa_switch *ds, int port,
- struct phy_device *phydev,
  struct ethtool_eee *e)
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 647d5d45c1d6..aaa96487f21f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -850,7 +850,7 @@ static int mv88e6xxx_get_eee(struct dsa_switch *ds, int port,
 }
 
 static int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
-struct phy_device *phydev, struct ethtool_eee *e)
+struct ethtool_eee *e)
 {
struct mv88e6xxx_chip *chip = ds->priv;
int err;
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index bfe0172ae6cc..e209e229ed4c 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -637,8 +637,8 @@ qca8k_get_sset_count(struct dsa_switch *ds)
return ARRAY_SIZE(ar8327_mib);
 }
 
-static void
-qca8k_eee_enable_set(struct dsa_switch *ds, int port, bool enable)
+static int
+qca8k_set_eee(struct dsa_switch *ds, int port, struct ethtool_eee *eee)
 {
struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
u32 lpi_en = QCA8K_REG_EEE_CTRL_LPI_EN(port);
@@ -646,20 +646,12 @@ qca8k_eee_enable_set(struct dsa_switch *ds, int port, bool enable)
 
mutex_lock(&priv->reg_mutex);
reg = qca8k_read(priv, QCA8K_REG_EEE_CTRL);
-   if (enable)
+   if (eee->eee_enabled)
reg |= lpi_en;
else
reg &= ~lpi_en;
qca8k_write(priv, QCA8K_REG_EEE_CTRL, reg);
mutex_unlock(&priv->reg_mutex);
-}
-
-static int
-qca8k_set_eee(struct dsa_switch *ds, int port,
- struct phy_device *phydev,
- struct ethtool_eee *e)
-{
-   qca8k_eee_enable_set(ds, port, e->eee_enabled);
 
return 0;
 }
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 88da272d20d0..ce46db323394 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -335,7 +335,6 @@ struct dsa_switch_ops {
 * EEE setttings
 */
int (*set_eee)(struct dsa_switch *ds, int port,
-  struct phy_device *phydev,
   struct ethtool_eee *e);
int (*get_eee)(struct dsa_switch *ds, int port,
   struct ethtool_eee *e);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index ad5caaf384d7..9ddc584e70b0 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -655,7 +655,7 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
if (!ds->ops->set_eee)
return -EOPNOTSUPP;
 
-   ret = ds->ops->set_eee(ds, p->dp->index, p->phy, e);
+   ret = ds->ops->set_eee(ds, p->dp->index, e);
if (ret)
return ret;
 
-- 
2.13.3



[PATCH net-next v2 08/11] net: dsa: call phy_init_eee in DSA layer

2017-08-01 Thread Vivien Didelot
All DSA drivers are calling phy_init_eee if eee_enabled is true.

Move up this statement in the DSA layer to simplify the DSA drivers.
qca8k no longer needs to cache the ethtool_eee structures.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/bcm_sf2.c|  9 +
 drivers/net/dsa/mv88e6xxx/chip.c |  6 --
 drivers/net/dsa/qca8k.c  | 31 ++-
 drivers/net/dsa/qca8k.h  |  1 -
 net/dsa/slave.c  |  6 ++
 5 files changed, 9 insertions(+), 44 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index aef475f1ce06..9d10aac8f241 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -360,14 +360,7 @@ static int bcm_sf2_sw_set_eee(struct dsa_switch *ds, int port,
struct ethtool_eee *p = &priv->port_sts[port].eee;
 
p->eee_enabled = e->eee_enabled;
-
-   if (!p->eee_enabled) {
-   bcm_sf2_eee_enable_set(ds, port, false);
-   } else {
-   p->eee_enabled = bcm_sf2_eee_init(ds, port, phydev);
-   if (!p->eee_enabled)
-   return -EOPNOTSUPP;
-   }
+   bcm_sf2_eee_enable_set(ds, port, e->eee_enabled);
 
return 0;
 }
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index b531d4a3bab5..647d5d45c1d6 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -855,12 +855,6 @@ static int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
struct mv88e6xxx_chip *chip = ds->priv;
int err;
 
-   if (e->eee_enabled) {
-   err = phy_init_eee(phydev, 0);
-   if (err)
-   return err;
-   }
-
mutex_lock(&chip->reg_lock);
err = mv88e6xxx_energy_detect_write(chip, port, e);
mutex_unlock(&chip->reg_lock);
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 8cd4634c6985..bfe0172ae6cc 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -655,40 +655,13 @@ qca8k_eee_enable_set(struct dsa_switch *ds, int port, bool enable)
 }
 
 static int
-qca8k_eee_init(struct dsa_switch *ds, int port,
-  struct phy_device *phy)
-{
-   int ret;
-
-   ret = phy_init_eee(phy, 0);
-   if (ret)
-   return 0;
-
-   qca8k_eee_enable_set(ds, port, true);
-
-   return 1;
-}
-
-static int
 qca8k_set_eee(struct dsa_switch *ds, int port,
  struct phy_device *phydev,
  struct ethtool_eee *e)
 {
-   struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-   struct ethtool_eee *p = &priv->port_sts[port].eee;
-   int ret = 0;
+   qca8k_eee_enable_set(ds, port, e->eee_enabled);
 
-   p->eee_enabled = e->eee_enabled;
-
-   if (!p->eee_enabled) {
-   qca8k_eee_enable_set(ds, port, false);
-   } else {
-   p->eee_enabled = qca8k_eee_init(ds, port, phydev);
-   if (!p->eee_enabled)
-   ret = -EOPNOTSUPP;
-   }
-
-   return ret;
+   return 0;
 }
 
 static int
diff --git a/drivers/net/dsa/qca8k.h b/drivers/net/dsa/qca8k.h
index 1ed4fac6cd6d..1cf8a920d4ff 100644
--- a/drivers/net/dsa/qca8k.h
+++ b/drivers/net/dsa/qca8k.h
@@ -156,7 +156,6 @@ enum qca8k_fdb_cmd {
 };
 
 struct ar8xxx_port_status {
-   struct ethtool_eee eee;
int enabled;
 };
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 7df55d597740..ad5caaf384d7 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -659,6 +659,12 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
if (ret)
return ret;
 
+   if (e->eee_enabled) {
+   ret = phy_init_eee(p->phy, 0);
+   if (ret)
+   return ret;
+   }
+
return phy_ethtool_set_eee(p->phy, e);
 }
 
-- 
2.13.3



[PATCH net-next v2 05/11] net: dsa: qca8k: empty qca8k_get_eee

2017-08-01 Thread Vivien Didelot
phy_ethtool_get_eee is already called by the DSA layer, thus remove the
duplicated call in the qca8k driver.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/qca8k.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index c316c55aabc6..8cd4634c6985 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -695,16 +695,8 @@ static int
 qca8k_get_eee(struct dsa_switch *ds, int port,
  struct ethtool_eee *e)
 {
-   struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv;
-   struct ethtool_eee *p = &priv->port_sts[port].eee;
-   struct net_device *netdev = ds->ports[port].netdev;
-   int ret;
-
-   ret = phy_ethtool_get_eee(netdev->phydev, p);
-   e->eee_active = p->eee_active;
-   e->eee_enabled = p->eee_enabled;
-
-   return ret;
+   /* Nothing to do on the port's MAC */
+   return 0;
 }
 
 static void
-- 
2.13.3



[PATCH net-next v2 00/11] net: dsa: rework EEE support

2017-08-01 Thread Vivien Didelot
EEE implies configuring the port's PHY and MAC of both ends of the wire.

The current EEE support in DSA mixes PHY and MAC configuration, which is
bad because PHYs must be configured through a proper PHY driver. The DSA
switch operations for EEE are only meant for configuring the ports' MACs,
which are integrated in the Ethernet switch device.

This patchset fixes the EEE support in the qca8k driver, makes the DSA
layer call phy_init_eee for all drivers, and removes the EEE support from
the mv88e6xxx driver since the Marvell PHY driver should be enough for it.

Changes in v2:
 - make PHY device and DSA EEE ops mandatory for slave EEE operations.
 - simply return 0 in drivers which don't need to do anything to
   configure the port's MAC. Subsequent PHY calls will be enough.

Vivien Didelot (11):
  net: dsa: PHY device is mandatory for EEE
  net: dsa: qca8k: fix EEE init
  net: dsa: qca8k: enable EEE once
  net: dsa: qca8k: do not cache unneeded EEE fields
  net: dsa: qca8k: empty qca8k_get_eee
  net: dsa: bcm_sf2: remove unneeded supported flags
  net: dsa: mv88e6xxx: call phy_init_eee
  net: dsa: call phy_init_eee in DSA layer
  net: dsa: remove PHY device argument from .set_eee
  net: dsa: mv88e6xxx: remove EEE support
  net: dsa: rename switch EEE ops

 drivers/net/dsa/bcm_sf2.c| 26 +++
 drivers/net/dsa/mv88e6xxx/chip.c | 86 +--
 drivers/net/dsa/mv88e6xxx/chip.h |  6 ---
 drivers/net/dsa/mv88e6xxx/phy.c  | 96 
 drivers/net/dsa/mv88e6xxx/phy.h  | 22 -
 drivers/net/dsa/mv88e6xxx/port.c | 17 ---
 drivers/net/dsa/mv88e6xxx/port.h |  3 --
 drivers/net/dsa/qca8k.c  | 68 
 drivers/net/dsa/qca8k.h  |  1 -
 include/net/dsa.h| 11 +++--
 net/dsa/slave.c  | 30 -
 11 files changed, 49 insertions(+), 317 deletions(-)

-- 
2.13.3



[PATCH net-next v2 07/11] net: dsa: mv88e6xxx: call phy_init_eee

2017-08-01 Thread Vivien Didelot
It is safer to init the EEE before the DSA layer calls
phy_ethtool_set_eee, as sf2 and qca8k are doing.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 647d5d45c1d6..b531d4a3bab5 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -855,6 +855,12 @@ static int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
struct mv88e6xxx_chip *chip = ds->priv;
int err;
 
+   if (e->eee_enabled) {
+   err = phy_init_eee(phydev, 0);
+   if (err)
+   return err;
+   }
+
mutex_lock(&chip->reg_lock);
err = mv88e6xxx_energy_detect_write(chip, port, e);
mutex_unlock(&chip->reg_lock);
-- 
2.13.3



[PATCH net-next v2 01/11] net: dsa: PHY device is mandatory for EEE

2017-08-01 Thread Vivien Didelot
The port's PHY and MAC are both involved in EEE. The current code does
not call the PHY operations if the related device is NULL. Change that
by returning -ENODEV if there's no PHY device attached to the interface.

Signed-off-by: Vivien Didelot 
---
 net/dsa/slave.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 9507bd38cf04..7df55d597740 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -648,6 +648,10 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
struct dsa_switch *ds = p->dp->ds;
int ret;
 
+   /* Port's PHY and MAC both need to be EEE capable */
+   if (!p->phy)
+   return -ENODEV;
+
if (!ds->ops->set_eee)
return -EOPNOTSUPP;
 
@@ -655,10 +659,7 @@ static int dsa_slave_set_eee(struct net_device *dev, struct ethtool_eee *e)
if (ret)
return ret;
 
-   if (p->phy)
-   ret = phy_ethtool_set_eee(p->phy, e);
-
-   return ret;
+   return phy_ethtool_set_eee(p->phy, e);
 }
 
 static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
@@ -667,6 +668,10 @@ static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
struct dsa_switch *ds = p->dp->ds;
int ret;
 
+   /* Port's PHY and MAC both need to be EEE capable */
+   if (!p->phy)
+   return -ENODEV;
+
if (!ds->ops->get_eee)
return -EOPNOTSUPP;
 
@@ -674,10 +679,7 @@ static int dsa_slave_get_eee(struct net_device *dev, struct ethtool_eee *e)
if (ret)
return ret;
 
-   if (p->phy)
-   ret = phy_ethtool_get_eee(p->phy, e);
-
-   return ret;
+   return phy_ethtool_get_eee(p->phy, e);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-- 
2.13.3
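
The order of checks introduced by this patch can be sketched in userspace
as follows. slave_set_eee() and its four parameters are hypothetical: the
port is reduced to two flags, and the results of the switch op and of the
PHY call are passed in directly instead of being computed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the dsa_slave_set_eee() control flow after the
 * patch.
 * has_phy:    a PHY device is attached to the interface
 * has_mac_op: the switch driver provides a set_eee operation
 * mac_ret:    value the switch operation would return
 * phy_ret:    value the PHY-level set call would return
 */
int slave_set_eee(bool has_phy, bool has_mac_op, int mac_ret, int phy_ret)
{
	/* Port's PHY and MAC both need to be EEE capable */
	if (!has_phy)
		return -ENODEV;

	if (!has_mac_op)
		return -EOPNOTSUPP;

	if (mac_ret)
		return mac_ret;

	/* The PHY call is now unconditional; its result is the final one. */
	return phy_ret;
}
```

The missing-PHY case is reported before anything else, so a port without a
PHY gets -ENODEV even when the switch driver has no EEE operation.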



[PATCH net] tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states

2017-08-01 Thread Yuchung Cheng
If the sender switches the congestion control during ECN-triggered
cwnd-reduction state (CA_CWR), upon exiting recovery cwnd is set to
the ssthresh value calculated by the previous congestion control. If
the previous congestion control is BBR, which always keeps ssthresh
at TCP_INFINITE_SSTHRESH, cwnd ends up being infinite. The safe
step is to avoid assigning an invalid ssthresh value when recovery ends.

Signed-off-by: Yuchung Cheng 
Signed-off-by: Neal Cardwell 
---
 net/ipv4/tcp_input.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2920e0cb09f8..dad026fcfd09 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2520,8 +2520,8 @@ static inline void tcp_end_cwnd_reduction(struct sock *sk)
return;
 
/* Reset cwnd to ssthresh in CWR or Recovery (unless it's undone) */
-   if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
-   (tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
+   if (tp->snd_ssthresh < TCP_INFINITE_SSTHRESH &&
+   (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR || tp->undo_marker)) {
tp->snd_cwnd = tp->snd_ssthresh;
tp->snd_cwnd_stamp = tcp_jiffies32;
}
-- 
2.14.0.rc1.383.gd1ce394fe2-goog
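
The effect of moving the ssthresh test can be checked with the two
predicates side by side in a userspace sketch. reset_cwnd_old() and
reset_cwnd_new() are hypothetical names that mirror the condition before
and after the hunk above; INFINITE_SSTHRESH is a local stand-in for
TCP_INFINITE_SSTHRESH.

```c
#include <stdbool.h>
#include <stdint.h>

#define INFINITE_SSTHRESH 0x7fffffffu	/* stand-in for TCP_INFINITE_SSTHRESH */

/* Condition before the patch: CWR alone is enough to reset cwnd. */
bool reset_cwnd_old(bool in_cwr, bool undo_marker, uint32_t ssthresh)
{
	return in_cwr || (undo_marker && ssthresh < INFINITE_SSTHRESH);
}

/* Condition after the patch: ssthresh must be valid in both cases. */
bool reset_cwnd_new(bool in_cwr, bool undo_marker, uint32_t ssthresh)
{
	return ssthresh < INFINITE_SSTHRESH && (in_cwr || undo_marker);
}
```

The two predicates differ only when the socket is in CWR with an infinite
ssthresh, which is the case described in the commit message.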



[Patch net-next] flow_dissector: remove unused functions

2017-08-01 Thread Cong Wang
They were introduced by commit f70ea018da06
("net: Add functions to get skb->hash based on flow structures")
but have never been used in tree.

Signed-off-by: Cong Wang 
---
 include/linux/skbuff.h| 16 
 net/core/flow_dissector.c | 45 -
 2 files changed, 61 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4093552be1de..7090519c72dd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1122,8 +1122,6 @@ static inline __u32 skb_get_hash(struct sk_buff *skb)
return skb->hash;
 }
 
-__u32 __skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6 *fl6);
-
 static inline __u32 skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6 *fl6)
 {
if (!skb->l4_hash && !skb->sw_hash) {
@@ -1136,20 +1134,6 @@ static inline __u32 skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6
return skb->hash;
 }
 
-__u32 __skb_get_hash_flowi4(struct sk_buff *skb, const struct flowi4 *fl);
-
-static inline __u32 skb_get_hash_flowi4(struct sk_buff *skb, const struct flowi4 *fl4)
-{
-   if (!skb->l4_hash && !skb->sw_hash) {
-   struct flow_keys keys;
-   __u32 hash = __get_hash_from_flowi4(fl4, &keys);
-
-   __skb_set_sw_hash(skb, hash, flow_keys_have_l4(&keys));
-   }
-
-   return skb->hash;
-}
-
 __u32 skb_get_hash_perturb(const struct sk_buff *skb, u32 perturb);
 
 static inline __u32 skb_get_hash_raw(const struct sk_buff *skb)
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index fc5fc4594c90..0cc672aba1f0 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -998,51 +998,6 @@ __u32 skb_get_hash_perturb(const struct sk_buff *skb, u32 perturb)
 }
 EXPORT_SYMBOL(skb_get_hash_perturb);
 
-__u32 __skb_get_hash_flowi6(struct sk_buff *skb, const struct flowi6 *fl6)
-{
-   struct flow_keys keys;
-
-   memset(&keys, 0, sizeof(keys));
-
-   memcpy(&keys.addrs.v6addrs.src, &fl6->saddr,
-  sizeof(keys.addrs.v6addrs.src));
-   memcpy(&keys.addrs.v6addrs.dst, &fl6->daddr,
-  sizeof(keys.addrs.v6addrs.dst));
-   keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
-   keys.ports.src = fl6->fl6_sport;
-   keys.ports.dst = fl6->fl6_dport;
-   keys.keyid.keyid = fl6->fl6_gre_key;
-   keys.tags.flow_label = (__force u32)fl6->flowlabel;
-   keys.basic.ip_proto = fl6->flowi6_proto;
-
-   __skb_set_sw_hash(skb, flow_hash_from_keys(&keys),
- flow_keys_have_l4(&keys));
-
-   return skb->hash;
-}
-EXPORT_SYMBOL(__skb_get_hash_flowi6);
-
-__u32 __skb_get_hash_flowi4(struct sk_buff *skb, const struct flowi4 *fl4)
-{
-   struct flow_keys keys;
-
-   memset(&keys, 0, sizeof(keys));
-
-   keys.addrs.v4addrs.src = fl4->saddr;
-   keys.addrs.v4addrs.dst = fl4->daddr;
-   keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
-   keys.ports.src = fl4->fl4_sport;
-   keys.ports.dst = fl4->fl4_dport;
-   keys.keyid.keyid = fl4->fl4_gre_key;
-   keys.basic.ip_proto = fl4->flowi4_proto;
-
-   __skb_set_sw_hash(skb, flow_hash_from_keys(&keys),
- flow_keys_have_l4(&keys));
-
-   return skb->hash;
-}
-EXPORT_SYMBOL(__skb_get_hash_flowi4);
-
 u32 __skb_get_poff(const struct sk_buff *skb, void *data,
   const struct flow_keys *keys, int hlen)
 {
-- 
2.13.0



Re: [PATCH net-next 10/11] net: dsa: mv88e6xxx: remove EEE support

2017-08-01 Thread Vivien Didelot
Vivien Didelot  writes:

> Second option is: we keep it KISS and let the driver define its noop,
> but as I explain, it is confusing, especially for the get operation.

In fact we should be good because the DSA layer will call
ds->ops->{g,s}et_mac_eee before phy_ethtool_{g,s}et_eee, so if the DSA
driver didn't touch the ethtool_eee structure, the PHY ops will anyway.

I'm sending a v2 right away.


Thanks,

Vivien
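
The call order described above can be sketched in userspace to show why a
no-op MAC operation is harmless for the get path. All names here are
hypothetical stand-ins (struct eee, noop_get_mac_eee(), fake_phy_get_eee(),
slave_get_eee()), and the PHY is assumed to report EEE as enabled and
active.

```c
#include <stdbool.h>

/* Reduced stand-in for struct ethtool_eee. */
struct eee {
	bool eee_enabled;
	bool eee_active;
};

/* Driver no-op: nothing to do on the port's MAC, structure untouched. */
static int noop_get_mac_eee(struct eee *e)
{
	(void)e;
	return 0;
}

/* Stand-in for the PHY-level get: fills in the structure. */
static int fake_phy_get_eee(struct eee *e)
{
	e->eee_enabled = true;
	e->eee_active = true;
	return 0;
}

/* Model of the DSA layer: MAC op first, then the PHY op. */
struct eee slave_get_eee(void)
{
	struct eee e = { false, false };

	if (noop_get_mac_eee(&e))
		return e;

	fake_phy_get_eee(&e);
	return e;
}
```

Because the PHY operation runs second, the structure handed back to the
caller is populated even though the driver's MAC operation did nothing.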

