date:20160602

Re: ath9k gpio request

2016-06-02 Thread Pan, Miaoqing

Done, https://patchwork.kernel.org/patch/9151847/.

Thanks,
Miaoqing

From: Kalle Valo 
Sent: Friday, June 3, 2016 1:33 PM
To: Pan, Miaoqing
Cc: Sudip Mukherjee; Stephen Rothwell; ath9k-devel; linux-n...@vger.kernel.org; 
linux-ker...@vger.kernel.org; linux-wirel...@vger.kernel.org; 
ath9k-de...@lists.ath9k.org; netdev@vger.kernel.org; Miaoqing Pan
Subject: Re: ath9k gpio request

Sudip Mukherjee  writes:

> On Thursday 02 June 2016 01:32 PM, Pan, Miaoqing wrote:
>> Seems there is something wrong in the datasheet, try
>>
>> --- a/drivers/net/wireless/ath/ath9k/reg.h
>> +++ b/drivers/net/wireless/ath/ath9k/reg.h
>> @@ -1122,8 +1122,8 @@ enum {
>>   #define AR9300_NUM_GPIO  16
>>   #define AR9330_NUM_GPIO 16
>>   #define AR9340_NUM_GPIO 23
>> -#define AR9462_NUM_GPIO 10
>> -#define AR9485_NUM_GPIO 12
>> +#define AR9462_NUM_GPIO 14
>> +#define AR9485_NUM_GPIO 11
>>   #define AR9531_NUM_GPIO 18
>>   #define AR9550_NUM_GPIO 24
>>   #define AR9561_NUM_GPIO 23
>> @@ -1139,8 +1139,8 @@ enum {
>>   #define AR9300_GPIO_MASK 0xF4FF
>>   #define AR9330_GPIO_MASK 0xF4FF
>>   #define AR9340_GPIO_MASK 0x000F
>> -#define AR9462_GPIO_MASK 0x03FF
>> -#define AR9485_GPIO_MASK 0x0FFF
>> +#define AR9462_GPIO_MASK 0x3FFF
>> +#define AR9485_GPIO_MASK 0x07FF
>>   #define AR9531_GPIO_MASK 0x000F
>>   #define AR9550_GPIO_MASK 0x000F
>>   #define AR9561_GPIO_MASK 0x000F
>
> solves the problem.
>
> Tested-by: Sudip Mukherjee 

Great, thanks for testing everyone. Miaoqing, please send a proper patch
ASAP and I'll push it to 4.7.

--
Kalle Valo

Re: ath9k gpio request

2016-06-02 Thread Kalle Valo

Sudip Mukherjee  writes:

> On Thursday 02 June 2016 01:32 PM, Pan, Miaoqing wrote:
>> Seems there is something wrong in the datasheet, try
>>
>> --- a/drivers/net/wireless/ath/ath9k/reg.h
>> +++ b/drivers/net/wireless/ath/ath9k/reg.h
>> @@ -1122,8 +1122,8 @@ enum {
>>   #define AR9300_NUM_GPIO  16
>>   #define AR9330_NUM_GPIO 16
>>   #define AR9340_NUM_GPIO 23
>> -#define AR9462_NUM_GPIO 10
>> -#define AR9485_NUM_GPIO 12
>> +#define AR9462_NUM_GPIO 14
>> +#define AR9485_NUM_GPIO 11
>>   #define AR9531_NUM_GPIO 18
>>   #define AR9550_NUM_GPIO 24
>>   #define AR9561_NUM_GPIO 23
>> @@ -1139,8 +1139,8 @@ enum {
>>   #define AR9300_GPIO_MASK 0xF4FF
>>   #define AR9330_GPIO_MASK 0xF4FF
>>   #define AR9340_GPIO_MASK 0x000F
>> -#define AR9462_GPIO_MASK 0x03FF
>> -#define AR9485_GPIO_MASK 0x0FFF
>> +#define AR9462_GPIO_MASK 0x3FFF
>> +#define AR9485_GPIO_MASK 0x07FF
>>   #define AR9531_GPIO_MASK 0x000F
>>   #define AR9550_GPIO_MASK 0x000F
>>   #define AR9561_GPIO_MASK 0x000F
>
> solves the problem.
>
> Tested-by: Sudip Mukherjee 

Great, thanks for testing everyone. Miaoqing, please send a proper patch
ASAP and I'll push it to 4.7.

-- 
Kalle Valo
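
For readers following the numbers in the quoted diff: for the two chips it changes, each mask is the contiguous low-bit mask for the pin count (14 pins gives 0x3FFF, 11 pins gives 0x07FF). A minimal user-space sketch of that arithmetic follows. The helper names contiguous_gpio_mask() and gpio_usable() are hypothetical and are not ath9k functions; only the constants come from the diff. Other chips in the same table, such as AR9300 with 0xF4FF, use non-contiguous masks, which is why the sketch checks the bit as well as the range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the quoted reg.h diff. */
#define AR9462_NUM_GPIO  14
#define AR9462_GPIO_MASK 0x3FFF
#define AR9485_NUM_GPIO  11
#define AR9485_GPIO_MASK 0x07FF
#define AR9300_NUM_GPIO  16
#define AR9300_GPIO_MASK 0xF4FF

/* Hypothetical helper: mask with the low num_gpio bits set. */
static uint32_t contiguous_gpio_mask(unsigned int num_gpio)
{
	assert(num_gpio > 0 && num_gpio < 32);
	return (UINT32_C(1) << num_gpio) - 1;
}

/* Hypothetical helper: a pin is usable if it is in range and its mask bit is set. */
static bool gpio_usable(uint32_t mask, unsigned int num_gpio, unsigned int gpio)
{
	return gpio < num_gpio && (mask & (UINT32_C(1) << gpio)) != 0;
}
```

With the old AR9462 values (10 pins, mask 0x03FF), a request for pin 13 would fail such a check, while the new values accept it.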

Re: [PATCH] tipc: fix an infoleak in tipc_nl_compat_link_dump

2016-06-02 Thread David Miller

From: Kangjie Lu 
Date: Thu,  2 Jun 2016 04:04:56 -0400

> link_info.str is a char array of size 60. Memory after the NULL
> byte is not initialized. Sending the whole object out can cause
> a leak.
> 
> Signed-off-by: Kangjie Lu 

Applied.
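
As an illustration of the class of bug described above (not the applied patch itself), here is a user-space sketch. The struct link_info_demo and the function fill_link_info() are hypothetical stand-ins; only the 60-byte array size comes from the report. Zeroing the whole object before copying the name ensures that the bytes after the terminating NUL hold no stale data when the object is sent out in full.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_LINK_NAME_LEN 60	/* array size quoted in the report */

/* Hypothetical stand-in for an object that is copied out in full. */
struct link_info_demo {
	unsigned int dest;
	unsigned int up;
	char str[DEMO_LINK_NAME_LEN];
};

/* Zero the whole object first, then copy at most 59 name bytes. */
static void fill_link_info(struct link_info_demo *info, const char *name)
{
	size_t n;

	assert(info && name);
	memset(info, 0, sizeof(*info));
	n = strlen(name);
	if (n > sizeof(info->str) - 1)
		n = sizeof(info->str) - 1;
	memcpy(info->str, name, n);	/* tail stays zero from memset */
}
```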

Re: [PATCH] rds: fix an infoleak in rds_inc_info_copy

2016-06-02 Thread David Miller

From: Kangjie Lu 
Date: Thu,  2 Jun 2016 04:11:20 -0400

> The last field "flags" of object "minfo" is not initialized.
> Copying this object out may leak kernel stack data.
> Assign 0 to it to avoid leak.
> 
> Signed-off-by: Kangjie Lu 

Applied.
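
A small sketch of the fix pattern the description names, assigning 0 to the otherwise unset field. The struct minfo_demo and fill_minfo() are hypothetical and simplified; they are not the rds structures. Note that in an unpacked struct like this one, padding bytes would still need a memset() before the object is copied out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the reported object. */
struct minfo_demo {
	uint64_t seq;
	uint32_t len;
	uint8_t  flags;	/* the field the report says was left unset */
};

/* Every field is written before the object leaves this function. */
static void fill_minfo(struct minfo_demo *minfo, uint64_t seq, uint32_t len)
{
	assert(minfo);
	minfo->seq = seq;
	minfo->len = len;
	minfo->flags = 0;	/* explicit assignment, as the patch describes */
}
```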

Re: [PATCH net-next] qed: Utilize FW 8.10.3.0

2016-06-02 Thread David Miller

From: Yuval Mintz 
Date: Thu, 2 Jun 2016 10:23:29 +0300

> The New QED firmware contains several fixes, including:
>   - Wrong classification of packets in 4-port devices.
>   - Anti-spoof interoperability with encapsulated packets.
>   - Tx-switching of encapsulated packets.
> It also slightly improves Tx performance of the device.
> 
> In addition, this firmware contains the necessary logic for
> supporting iscsi & rdma, for which we plan on pushing protocol
> drivers in the imminent future.
> 
> Signed-off-by: Yuval Mintz 

Applied, thanks.

Re: Possible problem with e6afc8ac ("udp: remove headers from UDP packets before queueing")

2016-06-02 Thread David Miller

From: Eric Dumazet 
Date: Thu, 02 Jun 2016 19:58:26 -0700

> Arg, I totally messed up the patch title :(

I noticed it was odd, but it's not a big deal.

Re: Possible problem with e6afc8ac ("udp: remove headers from UDP packets before queueing")

2016-06-02 Thread Eric Dumazet

On Thu, 2016-06-02 at 18:31 -0400, David Miller wrote:
> From: Eric Dumazet 
> Date: Thu, 02 Jun 2016 14:52:43 -0700
> 
> > From: Eric Dumazet 
> > 
> > Paul Moore tracked a regression caused by a recent commit, which
> > mistakenly assumed that sk_filter() could be avoided if socket
> > had no current BPF filter.
> > 
> > The intent was to avoid udp_lib_checksum_complete() overhead.
> > 
> > But sk_filter() also checks skb_pfmemalloc() and
> > security_sock_rcv_skb(), so better call it.
> > 
> > Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
> > Signed-off-by: Eric Dumazet 
> > Reported-by: Paul Moore 
> > Tested-by: Paul Moore 
> > Tested-by: Stephen Smalley 
> > Cc: samanthakumar 
> 
> Applied, thanks Eric.

Arg, I totally messed up the patch title :(

[PATCH v4 net-next 06/13] net: hns: use platform_get_irq instead of irq_of_parse_and_map

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

irq_of_parse_and_map() is only usable in the DT case, but a uniform
interface is expected. Use platform_get_irq() instead.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
index 4ef6d23..3ce2409 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c
@@ -458,7 +458,6 @@ void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
u32 i;
u32 ring_num = rcb_common->ring_num;
int base_irq_idx = hns_rcb_get_base_irq_idx(rcb_common);
-   struct device_node *np = rcb_common->dsaf_dev->dev->of_node;
struct platform_device *pdev =
to_platform_device(rcb_common->dsaf_dev->dev);
bool is_ver1 = AE_IS_VER1(rcb_common->dsaf_dev->dsaf_ver);
@@ -473,10 +472,10 @@ void hns_rcb_get_cfg(struct rcb_common_cb *rcb_common)
ring_pair_cb->port_id_in_comm =
hns_rcb_get_port_in_comm(rcb_common, i);
ring_pair_cb->virq[HNS_RCB_IRQ_IDX_TX] =
-   is_ver1 ? irq_of_parse_and_map(np, base_irq_idx + i * 2) :
+   is_ver1 ? platform_get_irq(pdev, base_irq_idx + i * 2) :
  platform_get_irq(pdev, base_irq_idx + i * 3 + 1);
ring_pair_cb->virq[HNS_RCB_IRQ_IDX_RX] =
-   is_ver1 ? irq_of_parse_and_map(np, base_irq_idx + i * 2 + 1) :
+   is_ver1 ? platform_get_irq(pdev, base_irq_idx + i * 2 + 1) :
  platform_get_irq(pdev, base_irq_idx + i * 3);
ring_pair_cb->q.phy_base =
RCB_COMM_BASE_TO_RING_BASE(rcb_common->phy_base, i);
-- 
1.9.1

[PATCH v4 net-next 09/13] net: hns: add dsaf misc operation method

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

The misc operations may differ between hardware platforms. With the
current implementation, every new hardware platform would add a new branch
to each function, so this patch adds an operation method (misc_op) instead.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c  |  4 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c |  6 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c  | 14 ++--
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h  |  2 -
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 11 ++-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h | 33 ++---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 79 +++---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.h |  7 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c  | 15 ++--
 .../net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c| 10 +--
 10 files changed, 111 insertions(+), 70 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index 8e009f4..d37b778 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -637,13 +637,15 @@ static int hns_ae_config_loopback(struct hnae_handle 
*handle,
int ret;
struct hnae_vf_cb *vf_cb = hns_ae_get_vf_cb(handle);
struct hns_mac_cb *mac_cb = hns_get_mac_cb(handle);
+   struct dsaf_device *dsaf_dev = mac_cb->dsaf_dev;
 
switch (loop) {
case MAC_INTERNALLOOP_PHY:
ret = 0;
break;
case MAC_INTERNALLOOP_SERDES:
-   ret = hns_mac_config_sds_loopback(vf_cb->mac_cb, en);
+   ret = dsaf_dev->misc_op->cfg_serdes_loopback(vf_cb->mac_cb,
+!!en);
break;
case MAC_INTERNALLOOP_MAC:
ret = hns_mac_config_mac_loopback(vf_cb->mac_cb, loop, en);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
index 44abb08..1235c7f 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c
@@ -110,7 +110,7 @@ static void hns_gmac_free(void *mac_drv)
 
u32 mac_id = drv->mac_id;
 
-   hns_dsaf_ge_srst_by_port(dsaf_dev, mac_id, 0);
+   dsaf_dev->misc_op->ge_srst(dsaf_dev, mac_id, 0);
 }
 
 static void hns_gmac_set_tx_auto_pause_frames(void *mac_drv, u16 newval)
@@ -317,9 +317,9 @@ static void hns_gmac_init(void *mac_drv)
 
port = drv->mac_id;
 
-   hns_dsaf_ge_srst_by_port(dsaf_dev, port, 0);
+   dsaf_dev->misc_op->ge_srst(dsaf_dev, port, 0);
mdelay(10);
-   hns_dsaf_ge_srst_by_port(dsaf_dev, port, 1);
+   dsaf_dev->misc_op->ge_srst(dsaf_dev, port, 1);
mdelay(10);
hns_gmac_disable(mac_drv, MAC_COMM_MODE_RX_AND_TX);
hns_gmac_tx_loop_pkt_dis(mac_drv);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
index 527b49d..2ebf14a 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
@@ -95,7 +95,7 @@ void hns_mac_get_link_status(struct hns_mac_cb *mac_cb, u32 
*link_status)
else
*link_status = 0;
 
-   ret = hns_mac_get_sfp_prsnt(mac_cb, &sfp_prsnt);
+   ret = mac_cb->dsaf_dev->misc_op->get_sfp_prsnt(mac_cb, &sfp_prsnt);
if (!ret)
*link_status = *link_status && sfp_prsnt;
 
@@ -512,7 +512,7 @@ void hns_mac_stop(struct hns_mac_cb *mac_cb)
 
mac_ctrl_drv->mac_en_flg = 0;
mac_cb->link = 0;
-   cpld_led_reset(mac_cb);
+   mac_cb->dsaf_dev->misc_op->cpld_reset_led(mac_cb);
 }
 
 /**
@@ -804,7 +804,7 @@ int hns_mac_get_cfg(struct dsaf_device *dsaf_dev, struct 
hns_mac_cb *mac_cb)
else
mac_cb->mac_type = HNAE_PORT_DEBUG;
 
-   mac_cb->phy_if = hns_mac_get_phy_if(mac_cb);
+   mac_cb->phy_if = dsaf_dev->misc_op->get_phy_if(mac_cb);
 
ret = hns_mac_get_mode(mac_cb->phy_if);
if (ret < 0) {
@@ -819,7 +819,7 @@ int hns_mac_get_cfg(struct dsaf_device *dsaf_dev, struct 
hns_mac_cb *mac_cb)
if (ret)
return ret;
 
-   cpld_led_reset(mac_cb);
+   mac_cb->dsaf_dev->misc_op->cpld_reset_led(mac_cb);
mac_cb->vaddr = hns_mac_get_vaddr(dsaf_dev, mac_cb, mac_mode_idx);
 
return 0;
@@ -906,7 +906,7 @@ void hns_mac_uninit(struct dsaf_device *dsaf_dev)
int max_port_num = hns_mac_get_max_port_num(dsaf_dev);
 
for (i = 0; i < max_port_num; i++) {
-   cpld_led_reset(dsaf_dev->mac_cb[i]);
+   dsaf_dev->misc_op->cpld_reset_led(dsaf_dev->mac_cb[i]);
dsaf_dev->mac_cb[i] = NULL;
}
 }
@@ -989,7 +989,7 @@ void
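
The idea in this patch, replacing per-platform branches with a table of function pointers chosen once, can be sketched in user space as below. All names ending in _demo, the two backends, and the recorded fields are hypothetical and exist only for illustration; only the misc_op->ge_srst() call shape is taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

struct dsaf_dev_demo;

/* Hypothetical ops table: one pointer per platform-specific operation. */
struct misc_ops_demo {
	void (*ge_srst)(struct dsaf_dev_demo *dev, unsigned int port, int dereset);
};

struct dsaf_dev_demo {
	const struct misc_ops_demo *misc_op;
	int last_backend;	/* 1 = register path, 2 = firmware path */
	unsigned int last_port;
};

static void ge_srst_reg(struct dsaf_dev_demo *dev, unsigned int port, int dereset)
{
	(void)dereset;
	dev->last_backend = 1;
	dev->last_port = port;
}

static void ge_srst_fw(struct dsaf_dev_demo *dev, unsigned int port, int dereset)
{
	(void)dereset;
	dev->last_backend = 2;
	dev->last_port = port;
}

static const struct misc_ops_demo reg_ops = { .ge_srst = ge_srst_reg };
static const struct misc_ops_demo fw_ops  = { .ge_srst = ge_srst_fw };

/* The backend is chosen once; callers never branch on the platform. */
static void dsaf_dev_demo_init(struct dsaf_dev_demo *dev, int use_firmware)
{
	assert(dev);
	dev->misc_op = use_firmware ? &fw_ops : &reg_ops;
	dev->last_backend = 0;
	dev->last_port = 0;
}

/* Caller side: assert reset, then release it, through the table. */
static void mac_reset_demo(struct dsaf_dev_demo *dev, unsigned int port)
{
	dev->misc_op->ge_srst(dev, port, 0);
	dev->misc_op->ge_srst(dev, port, 1);
}
```

Adding a third platform in this sketch means adding one more table, with no change to mac_reset_demo().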

[PATCH v4 net-next 08/13] net: hns: add uniform interface for phy connection

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

device_node is only usable in the DT case, but HNS also needs to handle
other cases, including ACPI, so it needs a uniform way to handle both.
This patch switches to phy_device. of_phy_connect() and of_phy_attach()
are likewise DT-only, so a uniform interface is needed to run that
sequence for both DT and ACPI.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2:
  1. remove the redundant functions, and
  2. adds fwnode match method beside DT and ACPI.

 v1: first submit
  link: https://lkml.org/lkml/2016/5/13/100
---
 drivers/net/ethernet/hisilicon/hns/hnae.c  |  8 -
 drivers/net/ethernet/hisilicon/hns/hnae.h  |  3 +-
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c  |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c  | 34 +++---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h  |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c  | 21 +++--
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c   |  2 +-
 8 files changed, 49 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c 
b/drivers/net/ethernet/hisilicon/hns/hnae.c
index d630acd..5d3047c 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.c
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -96,7 +96,13 @@ static int __ae_match(struct device *dev, const void *data)
 {
struct hnae_ae_dev *hdev = cls_to_ae_dev(dev);
 
-   return (data == &hdev->dev->of_node->fwnode);
+   if (dev_of_node(hdev->dev))
+   return (data == &hdev->dev->of_node->fwnode);
+   else if (is_acpi_node(hdev->dev->fwnode))
+   return (data == hdev->dev->fwnode);
+
+   dev_err(dev, "__ae_match cannot read cfg data from OF or acpi\n");
+   return 0;
 }
 
 static struct hnae_ae_dev *find_ae(const struct fwnode_handle *fwnode)
diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h 
b/drivers/net/ethernet/hisilicon/hns/hnae.h
index f5f8140..529cb13 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.h
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
@@ -27,6 +27,7 @@
  * "cb" means control block
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -512,7 +513,7 @@ struct hnae_ae_dev {
 struct hnae_handle {
struct device *owner_dev; /* the device which make use of this handle */
struct hnae_ae_dev *dev;  /* the device who provides this handle */
-   struct device_node *phy_node;
+   struct phy_device *phy_dev;
phy_interface_t phy_if;
u32 if_support;
int q_num;
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index 7a757e8..8e009f4 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -131,7 +131,7 @@ struct hnae_handle *hns_ae_get_handle(struct hnae_ae_dev 
*dev,
vf_cb->mac_cb = dsaf_dev->mac_cb[port_id];
 
ae_handle->phy_if = vf_cb->mac_cb->phy_if;
-   ae_handle->phy_node = vf_cb->mac_cb->phy_node;
+   ae_handle->phy_dev = vf_cb->mac_cb->phy_dev;
ae_handle->if_support = vf_cb->mac_cb->if_support;
ae_handle->port_type = vf_cb->mac_cb->mac_type;
ae_handle->dport_id = port_id;
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
index 611581f..527b49d 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
@@ -15,7 +15,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 #include 
 
 #include "hns_dsaf_main.h"
@@ -645,7 +646,7 @@ free_mac_drv:
  */
 static int  hns_mac_get_info(struct hns_mac_cb *mac_cb)
 {
-   struct device_node *np = mac_cb->dev->of_node;
+   struct device_node *np;
struct regmap *syscon;
struct of_phandle_args cpld_args;
u32 ret;
@@ -672,21 +673,34 @@ static int  hns_mac_get_info(struct hns_mac_cb *mac_cb)
 * from dsaf node
 */
if (!mac_cb->fw_port) {
-   mac_cb->phy_node = of_parse_phandle(np, "phy-handle",
-   mac_cb->mac_id);
-   if (mac_cb->phy_node)
+   np = of_parse_phandle(mac_cb->dev->of_node, "phy-handle",
+ mac_cb->mac_id);
+   mac_cb->phy_dev = of_phy_find_device(np);
+   if (mac_cb->phy_dev) {
+   /* refcount is held by of_phy_find_device()
+* if the phy_dev is found
+*/
+   put_device(&mac_cb->phy_dev->mdio.dev);
+
dev_dbg(mac_cb->dev, "mac%d phy_node: %s\n",
-   mac_cb->mac_id, mac_cb->phy_node->name);
+

[PATCH v4 net-next 00/13] net: hns: add support of ACPI

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

This series adds ACPI support to HNS. The driver calls some ACPI helper
functions, such as acpi_dev_found() and acpi_evaluate_dsm(), which are
not available in non-ACPI builds. So that the kernel still compiles
without ACPI, the series adds the corresponding stub functions to
linux/acpi.h. It also uses device property functions instead of a series
of helper functions to support both the DT and ACPI cases, and then adds
ACPI support for HNS.

change log:
 v3->v4:
  mii-id gets from dev-name instead of address

 v2->v3:
 1. add Review-by: Andy Shevchenko
 2. fix the potential memory leak

 v1 -> v2:
 1. use acpi_dev_found() instead of acpi_match_device_ids() to check if
it is an ACPI node.
 2. use is_of_node() instead of IS_ENABLED() to check if it is a DT node.
 3. split the patch("add support of acpi for hns-mdio") into two patches:
3.1 Move to use fwnode_handle
3.2 Add ACPI
 4. add the patch which subject is dsaf misc operation method
 5. fix the comments by Andy Shevchenko

Kejian Yan (13):
  ACPI: bus: add stub acpi_dev_found() to linux/acpi.h
  ACPI: bus: add stub acpi_evaluate_dsm() to linux/acpi.h
  net: hisilicon: cleanup to prepare for other cases
  net: hisilicon: add support of acpi for hns-mdio
  net: hns: use device_* APIs instead of of_* APIs
  net: hns: use platform_get_irq instead of irq_of_parse_and_map
  net: hns: enet specify a reference to dsaf by fwnode_handle
  net: hns: add uniform interface for phy connection
  net: hns: add dsaf misc operation method
  net: hns: dsaf adds support of acpi
  net: hns: register phy device in each mac initial sequence
  net: hns: implement the miscellaneous operation by asl
  net: hns: net: hns: enet adds support of acpi

 drivers/net/ethernet/hisilicon/hns/hnae.c  |  18 +-
 drivers/net/ethernet/hisilicon/hns/hnae.h  |   5 +-
 drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c  |   6 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c |   6 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c  | 247 +++-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.h  |   4 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 105 ++---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h |  33 ++-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 250 ++---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.h |   7 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c  |  15 +-
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_rcb.c  |   5 +-
 .../net/ethernet/hisilicon/hns/hns_dsaf_xgmac.c|  10 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c  |  90 +---
 drivers/net/ethernet/hisilicon/hns/hns_enet.h  |   2 +-
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c   |   2 +-
 drivers/net/ethernet/hisilicon/hns_mdio.c  | 150 +++--
 include/linux/acpi.h   |  13 ++
 18 files changed, 706 insertions(+), 262 deletions(-)

-- 
1.9.1

[PATCH v4 net-next 01/13] ACPI: bus: add stub acpi_dev_found() to linux/acpi.h

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

acpi_dev_found() will be used to detect whether a given ACPI device is
present in the system. The calling code is also compiled in the non-ACPI
case, but the function is declared in acpi_bus.h, which can only be used
in the ACPI case. This patch therefore adds a stub function to
linux/acpi.h so that non-ACPI builds compile successfully.

Cc: Rafael J. Wysocki 
Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
 include/linux/acpi.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 288fac5..3025d19 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -543,6 +543,11 @@ struct platform_device *acpi_create_platform_device(struct 
acpi_device *);

 struct fwnode_handle;

+static inline bool acpi_dev_found(const char *hid)
+{
+   return false;
+}
+
 static inline bool is_acpi_node(struct fwnode_handle *fwnode)
 {
return false;
-- 
1.9.1
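
The stub pattern used by this patch can be sketched outside the kernel as follows. DEMO_HAVE_ACPI, demo_acpi_dev_found(), the enum, and pick_probe_path() are all hypothetical names for illustration. The point is that callers compile unchanged whether or not the feature is built in, and the stub reports that no device was found.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef DEMO_HAVE_ACPI
/* Real implementation would be provided elsewhere in an ACPI build. */
bool demo_acpi_dev_found(const char *hid);
#else
/* Stub for non-ACPI builds: no ACPI device can ever be found. */
static inline bool demo_acpi_dev_found(const char *hid)
{
	(void)hid;
	return false;
}
#endif

enum probe_path_demo { PROBE_DT_DEMO, PROBE_ACPI_DEMO };

/* Caller needs no #ifdef; the stub makes the ACPI branch dead code. */
static enum probe_path_demo pick_probe_path(const char *hid)
{
	assert(hid);
	return demo_acpi_dev_found(hid) ? PROBE_ACPI_DEMO : PROBE_DT_DEMO;
}
```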

[PATCH v4 net-next 12/13] net: hns: implement the miscellaneous operation by asl

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

The miscellaneous operations are implemented in the BIOS. In the ACPI
case, the kernel can call the _DSM method to invoke that implementation.
This patch does that.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2: use a serial function to implement the reset sequence

 v1: first submit
  link: https://lkml.org/lkml/2016/5/13/94
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 167 +
 1 file changed, 167 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c
index f21177b..96cb628 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c
@@ -12,6 +12,27 @@
 #include "hns_dsaf_ppe.h"
 #include "hns_dsaf_reg.h"
 
+enum _dsm_op_index {
+   HNS_OP_RESET_FUNC   = 0x1,
+   HNS_OP_SERDES_LP_FUNC   = 0x2,
+   HNS_OP_LED_SET_FUNC = 0x3,
+   HNS_OP_GET_PORT_TYPE_FUNC   = 0x4,
+   HNS_OP_GET_SFP_STAT_FUNC= 0x5,
+};
+
+enum _dsm_rst_type {
+   HNS_DSAF_RESET_FUNC = 0x1,
+   HNS_PPE_RESET_FUNC  = 0x2,
+   HNS_XGE_CORE_RESET_FUNC = 0x3,
+   HNS_XGE_RESET_FUNC  = 0x4,
+   HNS_GE_RESET_FUNC   = 0x5,
+};
+
+const u8 hns_dsaf_acpi_dsm_uuid[] = {
+   0x1A, 0xAA, 0x85, 0x1A, 0x93, 0xE2, 0x5E, 0x41,
+   0x8E, 0x28, 0x8D, 0x69, 0x0A, 0x0F, 0x82, 0x0A
+};
+
 static void dsaf_write_sub(struct dsaf_device *dsaf_dev, u32 reg, u32 val)
 {
if (dsaf_dev->sub_ctrl)
@@ -109,6 +130,34 @@ static int cpld_set_led_id(struct hns_mac_cb *mac_cb,
 
 #define RESET_REQ_OR_DREQ 1
 
+static void hns_dsaf_acpi_srst_by_port(struct dsaf_device *dsaf_dev, u8 
op_type,
+  u32 port_type, u32 port, u32 val)
+{
+   union acpi_object *obj;
+   union acpi_object obj_args[3], argv4;
+
+   obj_args[0].integer.type = ACPI_TYPE_INTEGER;
+   obj_args[0].integer.value = port_type;
+   obj_args[1].integer.type = ACPI_TYPE_INTEGER;
+   obj_args[1].integer.value = port;
+   obj_args[2].integer.type = ACPI_TYPE_INTEGER;
+   obj_args[2].integer.value = val;
+
+   argv4.type = ACPI_TYPE_PACKAGE;
+   argv4.package.count = 3;
+   argv4.package.elements = obj_args;
+
+   obj = acpi_evaluate_dsm(ACPI_HANDLE(dsaf_dev->dev),
+   hns_dsaf_acpi_dsm_uuid, 0, op_type, &argv4);
+   if (!obj) {
+   dev_warn(dsaf_dev->dev, "reset port_type%d port%d fail!",
+port_type, port);
+   return;
+   }
+
+   ACPI_FREE(obj);
+}
+
 static void hns_dsaf_rst(struct dsaf_device *dsaf_dev, bool dereset)
 {
u32 xbar_reg_addr;
@@ -126,6 +175,13 @@ static void hns_dsaf_rst(struct dsaf_device *dsaf_dev, 
bool dereset)
dsaf_write_sub(dsaf_dev, nt_reg_addr, RESET_REQ_OR_DREQ);
 }
 
+static void hns_dsaf_rst_acpi(struct dsaf_device *dsaf_dev, bool dereset)
+{
+   hns_dsaf_acpi_srst_by_port(dsaf_dev, HNS_OP_RESET_FUNC,
+  HNS_DSAF_RESET_FUNC,
+  0, dereset);
+}
+
 static void hns_dsaf_xge_srst_by_port(struct dsaf_device *dsaf_dev, u32 port,
  bool dereset)
 {
@@ -146,6 +202,13 @@ static void hns_dsaf_xge_srst_by_port(struct dsaf_device 
*dsaf_dev, u32 port,
dsaf_write_sub(dsaf_dev, reg_addr, reg_val);
 }
 
+static void hns_dsaf_xge_srst_by_port_acpi(struct dsaf_device *dsaf_dev,
+  u32 port, bool dereset)
+{
+   hns_dsaf_acpi_srst_by_port(dsaf_dev, HNS_OP_RESET_FUNC,
+  HNS_XGE_RESET_FUNC, port, dereset);
+}
+
 static void hns_dsaf_xge_core_srst_by_port(struct dsaf_device *dsaf_dev,
   u32 port, bool dereset)
 {
@@ -166,6 +229,14 @@ static void hns_dsaf_xge_core_srst_by_port(struct 
dsaf_device *dsaf_dev,
dsaf_write_sub(dsaf_dev, reg_addr, reg_val);
 }
 
+static void
+hns_dsaf_xge_core_srst_by_port_acpi(struct dsaf_device *dsaf_dev,
+   u32 port, bool dereset)
+{
+   hns_dsaf_acpi_srst_by_port(dsaf_dev, HNS_OP_RESET_FUNC,
+  HNS_XGE_CORE_RESET_FUNC, port, dereset);
+}
+
 static void hns_dsaf_ge_srst_by_port(struct dsaf_device *dsaf_dev, u32 port,
 bool dereset)
 {
@@ -218,6 +289,13 @@ static void hns_dsaf_ge_srst_by_port(struct dsaf_device 
*dsaf_dev, u32 port,
}
 }
 
+static void hns_dsaf_ge_srst_by_port_acpi(struct dsaf_device *dsaf_dev,
+ u32 port, bool dereset)
+{
+   hns_dsaf_acpi_srst_by_port(dsaf_dev, HNS_OP_RESET_FUNC,
+  HNS_GE_RESET_FUNC, port, dereset);
+}
+
 static void hns_ppe_srst_by_port(struct dsaf_device

[PATCH v4 net-next 04/13] net: hisilicon: add support of acpi for hns-mdio

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

hns-mdio needs to register itself with the mii-bus. The device information
can be read from either DT or ACPI.
HNS uses the Linux PHY driver to access PHY devices; the HNS hardware
topology is shown below. An MDIO controller may control several PHY
devices, and each PHY device connects to a MAC device. The MDIO controller
is registered with mdiobus, and each PHY device is then registered when
its MAC finds it.
                       cpu
                        |
                        |
       -----------------------------------
      |                 |                 |
      |                 |                 |
      |               dsaf                |
     MDIO               |                MDIO
      |      -----------------------      |
      |     |       |       |       |     |
      |     |       |       |       |     |
      |    MAC     MAC     MAC     MAC    |
      |     |       |       |       |     |
       ---- |------ |------ |       | ----
           ||      ||      ||       ||
           PHY     PHY     PHY     PHY

In the ACPI case, the driver can handle the reset sequence through the
_RST method in the DSDT.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2:
1. use dev_of_node instead of IS_ENABLED macro
2. Add ACPI bits
 v1: first submit
Link: https://lkml.org/lkml/2016/5/13/93
---
 drivers/net/ethernet/hisilicon/hns_mdio.c | 106 +++---
 1 file changed, 69 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns_mdio.c 
b/drivers/net/ethernet/hisilicon/hns_mdio.c
index 297edc4..761a32f 100644
--- a/drivers/net/ethernet/hisilicon/hns_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns_mdio.c
@@ -7,6 +7,7 @@
  * (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -354,48 +355,60 @@ static int hns_mdio_reset(struct mii_bus *bus)
struct hns_mdio_device *mdio_dev = (struct hns_mdio_device *)bus->priv;
int ret;
 
-   if (!dev_of_node(bus->parent))
-   return -ENOTSUPP;
+   if (dev_of_node(bus->parent)) {
+   if (!mdio_dev->subctrl_vbase) {
+   dev_err(&bus->dev, "mdio sys ctl reg has not maped\n");
+   return -ENODEV;
+   }
 
-   if (!mdio_dev->subctrl_vbase) {
-   dev_err(&bus->dev, "mdio sys ctl reg has not maped\n");
-   return -ENODEV;
-   }
+   /* 1. reset req, and read reset st check */
+   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_RESET_REQ, 0x1,
+   MDIO_SC_RESET_ST, 0x1,
+   MDIO_CHECK_SET_ST);
+   if (ret) {
+   dev_err(&bus->dev, "MDIO reset fail\n");
+   return ret;
+   }
 
-   /*1. reset req, and read reset st check*/
-   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_RESET_REQ, 0x1,
-   MDIO_SC_RESET_ST, 0x1,
-   MDIO_CHECK_SET_ST);
-   if (ret) {
-   dev_err(&bus->dev, "MDIO reset fail\n");
-   return ret;
-   }
+   /* 2. dis clk, and read clk st check */
+   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_CLK_DIS,
+   0x1, MDIO_SC_CLK_ST, 0x1,
+   MDIO_CHECK_CLR_ST);
+   if (ret) {
+   dev_err(&bus->dev, "MDIO dis clk fail\n");
+   return ret;
+   }
 
-   /*2. dis clk, and read clk st check*/
-   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_CLK_DIS,
-   0x1, MDIO_SC_CLK_ST, 0x1,
-   MDIO_CHECK_CLR_ST);
-   if (ret) {
-   dev_err(&bus->dev, "MDIO dis clk fail\n");
-   return ret;
-   }
+   /* 3. reset dreq, and read reset st check */
+   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_RESET_DREQ, 0x1,
+   MDIO_SC_RESET_ST, 0x1,
+   MDIO_CHECK_CLR_ST);
+   if (ret) {
+   dev_err(&bus->dev, "MDIO dis clk fail\n");
+   return ret;
+   }
 
-   /*3. reset dreq, and read reset st check*/
-   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_RESET_DREQ, 0x1,
-   MDIO_SC_RESET_ST, 0x1,
-   MDIO_CHECK_CLR_ST);
-   if (ret) {
-   dev_err(&bus->dev, "MDIO dis clk fail\n");
-   return ret;
+   /* 4. en clk, and read clk st check */
+   ret = mdio_sc_cfg_reg_write(mdio_dev, MDIO_SC_CLK_EN,
+

[PATCH v4 net-next 03/13] net: hisilicon: cleanup to prepare for other cases

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

Hns-mdio only supports the DT case now. Do some cleanup to prepare
for introducing other cases later; no functional change.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v4:
  mii-id gets from dev_name instead of address

 v3:
  first submit
  Link: https://lkml.org/lkml/2016/5/30/298
---
 drivers/net/ethernet/hisilicon/hns_mdio.c | 48 ---
 1 file changed, 18 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns_mdio.c 
b/drivers/net/ethernet/hisilicon/hns_mdio.c
index 765ddb3..297edc4 100644
--- a/drivers/net/ethernet/hisilicon/hns_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns_mdio.c
@@ -354,6 +354,9 @@ static int hns_mdio_reset(struct mii_bus *bus)
struct hns_mdio_device *mdio_dev = (struct hns_mdio_device *)bus->priv;
int ret;
 
+   if (!dev_of_node(bus->parent))
+   return -ENOTSUPP;
+
if (!mdio_dev->subctrl_vbase) {
dev_err(&bus->dev, "mdio sys ctl reg has not maped\n");
return -ENODEV;
@@ -397,24 +400,6 @@ static int hns_mdio_reset(struct mii_bus *bus)
 }
 
 /**
- * hns_mdio_bus_name - get mdio bus name
- * @name: mdio bus name
- * @np: mdio device node pointer
- */
-static void hns_mdio_bus_name(char *name, struct device_node *np)
-{
-   const u32 *addr;
-   u64 taddr = OF_BAD_ADDR;
-
-   addr = of_get_address(np, 0, NULL, NULL);
-   if (addr)
-   taddr = of_translate_address(np, addr);
-
-   snprintf(name, MII_BUS_ID_SIZE, "%s@%llx", np->name,
-(unsigned long long)taddr);
-}
-
-/**
  * hns_mdio_probe - probe mdio device
  * @pdev: mdio platform device
  *
@@ -422,17 +407,16 @@ static void hns_mdio_bus_name(char *name, struct 
device_node *np)
  */
 static int hns_mdio_probe(struct platform_device *pdev)
 {
-   struct device_node *np;
struct hns_mdio_device *mdio_dev;
struct mii_bus *new_bus;
struct resource *res;
-   int ret;
+   int ret = -ENODEV;
 
if (!pdev) {
dev_err(NULL, "pdev is NULL!\r\n");
return -ENODEV;
}
-   np = pdev->dev.of_node;
+
mdio_dev = devm_kzalloc(&pdev->dev, sizeof(*mdio_dev), GFP_KERNEL);
if (!mdio_dev)
return -ENOMEM;
@@ -448,7 +432,7 @@ static int hns_mdio_probe(struct platform_device *pdev)
new_bus->write = hns_mdio_write;
new_bus->reset = hns_mdio_reset;
new_bus->priv = mdio_dev;
-   hns_mdio_bus_name(new_bus->id, np);
+   new_bus->parent = &pdev->dev;
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
mdio_dev->vbase = devm_ioremap_resource(&pdev->dev, res);
@@ -457,16 +441,20 @@ static int hns_mdio_probe(struct platform_device *pdev)
return ret;
}
 
-   mdio_dev->subctrl_vbase =
-   syscon_node_to_regmap(of_parse_phandle(np, "subctrl-vbase", 0));
-   if (IS_ERR(mdio_dev->subctrl_vbase)) {
-   dev_warn(&pdev->dev, "no syscon hisilicon,peri-c-subctrl\n");
-   mdio_dev->subctrl_vbase = NULL;
-   }
-   new_bus->parent = &pdev->dev;
platform_set_drvdata(pdev, new_bus);
+   snprintf(new_bus->id, MII_BUS_ID_SIZE, "%s-%s", "Mii",
+dev_name(&pdev->dev));
+   if (dev_of_node(&pdev->dev)) {
+   mdio_dev->subctrl_vbase = syscon_node_to_regmap(
+   of_parse_phandle(pdev->dev.of_node,
+"subctrl-vbase", 0));
+   if (IS_ERR(mdio_dev->subctrl_vbase)) {
+   dev_warn(&pdev->dev, "no syscon hisilicon,peri-c-subctrl\n");
+   mdio_dev->subctrl_vbase = NULL;
+   }
+   ret = of_mdiobus_register(new_bus, pdev->dev.of_node);
+   }
 
-   ret = of_mdiobus_register(new_bus, np);
if (ret) {
dev_err(&pdev->dev, "Cannot register as MDIO bus!\n");
platform_set_drvdata(pdev, NULL);
-- 
1.9.1
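
The v4 change in this patch builds the bus id from the device name with snprintf() and the "%s-%s" format. A user-space sketch of that call and of what happens when the name does not fit is shown below. DEMO_BUS_ID_SIZE (17 here) and make_bus_id() are assumptions for the demo, not values or functions taken from the patch, and the device names in the usage note are invented.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_BUS_ID_SIZE 17	/* assumed buffer size, for illustration only */

/*
 * Build "Mii-<dev_name>" into id. Returns 1 if the full string fit,
 * 0 if snprintf() had to truncate it (id is still NUL-terminated).
 */
static int make_bus_id(char *id, size_t size, const char *dev_name)
{
	int n;

	assert(id && dev_name && size > 0);
	n = snprintf(id, size, "%s-%s", "Mii", dev_name);
	return n >= 0 && (size_t)n < size;
}
```

With a 17-byte buffer, a short name such as "mdio0" fits, while a long name such as "803c0000.mdio-controller" is truncated, so two long names sharing a prefix could end up with the same id in this sketch.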

[PATCH v4 net-next 07/13] net: hns: enet specify a reference to dsaf by fwnode_handle

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

device_node is only usable in the DT case, so a uniform way to reference
dsaf is needed. fwnode_handle is a suitable choice.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2: remove the redundant line

 v1: first submit
 link: https://lkml.org/lkml/2016/5/13/98
---
 drivers/net/ethernet/hisilicon/hns/hnae.c | 12 ++--
 drivers/net/ethernet/hisilicon/hns/hnae.h |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c | 14 --
 drivers/net/ethernet/hisilicon/hns/hns_enet.h |  2 +-
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.c 
b/drivers/net/ethernet/hisilicon/hns/hnae.c
index 3bfe36f..d630acd 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.c
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.c
@@ -96,16 +96,16 @@ static int __ae_match(struct device *dev, const void *data)
 {
struct hnae_ae_dev *hdev = cls_to_ae_dev(dev);
 
-   return hdev->dev->of_node == data;
+   return (data == &hdev->dev->of_node->fwnode);
 }
 
-static struct hnae_ae_dev *find_ae(const struct device_node *ae_node)
+static struct hnae_ae_dev *find_ae(const struct fwnode_handle *fwnode)
 {
struct device *dev;
 
-   WARN_ON(!ae_node);
+   WARN_ON(!fwnode);
 
-   dev = class_find_device(hnae_class, NULL, ae_node, __ae_match);
+   dev = class_find_device(hnae_class, NULL, fwnode, __ae_match);
 
return dev ? cls_to_ae_dev(dev) : NULL;
 }
@@ -312,7 +312,7 @@ EXPORT_SYMBOL(hnae_reinit_handle);
  * return handle ptr or ERR_PTR
  */
 struct hnae_handle *hnae_get_handle(struct device *owner_dev,
-   const struct device_node *ae_node,
+   const struct fwnode_handle  *fwnode,
u32 port_id,
struct hnae_buf_ops *bops)
 {
@@ -321,7 +321,7 @@ struct hnae_handle *hnae_get_handle(struct device 
*owner_dev,
int i, j;
int ret;
 
-   dev = find_ae(ae_node);
+   dev = find_ae(fwnode);
if (!dev)
return ERR_PTR(-ENODEV);
 
diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h 
b/drivers/net/ethernet/hisilicon/hns/hnae.h
index e8d36aa..f5f8140 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.h
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
@@ -528,7 +528,7 @@ struct hnae_handle {
 #define ring_to_dev(ring) ((ring)->q->dev->dev)
 
 struct hnae_handle *hnae_get_handle(struct device *owner_dev,
-   const struct device_node *ae_node,
+   const struct fwnode_handle  *fwnode,
u32 port_id,
struct hnae_buf_ops *bops);
 
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 8851420..93f6ccb 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -1807,7 +1807,7 @@ static int hns_nic_try_get_ae(struct net_device *ndev)
int ret;
 
h = hnae_get_handle(&priv->netdev->dev,
-   priv->ae_node, priv->port_id, NULL);
+   priv->fwnode, priv->port_id, NULL);
if (IS_ERR_OR_NULL(h)) {
ret = -ENODEV;
dev_dbg(priv->dev, "has not handle, register notifier!\n");
@@ -1867,7 +1867,7 @@ static int hns_nic_dev_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct net_device *ndev;
struct hns_nic_priv *priv;
-   struct device_node *node = dev->of_node;
+   struct device_node *ae_node;
u32 port_id;
int ret;
 
@@ -1881,17 +1881,19 @@ static int hns_nic_dev_probe(struct platform_device 
*pdev)
priv->dev = dev;
priv->netdev = ndev;
 
-   if (of_device_is_compatible(node, "hisilicon,hns-nic-v1"))
+   if (of_device_is_compatible(dev->of_node, "hisilicon,hns-nic-v1"))
priv->enet_ver = AE_VERSION_1;
else
priv->enet_ver = AE_VERSION_2;
 
-   priv->ae_node = (void *)of_parse_phandle(node, "ae-handle", 0);
-   if (IS_ERR_OR_NULL(priv->ae_node)) {
-   ret = PTR_ERR(priv->ae_node);
+   ae_node = of_parse_phandle(dev->of_node, "ae-handle", 0);
+   if (IS_ERR_OR_NULL(ae_node)) {
+   ret = PTR_ERR(ae_node);
dev_err(dev, "not find ae-handle\n");
goto out_read_prop_fail;
}
+   priv->fwnode = &ae_node->fwnode;
+
/* try to find port-idx-in-ae first */
ret = device_property_read_u32(dev, "port-idx-in-ae", &port_id);
if (ret) {
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.h 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.h
index 337efa5..44bb301 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.h
+++

[PATCH v4 net-next 13/13] net: hns: net: hns: enet adds support of acpi

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

Enet needs to get its configuration parameters via ACPI. This patch
adds ACPI support for enet. The configuration parameters are
set in the BIOS.
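
As a rough user-space illustration of the version selection, the two ACPI
IDs from the match table can be mapped to an engine version as below. This
is a simplification: the patch itself calls acpi_dev_found() on each ID,
whereas this sketch compares a single ID string. The enum values and the
AE_VERSION_UNKNOWN member are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical values; the driver's AE_VERSION_* constants differ. */
enum ae_version { AE_VERSION_UNKNOWN = 0, AE_VERSION_1, AE_VERSION_2 };

/* Map an ACPI hardware ID to the engine version it denotes. */
static enum ae_version enet_version_from_hid(const char *hid)
{
	assert(hid != NULL);

	if (!strcmp(hid, "HISI00C1"))
		return AE_VERSION_1;
	if (!strcmp(hid, "HISI00C2"))
		return AE_VERSION_2;

	return AE_VERSION_UNKNOWN;
}
```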

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2:
 1. use acpi_dev_found() instead of acpi_match_device_ids()
 2. use is_acpi_node() to check for the ACPI case
 3. use dev_of_node() to check for the DT case

 v1: first submit
  link: https://lkml.org/lkml/2016/5/13/99
---
 drivers/net/ethernet/hisilicon/hns/hns_enet.c | 56 +--
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 3ec3c27..ad742a6 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -132,6 +132,13 @@ static void fill_v2_desc(struct hnae_ring *ring, void 
*priv,
ring_ptr_move_fw(ring, next_to_use);
 }
 
+static const struct acpi_device_id hns_enet_acpi_match[] = {
+   { "HISI00C1", 0 },
+   { "HISI00C2", 0 },
+   { },
+};
+MODULE_DEVICE_TABLE(acpi, hns_enet_acpi_match);
+
 static void fill_desc(struct hnae_ring *ring, void *priv,
  int size, dma_addr_t dma, int frag_end,
  int buf_num, enum hns_desc_type type, int mtu)
@@ -1870,7 +1877,6 @@ static int hns_nic_dev_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct net_device *ndev;
struct hns_nic_priv *priv;
-   struct device_node *ae_node;
u32 port_id;
int ret;
 
@@ -1884,20 +1890,45 @@ static int hns_nic_dev_probe(struct platform_device 
*pdev)
priv->dev = dev;
priv->netdev = ndev;
 
-   if (of_device_is_compatible(dev->of_node, "hisilicon,hns-nic-v1"))
-   priv->enet_ver = AE_VERSION_1;
-   else
-   priv->enet_ver = AE_VERSION_2;
+   if (dev_of_node(dev)) {
+   struct device_node *ae_node;
 
-   ae_node = of_parse_phandle(dev->of_node, "ae-handle", 0);
-   if (IS_ERR_OR_NULL(ae_node)) {
-   ret = PTR_ERR(ae_node);
-   dev_err(dev, "not find ae-handle\n");
-   goto out_read_prop_fail;
+   if (of_device_is_compatible(dev->of_node,
+   "hisilicon,hns-nic-v1"))
+   priv->enet_ver = AE_VERSION_1;
+   else
+   priv->enet_ver = AE_VERSION_2;
+
+   ae_node = of_parse_phandle(dev->of_node, "ae-handle", 0);
+   if (IS_ERR_OR_NULL(ae_node)) {
+   ret = PTR_ERR(ae_node);
+   dev_err(dev, "not find ae-handle\n");
+   goto out_read_prop_fail;
+   }
+   priv->fwnode = &ae_node->fwnode;
+   } else if (is_acpi_node(dev->fwnode)) {
+   struct acpi_reference_args args;
+
+   if (acpi_dev_found(hns_enet_acpi_match[0].id))
+   priv->enet_ver = AE_VERSION_1;
+   else if (acpi_dev_found(hns_enet_acpi_match[1].id))
+   priv->enet_ver = AE_VERSION_2;
+   else
+   return -ENXIO;
+
+   /* try to find port-idx-in-ae first */
+   ret = acpi_node_get_property_reference(dev->fwnode,
+  "ae-handle", 0, &args);
+   if (ret) {
+   dev_err(dev, "not find ae-handle\n");
+   goto out_read_prop_fail;
+   }
+   priv->fwnode = acpi_fwnode_handle(args.adev);
+   } else {
+   dev_err(dev, "cannot read cfg data from OF or acpi\n");
+   return -ENXIO;
}
-   priv->fwnode = &ae_node->fwnode;
 
-   /* try to find port-idx-in-ae first */
ret = device_property_read_u32(dev, "port-idx-in-ae", &port_id);
if (ret) {
/* only for old code compatible */
@@ -2014,6 +2045,7 @@ static struct platform_driver hns_nic_dev_driver = {
.driver = {
.name = "hns-nic",
.of_match_table = hns_enet_of_match,
+   .acpi_match_table = ACPI_PTR(hns_enet_acpi_match),
},
.probe = hns_nic_dev_probe,
.remove = hns_nic_dev_remove,
-- 
1.9.1

[PATCH v4 net-next 10/13] net: hns: dsaf adds support of acpi

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

Dsaf needs to get its configuration parameters via ACPI, so this patch adds
ACPI support.

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2:
  1. use dev_of_node() instead of IS_ENABLED() to check for the
DT case,
  2. split out a new patch to implement the misc operation method,
  3. use acpi_dev_found() instead of acpi_match_device_ids() to
check which hw version it is,
  4. use is_acpi_node() instead of ACPI_COMPANION to check for the
ACPI case.

 v1: first submit
  link: https://lkml.org/lkml/2016/5/13/108
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c  | 80 ++--
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 85 +++---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 32 
 3 files changed, 114 insertions(+), 83 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
index 2ebf14a..3ef0c9b 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
@@ -689,9 +689,7 @@ static int  hns_mac_get_info(struct hns_mac_cb *mac_cb)
return 0;
}
 
-   if (!is_of_node(mac_cb->fw_port))
-   return -EINVAL;
-
+   if (is_of_node(mac_cb->fw_port)) {
/* parse property from port subnode in dsaf */
np = of_parse_phandle(to_of_node(mac_cb->fw_port), "phy-handle", 0);
mac_cb->phy_dev = of_phy_find_device(np);
@@ -701,47 +699,49 @@ static int  hns_mac_get_info(struct hns_mac_cb *mac_cb)
mac_cb->mac_id, np->name);
}
 
-   syscon = syscon_node_to_regmap(
-   of_parse_phandle(to_of_node(mac_cb->fw_port),
-"serdes-syscon", 0));
-   if (IS_ERR_OR_NULL(syscon)) {
-   dev_err(mac_cb->dev, "serdes-syscon is needed!\n");
-   return -EINVAL;
-   }
-   mac_cb->serdes_ctrl = syscon;
-
-   ret = fwnode_property_read_u32(mac_cb->fw_port,
-  "port-rst-offset",
-  &mac_cb->port_rst_off);
-   if (ret) {
-   dev_dbg(mac_cb->dev,
-   "mac%d port-rst-offset not found, use default value.\n",
-   mac_cb->mac_id);
-   }
+   syscon = syscon_node_to_regmap(
+   of_parse_phandle(to_of_node(mac_cb->fw_port),
+"serdes-syscon", 0));
+   if (IS_ERR_OR_NULL(syscon)) {
+   dev_err(mac_cb->dev, "serdes-syscon is needed!\n");
+   return -EINVAL;
+   }
+   mac_cb->serdes_ctrl = syscon;
 
-   ret = fwnode_property_read_u32(mac_cb->fw_port,
-  "port-mode-offset",
-  &mac_cb->port_mode_off);
-   if (ret) {
-   dev_dbg(mac_cb->dev,
-   "mac%d port-mode-offset not found, use default value.\n",
-   mac_cb->mac_id);
-   }
+   ret = fwnode_property_read_u32(mac_cb->fw_port,
+  "port-rst-offset",
+  &mac_cb->port_rst_off);
+   if (ret) {
+   dev_dbg(mac_cb->dev,
+   "mac%d port-rst-offset not found, use default value.\n",
+   mac_cb->mac_id);
+   }
 
-   ret = of_parse_phandle_with_fixed_args(to_of_node(mac_cb->fw_port),
-  "cpld-syscon", 1, 0, &cpld_args);
-   if (ret) {
-   dev_dbg(mac_cb->dev, "mac%d no cpld-syscon found.\n",
-   mac_cb->mac_id);
-   mac_cb->cpld_ctrl = NULL;
-   } else {
-   syscon = syscon_node_to_regmap(cpld_args.np);
-   if (IS_ERR_OR_NULL(syscon)) {
-   dev_dbg(mac_cb->dev, "no cpld-syscon found!\n");
+   ret = fwnode_property_read_u32(mac_cb->fw_port,
+  "port-mode-offset",
+  &mac_cb->port_mode_off);
+   if (ret) {
+   dev_dbg(mac_cb->dev,
+   "mac%d port-mode-offset not found, use default value.\n",
+   mac_cb->mac_id);
+   }
+
+   ret = of_parse_phandle_with_fixed_args(
+   to_of_node(mac_cb->fw_port), "cpld-syscon", 1, 0,
+   &cpld_args);
+   if (ret) {
+   dev_dbg(mac_cb->dev, "mac%d no cpld-syscon found.\n",
+   mac_cb->mac_id);
mac_cb->cpld_ctrl = NULL;

[PATCH v4 net-next 11/13] net: hns: register phy device in each mac initial sequence

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

In the ACPI case, there is no interface to register a PHY device on the
MDIO bus. The PHY device has to be registered on the MDIO bus explicitly,
and then enet can get the PHY device's info so that it can configure the
PHY device to transmit and receive data.
The HNS hardware topology is shown below. The MDIO controller may control
several PHY devices, and each PHY device connects to a MAC device. A PHY
device is registered when its MAC finds it during the initial sequence.

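                       cpu
                        |
                        |
     -------------------------------------------
    |                   |                       |
    |                   |                       |
    |                  dsaf                     |
   MDIO                 |                      MDIO
    |     -------------------------------       |
    |     |         |         |         |       |
    |     |         |         |         |       |
    |    MAC       MAC       MAC       MAC      |
    |     |         |         |         |       |
     -----|---------|---------|---------|-------
         ||        ||        ||        ||
         PHY       PHY       PHY       PHY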
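
The two checks made before a PHY is registered (address range, and MDIO
clause chosen from the "phy-mode" string) can be sketched in user space as
follows. The helper names are hypothetical; PHY_MAX_ADDR is redefined here
with the value 32 so that the block stands alone, and the strings "xgmii"
and "sgmii" are the names of the two interface modes the patch accepts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PHY_MAX_ADDR 32	/* MDIO addresses are 0..31 */

/* Return the address if it is valid, or a negative errno otherwise. */
static int phy_check_addr(uint32_t addr)
{
	if (addr >= PHY_MAX_ADDR)
		return -EINVAL;

	return (int)addr;
}

/* Select clause 45 for XGMII and clause 22 for SGMII; reject the rest. */
static int phy_mode_is_c45(const char *phy_mode, bool *is_c45)
{
	assert(phy_mode != NULL && is_c45 != NULL);

	if (!strcmp(phy_mode, "xgmii"))
		*is_c45 = true;
	else if (!strcmp(phy_mode, "sgmii"))
		*is_c45 = false;
	else
		return -ENODATA;

	return 0;
}
```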

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
change log:
 v2: fix the build error by kbuild test robot

 v1: first submit
  link: https://lkml.org/lkml/2016/5/13/97
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c | 133 --
 1 file changed, 126 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
index 3ef0c9b..c526558 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
@@ -7,6 +7,7 @@
  * (at your option) any later version.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -638,6 +639,115 @@ free_mac_drv:
return ret;
 }
 
+static int
+hns_mac_phy_parse_addr(struct device *dev, struct fwnode_handle *fwnode)
+{
+   u32 addr;
+   int ret;
+
+   ret = fwnode_property_read_u32(fwnode, "phy-addr", &addr);
+   if (ret) {
+   dev_err(dev, "has invalid PHY address ret:%d\n", ret);
+   return ret;
+   }
+
+   if (addr >= PHY_MAX_ADDR) {
+   dev_err(dev, "PHY address %i is too large\n", addr);
+   return -EINVAL;
+   }
+
+   return addr;
+}
+
+static int hns_mac_phydev_match(struct device *dev, void *fwnode)
+{
+   return dev->fwnode == fwnode;
+}
+
+static struct
+platform_device *hns_mac_find_platform_device(struct fwnode_handle *fwnode)
+{
+   struct device *dev;
+
+   dev = bus_find_device(&platform_bus_type, NULL,
+ fwnode, hns_mac_phydev_match);
+   return dev ? to_platform_device(dev) : NULL;
+}
+
+static int
+hns_mac_register_phydev(struct mii_bus *mdio, struct hns_mac_cb *mac_cb,
+   u32 addr)
+{
+   struct phy_device *phy;
+   const char *phy_type;
+   bool is_c45;
+   int rc;
+
+   rc = fwnode_property_read_string(mac_cb->fw_port,
+"phy-mode", &phy_type);
+   if (rc < 0)
+   return rc;
+
+   if (!strcmp(phy_type, phy_modes(PHY_INTERFACE_MODE_XGMII)))
+   is_c45 = 1;
+   else if (!strcmp(phy_type, phy_modes(PHY_INTERFACE_MODE_SGMII)))
+   is_c45 = 0;
+   else
+   return -ENODATA;
+
+   phy = get_phy_device(mdio, addr, is_c45);
+   if (!phy || IS_ERR(phy))
+   return -EIO;
+
+   if (mdio->irq)
+   phy->irq = mdio->irq[addr];
+
+   /* All data is now stored in the phy struct;
+* register it
+*/
+   rc = phy_device_register(phy);
+   if (rc) {
+   phy_device_free(phy);
+   return -ENODEV;
+   }
+
+   mac_cb->phy_dev = phy;
+
+   dev_dbg(&mdio->dev, "registered phy at address %i\n", addr);
+
+   return 0;
+}
+
+static void hns_mac_register_phy(struct hns_mac_cb *mac_cb)
+{
+   struct acpi_reference_args args;
+   struct platform_device *pdev;
+   struct mii_bus *mii_bus;
+   int rc;
+   int addr;
+
+   /* Loop over the child nodes and register a phy_device for each one */
+   if (!to_acpi_device_node(mac_cb->fw_port))
+   return;
+
+   rc = acpi_node_get_property_reference(
+   mac_cb->fw_port, "mdio-node", 0, &args);
+   if (rc)
+   return;
+
+   addr = hns_mac_phy_parse_addr(mac_cb->dev, mac_cb->fw_port);
+   if (addr < 0)
+   return;
+
+   /* dev address in adev */
+   pdev = hns_mac_find_platform_device(acpi_fwnode_handle(args.adev));
+   mii_bus = platform_get_drvdata(pdev);
+   rc = hns_mac_register_phydev(mii_bus, mac_cb, addr);
+   if (!rc)
+   dev_dbg(mac_cb->dev, "mac%d register phy addr:%d\n",

[PATCH v4 net-next 02/13] ACPI: bus: add stub acpi_evaluate_dsm() to linux/acpi.h

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

acpi_evaluate_dsm() will be used to handle the _DSM method in the ACPI case.
It is also compiled in the non-ACPI case, but the function is declared in
acpi_bus.h, which can only be used in the ACPI case, so this patch adds a
stub function to linux/acpi.h so that non-ACPI builds compile successfully.
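
The stub pattern itself can be shown in a self-contained form. In this
sketch every name is hypothetical (MODEL_CONFIG_ACPI, model_evaluate_dsm,
model_reset_port, and the stand-in union); only the shape mirrors the patch:
a real declaration when the feature is configured, and a static inline
returning NULL otherwise, so callers build unchanged in both configurations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for union acpi_object. */
union acpi_object_model { int integer; };

#ifdef MODEL_CONFIG_ACPI
/* Real implementation is provided elsewhere when the feature is enabled. */
union acpi_object_model *model_evaluate_dsm(void *handle, int rev, int func);
#else
/* Stub: lets callers compile when the feature is disabled. */
static inline union acpi_object_model *
model_evaluate_dsm(void *handle, int rev, int func)
{
	(void)handle;
	(void)rev;
	(void)func;
	return NULL;
}
#endif

/* A caller that needs no #ifdef of its own; NULL means "no result". */
static int model_reset_port(void *handle)
{
	union acpi_object_model *obj;

	assert(handle != NULL);

	obj = model_evaluate_dsm(handle, 0, 1);
	return obj ? 0 : -1;
}
```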

Cc: Rafael J. Wysocki 
Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
 include/linux/acpi.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 3025d19..4d4bb49 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -659,6 +659,14 @@ static inline bool acpi_driver_match_device(struct device 
*dev,
return false;
 }
 
+static inline union acpi_object *acpi_evaluate_dsm(acpi_handle handle,
+  const u8 *uuid,
+  int rev, int func,
+  union acpi_object *argv4)
+{
+   return NULL;
+}
+
 static inline int acpi_device_uevent_modalias(struct device *dev,
struct kobj_uevent_env *env)
 {
-- 
1.9.1

[PATCH v4 net-next 05/13] net: hns: use device_* APIs instead of of_* APIs

2016-06-02 Thread Yisen Zhuang

From: Kejian Yan 

The OF family of functions can be used only in the DT case. Use the unified
device property functions instead to support both DT and ACPI.
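
A small user-space model may help show why a unified accessor removes the
DT-only dependency: the caller asks for a property by name and never
inspects which firmware interface supplied it. All types and helpers below
are hypothetical; only the property names "port-idx-in-ae" and "port-id",
and the order in which they are tried, come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a device's firmware-provided properties. */
struct fw_prop { const char *name; uint32_t value; };
enum fw_kind { FW_KIND_OF, FW_KIND_ACPI };
struct fw_dev {
	enum fw_kind kind;		/* where the properties came from */
	const struct fw_prop *props;
	size_t nprops;
};

/* Unified read: the lookup does not depend on dev->kind. */
static int dev_prop_read_u32(const struct fw_dev *dev, const char *name,
			     uint32_t *out)
{
	assert(dev != NULL && name != NULL && out != NULL);

	for (size_t i = 0; i < dev->nprops; i++) {
		if (!strcmp(dev->props[i].name, name)) {
			*out = dev->props[i].value;
			return 0;
		}
	}

	return -EINVAL;
}

/* Try the new property first, then fall back to the old one. */
static int read_port_id(const struct fw_dev *dev, uint32_t *port_id)
{
	int ret = dev_prop_read_u32(dev, "port-idx-in-ae", port_id);

	if (ret)
		ret = dev_prop_read_u32(dev, "port-id", port_id);

	return ret;
}
```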

Signed-off-by: Kejian Yan 
Signed-off-by: Yisen Zhuang 
---
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c |  9 +
 drivers/net/ethernet/hisilicon/hns/hns_enet.c  | 11 +++
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c 
b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
index 1c2ddb2..9afc5e6 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
@@ -50,7 +50,7 @@ int hns_dsaf_get_cfg(struct dsaf_device *dsaf_dev)
else
dsaf_dev->dsaf_ver = AE_VERSION_2;
 
-   ret = of_property_read_string(np, "mode", &mode_str);
+   ret = device_property_read_string(dsaf_dev->dev, "mode", &mode_str);
if (ret) {
dev_err(dsaf_dev->dev, "get dsaf mode fail, ret=%d!\n", ret);
return ret;
@@ -142,7 +142,7 @@ int hns_dsaf_get_cfg(struct dsaf_device *dsaf_dev)
}
}
 
-   ret = of_property_read_u32(np, "desc-num", &desc_num);
+   ret = device_property_read_u32(dsaf_dev->dev, "desc-num", &desc_num);
if (ret < 0 || desc_num < HNS_DSAF_MIN_DESC_CNT ||
desc_num > HNS_DSAF_MAX_DESC_CNT) {
dev_err(dsaf_dev->dev, "get desc-num(%d) fail, ret=%d!\n",
@@ -151,14 +151,15 @@ int hns_dsaf_get_cfg(struct dsaf_device *dsaf_dev)
}
dsaf_dev->desc_num = desc_num;
 
-   ret = of_property_read_u32(np, "reset-field-offset", &reset_offset);
+   ret = device_property_read_u32(dsaf_dev->dev, "reset-field-offset",
+  &reset_offset);
if (ret < 0) {
dev_dbg(dsaf_dev->dev,
"get reset-field-offset fail, ret=%d!\r\n", ret);
}
dsaf_dev->reset_offset = reset_offset;
 
-   ret = of_property_read_u32(np, "buf-size", &buf_size);
+   ret = device_property_read_u32(dsaf_dev->dev, "buf-size", &buf_size);
if (ret < 0) {
dev_err(dsaf_dev->dev,
"get buf-size fail, ret=%d!\r\n", ret);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c 
b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index e621636..8851420 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -1067,13 +1067,8 @@ void hns_nic_update_stats(struct net_device *netdev)
 static void hns_init_mac_addr(struct net_device *ndev)
 {
struct hns_nic_priv *priv = netdev_priv(ndev);
-   struct device_node *node = priv->dev->of_node;
-   const void *mac_addr_temp;
 
-   mac_addr_temp = of_get_mac_address(node);
-   if (mac_addr_temp && is_valid_ether_addr(mac_addr_temp)) {
-   memcpy(ndev->dev_addr, mac_addr_temp, ndev->addr_len);
-   } else {
+   if (!device_get_mac_address(priv->dev, ndev->dev_addr, ETH_ALEN)) {
eth_hw_addr_random(ndev);
dev_warn(priv->dev, "No valid mac, use random mac %pM",
 ndev->dev_addr);
@@ -1898,10 +1893,10 @@ static int hns_nic_dev_probe(struct platform_device 
*pdev)
goto out_read_prop_fail;
}
/* try to find port-idx-in-ae first */
-   ret = of_property_read_u32(node, "port-idx-in-ae", &port_id);
+   ret = device_property_read_u32(dev, "port-idx-in-ae", &port_id);
if (ret) {
/* only for old code compatible */
-   ret = of_property_read_u32(node, "port-id", &port_id);
+   ret = device_property_read_u32(dev, "port-id", &port_id);
if (ret)
goto out_read_prop_fail;
/* for old dts, we need to caculate the port offset */
-- 
1.9.1

Re: [PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Greg KH

On Thu, Jun 02, 2016 at 06:32:42PM +0000, mario_limoncie...@dell.com wrote:
> > And you want to check this for all Dell devices?  Please be model
> > specific, I doubt a bunch of Dell servers wants to run this code...
> > 
> 
> Tracking model specific is really going to turn into a giant never-ending
> list.
> To drill down more specifically, I can match on chassis too.

Yes, as this is a vendor/platform-specific "quirk", you will have to
update it for each and every individual device you want it enabled as it
is so different from what all other drivers do.

thanks,

greg k-h

Re: [PATCH v2 6/7] Binding:PHY: Binding doc for NS2 PCIe PHYs.

2016-06-02 Thread Rob Herring

On Tue, May 31, 2016 at 07:06:40PM +0530, Pramod Kumar wrote:
> Binding doc for NS2 PCIe PHYs.
> 
> Signed-off-by: Jon Mason 
> Signed-off-by: Pramod Kumar 
> ---
>  .../bindings/phy/brcm,mdio-mux-bus-pci.txt | 27 
> ++
>  1 file changed, 27 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/phy/brcm,mdio-mux-bus-pci.txt

Acked-by: Rob Herring

Re: [PATCH v2 3/7] binding: mdio-mux: Add DT binding doc for Broadcom MDIO bus mutiplexer

2016-06-02 Thread Rob Herring

On Tue, May 31, 2016 at 07:06:37PM +0530, Pramod Kumar wrote:
> Add DT binding doc for Broadcom MDIO bus mutiplexer driver.
> 
> Signed-off-by: Pramod Kumar 
> ---
>  .../bindings/net/brcm,mdio-mux-iproc.txt   | 60 
> ++
>  1 file changed, 60 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/net/brcm,mdio-mux-iproc.txt
> 
> diff --git a/Documentation/devicetree/bindings/net/brcm,mdio-mux-iproc.txt 
> b/Documentation/devicetree/bindings/net/brcm,mdio-mux-iproc.txt
> new file mode 100644
> index 0000000..f270b41
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/brcm,mdio-mux-iproc.txt
> @@ -0,0 +1,60 @@
> +Properties for an MDIO bus mutiplexer found in Broadcom iProc based SoCs.
> +
> +This MDIO bus multiplexer defines buses that could be internal as well as
> +external to SoCs and could accept MDIO transaction compatible to C-22 or
> +C-45 Clause. When Child bus is selected, one need to select these two

s/Child/child/

s/need/needs/

> +properties as well to generate desired MDIO trascation on appropriate bus.
> +
> +Required properties in addition to the generic multiplexer properties:
> +
> +MDIO multiplexer node:
> +- complatible: brcm,mdio-mux-iproc.

typo

> +
> +Every non-ethernet PHY requires a compatible so that it could be probed based
> +on this compatible string.
> +
> +Additional information regarding generic multiplexer properties could be 
> found

s/could/can/

> +at- Documentation/devicetree/bindings/net/mdio-mux.txt
> +
> +
> +for example:
> + mdio_mux_iproc: mdio_mux_iproc@6602023c {

No '_' in node names.

mdio-mux@...

> + compatible = "brcm,mdio-mux-iproc";
> + reg = <0x6602023c 0x14>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> + mdio-integrated-mux;
> +
> + mdio@0 {
> + reg = <0x0>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + pci_phy0: pci-phy@0 {
> + compatible = "brcm,ns2-pcie-phy";
> + reg = <0x0>;
> + #phy-cells = <0>;
> + };
> + };
> +
> + mdio@7 {
> + reg = <0x7>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + pci_phy1: pci-phy@0 {
> + compatible = "brcm,ns2-pcie-phy";
> + reg = <0x0>;
> + #phy-cells = <0>;
> + };
> + };
> + mdio@10 {
> + reg = <0x10>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + gphy0: eth-phy@10 {
> + reg = <0x10>;
> + };
> + };
> + };
> -- 
> 1.9.1
>

[PATCH net] ethernet/sfc: report supported link speeds on SFP connections

2016-06-02 Thread Jarod Wilson

My solarflare cards connected to a 10GbE switch with an SFP+ module/cable
don't currently report any supported link speeds:

$ ethtool ens4f0
Settings for ens4f0:
Supported ports: [ FIBRE ]
Supported link modes:   Not reported
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Advertised link modes:  Not reported
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Link partner advertised link modes:  10000baseKX4/Full
Link partner advertised pause frame use: Symmetric
Link partner advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 255
Transceiver: internal
Auto-negotiation: on
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x20f7 (8439)
   drv probe link ifdown ifup rx_err tx_err hw
Link detected: yes

I've navigated my way through the sfc code down to mcdi_to_ethtool_cap's
switch on media's MC_CMD_MEDIA_SFP_PLUS case, where no speeds are set.
If we just do some cap checks similar to the MC_CMD_MEDIA_KX4 case, I get
the expected output:

$ ethtool ens4f0
Settings for ens4f0:
Supported ports: [ FIBRE ]
Supported link modes:   1000baseKX/Full
10000baseKX4/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Advertised link modes:  Not reported
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Link partner advertised link modes:  10000baseKX4/Full
Link partner advertised pause frame use: Symmetric
Link partner advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 255
Transceiver: internal
Auto-negotiation: on
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x20f7 (8439)
   drv probe link ifdown ifup rx_err tx_err hw
Link detected: yes

This is from an sfc9120 interface here. It also applies to a 9140 with a
10GbE breakout cable.
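
For readers without the sfc sources at hand, the capability check can be
modelled in user space as below. This is a sketch, not the driver code: the
macro names are shortened and the bit positions are illustrative stand-ins
for the MC_CMD_PHY_CAP_* and SUPPORTED_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions for firmware capability flags (assumption). */
#define CAP_1000FDX_LBN		6
#define CAP_10000FDX_LBN	7

/* Illustrative ethtool-style "supported" bits (assumption). */
#define SUP_FIBRE		(1u << 10)
#define SUP_1000baseKX_Full	(1u << 17)
#define SUP_10000baseKX4_Full	(1u << 18)

/* Translate firmware capability bits into supported link modes for SFP+. */
static uint32_t sfp_plus_caps_to_supported(uint32_t cap)
{
	uint32_t result = SUP_FIBRE;

	if (cap & (1u << CAP_1000FDX_LBN))
		result |= SUP_1000baseKX_Full;
	if (cap & (1u << CAP_10000FDX_LBN))
		result |= SUP_10000baseKX4_Full;

	/* The port type is reported regardless of the speed bits. */
	assert(result & SUP_FIBRE);
	return result;
}
```

With no speed bits set, only the FIBRE port flag is returned, which
corresponds to the "Not reported" output shown first.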

Side note: wiring up Advertised by simply copying Supported seems to be a
thing many other drivers do. Worth doing here?...

CC: Solarflare linux maintainers 
CC: Edward Cree 
CC: Bert Kenward 
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson 
---
 drivers/net/ethernet/sfc/mcdi_port.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/sfc/mcdi_port.c 
b/drivers/net/ethernet/sfc/mcdi_port.c
index 7f295c4..6516471 100644
--- a/drivers/net/ethernet/sfc/mcdi_port.c
+++ b/drivers/net/ethernet/sfc/mcdi_port.c
@@ -189,6 +189,10 @@ static u32 mcdi_to_ethtool_cap(u32 media, u32 cap)
 
case MC_CMD_MEDIA_XFP:
case MC_CMD_MEDIA_SFP_PLUS:
+   if (cap & (1 << MC_CMD_PHY_CAP_1000FDX_LBN))
+   result |= SUPPORTED_1000baseKX_Full;
+   if (cap & (1 << MC_CMD_PHY_CAP_10000FDX_LBN))
+   result |= SUPPORTED_10000baseKX4_Full;
result |= SUPPORTED_FIBRE;
break;
 
-- 
1.8.3.1

[PATCH net-next v2 1/2] net: Add l3mdev rule

2016-06-02 Thread David Ahern

Currently, VRFs require 1 oif and 1 iif rule per address family per
VRF. As the number of VRF devices increases it brings scalability
issues with the increasing rule list. All of the VRF rules have the
same format with the exception of the specific table id to direct the
lookup. Since the table id is available from the oif or iif in the
lookup, the VRF rules can be consolidated to a single rule that pulls
the table from the VRF device.

This patch introduces a new rule attribute l3mdev. The l3mdev rule
means the table id used for the lookup is pulled from the L3 master
device (e.g., VRF) rather than being statically defined. With the
l3mdev rule all of the basic VRF FIB rules are reduced to 1 l3mdev
rule per address family (IPv4 and IPv6).

If an admin wishes to insert higher priority rules for specific VRFs
those rules will co-exist with the l3mdev rule. This capability means
current VRF scripts will co-exist with this new simpler implementation.

Currently, the rules list for both ipv4 and ipv6 look like this:
$ ip  ru ls
1000:   from all oif vrf1 lookup 1001
1000:   from all iif vrf1 lookup 1001
1000:   from all oif vrf2 lookup 1002
1000:   from all iif vrf2 lookup 1002
1000:   from all oif vrf3 lookup 1003
1000:   from all iif vrf3 lookup 1003
1000:   from all oif vrf4 lookup 1004
1000:   from all iif vrf4 lookup 1004
1000:   from all oif vrf5 lookup 1005
1000:   from all iif vrf5 lookup 1005
1000:   from all oif vrf6 lookup 1006
1000:   from all iif vrf6 lookup 1006
1000:   from all oif vrf7 lookup 1007
1000:   from all iif vrf7 lookup 1007
1000:   from all oif vrf8 lookup 1008
1000:   from all iif vrf8 lookup 1008
...
32765:  from all lookup local
32766:  from all lookup main
32767:  from all lookup default

With the l3mdev rule the list is just the following regardless of the
number of VRFs:
$ ip ru ls
1000:   from all lookup [l3mdev table]
32765:  from all lookup local
32766:  from all lookup main
32767:  from all lookup default

(Note: the above pretty print of the rule is based on an iproute2
   prototype. Actual verbiage may change)

Signed-off-by: David Ahern 
---
v2
- if CONFIG_NET_L3_MASTER_DEV is not enabled changed the inline
  l3mdev_fib_rule_match function to return 1 rather than 0 allowing
  the compiler to completely drop the check:
 if (rule->l3mdev && !l3mdev_fib_rule_match())

- moved setting of tb_id down to its use in fib4_rule_action which
  addresses Dave's comment about reverse xmas tree order. Same
  change for ipv6 version.

 include/net/fib_rules.h| 24 ++--
 include/net/l3mdev.h   | 12 
 include/uapi/linux/fib_rules.h |  1 +
 net/core/fib_rules.c   | 33 -
 net/ipv4/fib_rules.c   |  6 --
 net/ipv6/fib6_rules.c  |  6 --
 net/l3mdev/l3mdev.c| 38 ++
 7 files changed, 109 insertions(+), 11 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 59160de702b6..456e4a6006ab 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -17,7 +17,8 @@ struct fib_rule {
u32 flags;
u32 table;
u8  action;
-   /* 3 bytes hole, try to use */
+   u8  l3mdev;
+   /* 2 bytes hole, try to use */
u32 target;
__be64  tun_id;
struct fib_rule __rcu   *ctarget;
@@ -36,6 +37,7 @@ struct fib_lookup_arg {
void*lookup_ptr;
void*result;
struct fib_rule *rule;
+   u32 table;
int flags;
 #define FIB_LOOKUP_NOREF   1
 #define FIB_LOOKUP_IGNORE_LINKSTATE2
@@ -89,7 +91,8 @@ struct fib_rules_ops {
[FRA_TABLE] = { .type = NLA_U32 }, \
[FRA_SUPPRESS_PREFIXLEN] = { .type = NLA_U32 }, \
[FRA_SUPPRESS_IFGROUP] = { .type = NLA_U32 }, \
-   [FRA_GOTO]  = { .type = NLA_U32 }
+   [FRA_GOTO]  = { .type = NLA_U32 }, \
+   [FRA_L3MDEV]= { .type = NLA_U8 }
 
 static inline void fib_rule_get(struct fib_rule *rule)
 {
@@ -102,6 +105,20 @@ static inline void fib_rule_put(struct fib_rule *rule)
kfree_rcu(rule, rcu);
 }
 
+#ifdef CONFIG_NET_L3_MASTER_DEV
+static inline u32 fib_rule_get_table(struct fib_rule *rule,
+struct fib_lookup_arg *arg)
+{
+   return rule->l3mdev ? arg->table : rule->table;
+}
+#else
+static inline u32 fib_rule_get_table(struct fib_rule *rule,
+struct fib_lookup_arg *arg)
+{
+   return rule->table;
+}
+#endif
+
 static inline u32

[PATCH net-next v2 2/2] net: vrf: Add l3mdev rules on first device create

2016-06-02 Thread David Ahern

Add l3mdev rule per address family when the first VRF device is
created. Remove them when the last is deleted.
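
The first-create / last-delete bookkeeping can be sketched with C11 atomics
in user space. Note that this is a swapped-in technique: the patch keeps an
atomic_t counter and reads it back after decrementing, whereas this sketch
uses the value returned by the fetch-and-modify operation. All names here
are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Number of live devices; zero-initialised at program start. */
static atomic_int num_vrfs_model;

/* Returns true if this call created the first device (add the rules). */
static bool vrf_model_created(void)
{
	return atomic_fetch_add(&num_vrfs_model, 1) == 0;
}

/* Returns true if this call removed the last device (delete the rules). */
static bool vrf_model_deleted(void)
{
	int prev = atomic_fetch_sub(&num_vrfs_model, 1);

	assert(prev > 0);	/* a delete must pair with an earlier create */
	return prev == 1;
}
```

Deciding from the returned previous value means exactly one caller observes
the 0 to 1 and the 1 to 0 transitions, even if creates and deletes overlap.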

Signed-off-by: David Ahern 
---
v2
- added EXCL flag and EEXIST check. Appropriate once the exclude fib rule
  patch is accepted
- changed 3rd arg to vrf_fib_rule from 0/1 to false/true per Dave's comment

 drivers/net/vrf.c | 119 +-
 1 file changed, 118 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index d356f5d0f7b0..1d13c95cab97 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define RT_FL_TOS(oldflp4) \
((oldflp4)->flowi4_tos & (IPTOS_RT_MASK | RTO_ONLINK))
@@ -42,6 +43,11 @@
 #define DRV_NAME   "vrf"
 #define DRV_VERSION"1.0"
 
+static atomic_t num_vrfs;
+
+static u32 rule_pref = 1000;
+module_param(rule_pref, uint,  S_IRUGO);
+
 struct net_vrf {
struct rtable __rcu *rth;
struct rt6_info __rcu   *rt6;
@@ -729,6 +735,98 @@ static const struct ethtool_ops vrf_ethtool_ops = {
.get_drvinfo= vrf_get_drvinfo,
 };
 
+static inline size_t vrf_fib_rule_nl_size(void)
+{
+   size_t sz;
+
+   sz  = NLMSG_ALIGN(sizeof(struct fib_rule_hdr));
+   sz += nla_total_size(sizeof(u8));   /* FRA_L3MDEV */
+   sz += nla_total_size(sizeof(u32));  /* FRA_PRIORITY */
+
+   return sz;
+}
+
+static int vrf_fib_rule(const struct net_device *dev, __u8 family, bool add_it)
+{
+   struct fib_rule_hdr *frh;
+   struct nlmsghdr *nlh;
+   struct sk_buff *skb;
+   int err;
+
+   skb = nlmsg_new(vrf_fib_rule_nl_size(), GFP_KERNEL);
+   if (!skb)
+   return -ENOMEM;
+
+   nlh = nlmsg_put(skb, 0, 0, 0, sizeof(*frh), 0);
+   if (!nlh)
+   goto nla_put_failure;
+
+   /* rule only needs to appear once */
+   nlh->nlmsg_flags |= NLM_F_EXCL;
+
+   frh = nlmsg_data(nlh);
+   memset(frh, 0, sizeof(*frh));
+   frh->family = family;
+   frh->action = FR_ACT_TO_TBL;
+
+   if (nla_put_u8(skb, FRA_L3MDEV, 1))
+   goto nla_put_failure;
+
+   if (nla_put_u32(skb, FRA_PRIORITY, rule_pref))
+   goto nla_put_failure;
+
+   nlmsg_end(skb, nlh);
+
+   /* fib_nl_{new,del}rule handling looks for net from skb->sk */
+   skb->sk = dev_net(dev)->rtnl;
+   if (add_it) {
+   err = fib_nl_newrule(skb, nlh);
+   if (err == -EEXIST)
+   err = 0;
+   } else {
+   err = fib_nl_delrule(skb, nlh);
+   if (err == -ENOENT)
+   err = 0;
+   }
+   nlmsg_free(skb);
+
+   return err;
+
+nla_put_failure:
+   nlmsg_free(skb);
+
+   return -EMSGSIZE;
+}
+
+static void vrf_del_fib_rules(const struct net_device *dev)
+{
+   if (vrf_fib_rule(dev, AF_INET,  false) ||
+   vrf_fib_rule(dev, AF_INET6, false)) {
+   netdev_err(dev, "Failed to delete FIB rules.\n");
+   }
+}
+
+static int vrf_add_fib_rules(const struct net_device *dev)
+{
+   int err;
+
+   err = vrf_fib_rule(dev, AF_INET,  true);
+   if (err < 0)
+   goto out_err;
+
+   err = vrf_fib_rule(dev, AF_INET6, true);
+   if (err < 0)
+   goto out_err;
+
+   return 0;
+
+out_err:
+   netdev_err(dev, "Failed to add FIB rules.\n");
+   vrf_del_fib_rules(dev);
+
+   return err;
+}
+
 static void vrf_setup(struct net_device *dev)
 {
ether_setup(dev);
@@ -763,12 +861,17 @@ static int vrf_validate(struct nlattr *tb[], struct 
nlattr *data[])
 static void vrf_dellink(struct net_device *dev, struct list_head *head)
 {
unregister_netdevice_queue(dev, head);
+
+   atomic_dec(&num_vrfs);
+   if (!atomic_read(&num_vrfs))
+   vrf_del_fib_rules(dev);
 }
 
 static int vrf_newlink(struct net *src_net, struct net_device *dev,
   struct nlattr *tb[], struct nlattr *data[])
 {
struct net_vrf *vrf = netdev_priv(dev);
+   int err;
 
if (!data || !data[IFLA_VRF_TABLE])
return -EINVAL;
@@ -777,7 +880,21 @@ static int vrf_newlink(struct net *src_net, struct 
net_device *dev,
 
dev->priv_flags |= IFF_L3MDEV_MASTER;
 
-   return register_netdevice(dev);
+   err = register_netdevice(dev);
+   if (err)
+   goto out;
+
+   if (!atomic_read(&num_vrfs)) {
+   err = vrf_add_fib_rules(dev);
+   if (err) {
+   unregister_netdevice(dev);
+   goto out;
+   }
+   }
+
+   atomic_inc(&num_vrfs);
+out:
+   return err;
 }
 
 static size_t vrf_nl_getsize(const struct net_device *dev)
-- 
2.1.4
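
Illustration only, not part of the posted patch: a minimal userspace sketch of the "add rules on first create, remove on last delete" bookkeeping that vrf_newlink() and vrf_dellink() implement above. The struct and function names (demo_state, demo_newlink, demo_dellink) are hypothetical, a plain int stands in for the atomic_t counter, and the netlink calls are reduced to a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver-wide state kept by the patch. */
struct demo_state {
	int  num_vrfs;       /* stands in for the atomic_t num_vrfs */
	bool rules_present;  /* true while the l3mdev FIB rules are installed */
};

/* Models vrf_newlink(): install the rules only for the first device. */
static void demo_newlink(struct demo_state *s)
{
	if (s->num_vrfs == 0)
		s->rules_present = true;   /* where vrf_add_fib_rules() runs */
	s->num_vrfs++;
}

/* Models vrf_dellink(): remove the rules once the last device is gone. */
static void demo_dellink(struct demo_state *s)
{
	assert(s->num_vrfs > 0);
	s->num_vrfs--;
	if (s->num_vrfs == 0)
		s->rules_present = false;  /* where vrf_del_fib_rules() runs */
}
```

Creating two devices and deleting one leaves the rules in place; only the second delete clears them.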

[PATCH net-next v2 0/2] net: vrf: Improve use of FIB rules

2016-06-02 Thread David Ahern

Currently, VRFs require 1 oif and 1 iif rule per address family per
VRF. As the number of VRF devices increases it brings scalability
issues with the increasing rule list. All of the VRF rules have the
same format with the exception of the specific table id to direct the
lookup. Since the table id is available from the oif or iif in the
lookup, the VRF rules can be consolidated to a single rule that pulls
the table from the VRF device.

This solution still allows a user to insert their own rules for VRFs,
including rules with additional attributes. Accordingly, it is backwards
compatible with existing setups and allows other policy routing as
desired.
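
As a rough illustration of the consolidation described above (a userspace sketch, not code from the series): a single l3mdev rule takes its table id from the lookup argument, which is assumed to have been filled in from the oif/iif VRF device, while an ordinary rule keeps using its own table. The types demo_rule and demo_lookup_arg are hypothetical stand-ins for struct fib_rule and struct fib_lookup_arg; the ternary mirrors fib_rule_get_table() from patch 1/2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct fib_rule. */
struct demo_rule {
	bool     l3mdev;  /* rule defers to the VRF device for its table */
	uint32_t table;   /* table id configured on the rule itself */
};

/* Hypothetical stand-in for struct fib_lookup_arg. */
struct demo_lookup_arg {
	uint32_t table;   /* table id resolved from the oif/iif VRF device */
};

/* Same selection as fib_rule_get_table() with CONFIG_NET_L3_MASTER_DEV. */
static uint32_t demo_rule_get_table(const struct demo_rule *rule,
				    const struct demo_lookup_arg *arg)
{
	return rule->l3mdev ? arg->table : rule->table;
}
```

With this shape, one rule per address family can serve every VRF, because the per-VRF part (the table id) travels with the lookup rather than with the rule.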

David Ahern (2):
  net: Add l3mdev rule
  net: vrf: Add l3mdev rules on first device create

 drivers/net/vrf.c  | 119 -
 include/net/fib_rules.h|  24 -
 include/net/l3mdev.h   |  12 +
 include/uapi/linux/fib_rules.h |   1 +
 net/core/fib_rules.c   |  33 ++--
 net/ipv4/fib_rules.c   |   6 ++-
 net/ipv6/fib6_rules.c  |   6 ++-
 net/l3mdev/l3mdev.c|  38 +
 8 files changed, 227 insertions(+), 12 deletions(-)

-- 
2.1.4

Re: [PATCH 0/2] Quiet noisy LSM denial when accessing net sysctl

2016-06-02 Thread James Morris

On Thu, 2 Jun 2016, Tyler Hicks wrote:

> On 05/17/2016 09:13 AM, Tyler Hicks wrote:
> > On 05/08/2016 10:56 PM, David Miller wrote:
> >> From: Tyler Hicks 
> >> Date: Fri,  6 May 2016 18:04:12 -0500
> >>
> >>> This pair of patches does away with what I believe is a useless denial
> >>> audit message when a privileged process initially accesses a net sysctl.
> >>
> >> The LSM folks can apply this if they agree with you.
> > 
> > Hi James - Could you pick up these two bug fix patches? Thanks!
> 
> Hello - Just checking in again to see if you plan on taking these
> through the security tree?

Sure, please resend.

-- 
James Morris

Re: [PATCH v2 2/7] DT: phy.txt: Add mdio-integrated-mux property

2016-06-02 Thread Andrew Lunn

On Thu, Jun 02, 2016 at 06:27:03PM -0500, Rob Herring wrote:
> On Tue, May 31, 2016 at 07:06:36PM +0530, Pramod Kumar wrote:
> > This property is used by integrated MDIO multiplexer
> > which has bus selection and mdio transaction generation logic,
> > integrated inside.
> > 
> > Signed-off-by: Pramod Kumar 
> > ---
> >  Documentation/devicetree/bindings/net/mdio-mux.txt | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/devicetree/bindings/net/mdio-mux.txt 
> > b/Documentation/devicetree/bindings/net/mdio-mux.txt
> > index 491f5bd..b5ad83e 100644
> > --- a/Documentation/devicetree/bindings/net/mdio-mux.txt
> > +++ b/Documentation/devicetree/bindings/net/mdio-mux.txt
> > @@ -5,13 +5,20 @@ numbered uniquely in a device dependent manner.  The 
> > nodes for an MDIO
> >  bus multiplexer/switch will have one child node for each child bus.
> >  
> >  Required properties:
> > -- mdio-parent-bus : phandle to the parent MDIO bus.
> >  - #address-cells = <1>;
> >  - #size-cells = <0>;
> >  
> >  Optional properties:
> > +- mdio-parent-bus : phandle to the parent MDIO bus. Should be used
> > +   if parent mdio bus is not part of multiplexer.
> 
> You don't appear to be using this. When would you?

He is moving it to optional. The mdio-mux-mmio and mdio-mux-gpio do
however use it, which follow this binding.

Andrew

RE: [PATCH] net: fjes: fjes_main: Remove create_workqueue

2016-06-02 Thread Izumi, Taku

Dear Bhaktipriya,

Thanks. Looks good to me.

Sincerely,
Taku Izumi

> -Original Message-
> From: Bhaktipriya Shridhar [mailto:bhaktipriy...@gmail.com]
> Sent: Thursday, June 02, 2016 6:31 PM
> To: David S. Miller; Izumi, Taku/泉 拓; Florian Westphal; Bhaktipriya Shridhar
> Cc: Tejun Heo; netdev@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: [PATCH] net: fjes: fjes_main: Remove create_workqueue
> 
> alloc_workqueue replaces deprecated create_workqueue().
> 
> The workqueue adapter->txrx_wq has workitem
> &adapter->raise_intr_rxdata_task per adapter. Extended Socket Network
> Device is shared memory based, so someone's transmission denotes other's
> reception.  raise_intr_rxdata_task raises interruption of receivers from
> the sender in order to notify receivers.
> 
> The workqueue adapter->control_wq has workitem
> &adapter->interrupt_watch_task per adapter. interrupt_watch_task is used
> to prevent delay of interrupts.
> 
> Dedicated workqueues have been used in both cases since the workitems
> on the workqueues are involved in normal device operation and require
> forward progress under memory pressure.
> 
> max_active has been set to 0 since there is no need for throttling
> the number of active work items.
> 
> Since network devices  may be used for memory reclaim,
> WQ_MEM_RECLAIM has been set to guarantee forward progress.
> 
> Signed-off-by: Bhaktipriya Shridhar 
> ---
>  drivers/net/fjes/fjes_main.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
> index 86c331b..9006877 100644
> --- a/drivers/net/fjes/fjes_main.c
> +++ b/drivers/net/fjes/fjes_main.c
> @@ -1187,8 +1187,9 @@ static int fjes_probe(struct platform_device *plat_dev)
>   adapter->force_reset = false;
>   adapter->open_guard = false;
> 
> - adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
> - adapter->control_wq = create_workqueue(DRV_NAME "/control");
> + adapter->txrx_wq = alloc_workqueue(DRV_NAME "/txrx", WQ_MEM_RECLAIM, 0);
> + adapter->control_wq = alloc_workqueue(DRV_NAME "/control",
> +   WQ_MEM_RECLAIM, 0);
> 
>   INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
>   INIT_WORK(&adapter->raise_intr_rxdata_task,
> --
> 2.1.4
>

Re: [PATCH v2 2/7] DT: phy.txt: Add mdio-integrated-mux property

2016-06-02 Thread Rob Herring

On Tue, May 31, 2016 at 07:06:36PM +0530, Pramod Kumar wrote:
> This property is used by integrated MDIO multiplexer
> which has bus selection and mdio transaction generation logic,
> integrated inside.
> 
> Signed-off-by: Pramod Kumar 
> ---
>  Documentation/devicetree/bindings/net/mdio-mux.txt | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/mdio-mux.txt 
> b/Documentation/devicetree/bindings/net/mdio-mux.txt
> index 491f5bd..b5ad83e 100644
> --- a/Documentation/devicetree/bindings/net/mdio-mux.txt
> +++ b/Documentation/devicetree/bindings/net/mdio-mux.txt
> @@ -5,13 +5,20 @@ numbered uniquely in a device dependent manner.  The nodes 
> for an MDIO
>  bus multiplexer/switch will have one child node for each child bus.
>  
>  Required properties:
> -- mdio-parent-bus : phandle to the parent MDIO bus.
>  - #address-cells = <1>;
>  - #size-cells = <0>;
>  
>  Optional properties:
> +- mdio-parent-bus : phandle to the parent MDIO bus. Should be used
> + if parent mdio bus is not part of multiplexer.

You don't appear to be using this. When would you?

> +- mdio-integrated-mux: boolean property indicating that the hardware
> + is an integrated multiplex supporting muxed bus selection
> + and MDIO transaction logic generation.
>  - Other properties specific to the multiplexer/switch hardware.
>  
> +Note: one of mdio-parent-bus and mdio-integrated-mux is mandatory to
> +get parent bus registered.
> +
>  Required properties for child nodes:
>  - #address-cells = <1>;
>  - #size-cells = <0>;
> -- 
> 1.9.1
>

[PATCH] net: ethernet: ti: cpsw: remove rx_descs property

2016-06-02 Thread Ivan Khoronzhuk

There is no reason to hold a s/w dependent parameter in the device tree.
Moreover, the parameter serves no purpose, because the davinci_cpdma
driver splits the pool of descriptors equally between tx and rx channels.
That is, if number of descriptors 256, 128 of them are for rx
channels. While receiving, the descriptor is freed to the pool and
then allocated with new skb. And if in DT the "rx_descs" is set to
64, then 128 - 64 = 64 descriptors are always in the pool and cannot
be used, for tx, for instance. It's not correct resource usage,
better to set it to half of pool, then the rx pool can be used in
full. It will not have any impact on performance, as anyway, the
"redundant" descriptors were unused.
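
The arithmetic above can be sketched as follows (illustration only; demo_idle_rx_descs is a hypothetical helper, not a function in the driver). It assumes, as the text states, that half of the pool is reserved for rx channels.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of descriptors in the rx half of the pool
 * that stay idle when "rx_descs" caps rx usage below that half.
 */
static unsigned int demo_idle_rx_descs(unsigned int pool, unsigned int rx_descs)
{
	unsigned int rx_half = pool / 2;  /* pool is split equally between tx and rx */

	return rx_descs < rx_half ? rx_half - rx_descs : 0;
}
```

For a pool of 256 and rx_descs = 64 this gives 128 - 64 = 64 idle descriptors; with rx_descs equal to half the pool it gives 0.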

Signed-off-by: Ivan Khoronzhuk 
---

Based on master

 Documentation/devicetree/bindings/net/cpsw.txt |  3 ---
 arch/arm/boot/dts/am33xx.dtsi  |  1 -
 arch/arm/boot/dts/am4372.dtsi  |  1 -
 arch/arm/boot/dts/dm814x.dtsi  |  1 -
 arch/arm/boot/dts/dra7.dtsi|  1 -
 drivers/net/ethernet/ti/cpsw.c | 13 +++--
 drivers/net/ethernet/ti/cpsw.h |  1 -
 drivers/net/ethernet/ti/davinci_cpdma.c|  6 ++
 drivers/net/ethernet/ti/davinci_cpdma.h|  1 +
 9 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/cpsw.txt 
b/Documentation/devicetree/bindings/net/cpsw.txt
index 0ae0649..5fe6239 100644
--- a/Documentation/devicetree/bindings/net/cpsw.txt
+++ b/Documentation/devicetree/bindings/net/cpsw.txt
@@ -15,7 +15,6 @@ Required properties:
 - cpdma_channels   : Specifies number of channels in CPDMA
 - ale_entries  : Specifies No of entries ALE can hold
 - bd_ram_size  : Specifies internal descriptor RAM size
-- rx_descs : Specifies number of Rx descriptors
 - mac_control  : Specifies Default MAC control register content
  for the specific platform
 - slaves   : Specifies number for slaves
@@ -70,7 +69,6 @@ Examples:
ale_entries = <1024>;
bd_ram_size = <0x2000>;
no_bd_ram = <0>;
-   rx_descs = <64>;
mac_control = <0x20>;
slaves = <2>;
active_slave = <0>;
@@ -99,7 +97,6 @@ Examples:
ale_entries = <1024>;
bd_ram_size = <0x2000>;
no_bd_ram = <0>;
-   rx_descs = <64>;
mac_control = <0x20>;
slaves = <2>;
active_slave = <0>;
diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index 52be48b..702126f 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -766,7 +766,6 @@
ale_entries = <1024>;
bd_ram_size = <0x2000>;
no_bd_ram = <0>;
-   rx_descs = <64>;
mac_control = <0x20>;
slaves = <2>;
active_slave = <0>;
diff --git a/arch/arm/boot/dts/am4372.dtsi b/arch/arm/boot/dts/am4372.dtsi
index 12fcde4..a10fa7f 100644
--- a/arch/arm/boot/dts/am4372.dtsi
+++ b/arch/arm/boot/dts/am4372.dtsi
@@ -626,7 +626,6 @@
ale_entries = <1024>;
bd_ram_size = <0x2000>;
no_bd_ram = <0>;
-   rx_descs = <64>;
mac_control = <0x20>;
slaves = <2>;
active_slave = <0>;
diff --git a/arch/arm/boot/dts/dm814x.dtsi b/arch/arm/boot/dts/dm814x.dtsi
index d4537dc..f23cae0c 100644
--- a/arch/arm/boot/dts/dm814x.dtsi
+++ b/arch/arm/boot/dts/dm814x.dtsi
@@ -509,7 +509,6 @@
ale_entries = <1024>;
bd_ram_size = <0x2000>;
no_bd_ram = <0>;
-   rx_descs = <64>;
mac_control = <0x20>;
slaves = <2>;
active_slave = <0>;
diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index e007401..b7ddc64 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1626,7 +1626,6 @@
ale_entries = <1024>;
bd_ram_size = <0x2000>;
no_bd_ram = <0>;
-   rx_descs = <64>;
mac_control = <0x20>;
slaves = <2>;
active_slave = <0>;
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 4b08a2f..635be3e 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1277,6 +1277,7 @@ static int cpsw_ndo_open(struct net_device *ndev)
  ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0);
 
if (!cpsw_common_res_usage_state(priv)) {

[PATCH] net: ethernet: ti: cpsw: remove unused priv lock

2016-06-02 Thread Ivan Khoronzhuk

There is no need for this lock. At least for now.

Signed-off-by: Ivan Khoronzhuk 
---

Based on master

 drivers/net/ethernet/ti/cpsw.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 9919cb3..8d1d373 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -365,7 +365,6 @@ static inline void slave_write(struct cpsw_slave *slave, 
u32 val, u32 offset)
 }
 
 struct cpsw_priv {
-   spinlock_t  lock;
struct platform_device  *pdev;
struct net_device   *ndev;
struct napi_struct  napi_rx;
@@ -2413,7 +2412,6 @@ static int cpsw_probe_dual_emac(struct platform_device 
*pdev,
}
 
priv_sl2 = netdev_priv(ndev);
-   spin_lock_init(&priv_sl2->lock);
priv_sl2->data = *data;
priv_sl2->pdev = pdev;
priv_sl2->ndev = ndev;
@@ -2533,7 +2531,6 @@ static int cpsw_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, ndev);
priv = netdev_priv(ndev);
-   spin_lock_init(&priv->lock);
priv->pdev = pdev;
priv->ndev = ndev;
priv->dev  = &ndev->dev;
-- 
1.9.1

Re: Possible problem with e6afc8ac ("udp: remove headers from UDP packets before queueing")

2016-06-02 Thread David Miller

From: Eric Dumazet 
Date: Thu, 02 Jun 2016 14:52:43 -0700

> From: Eric Dumazet 
> 
> Paul Moore tracked a regression caused by a recent commit, which
> mistakenly assumed that sk_filter() could be avoided if socket
> had no current BPF filter.
> 
> The intent was to avoid udp_lib_checksum_complete() overhead.
> 
> But sk_filter() also checks skb_pfmemalloc() and
> security_sock_rcv_skb(), so better call it.
> 
> Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
> Signed-off-by: Eric Dumazet 
> Reported-by: Paul Moore 
> Tested-by: Paul Moore 
> Tested-by: Stephen Smalley 
> Cc: samanthakumar 

Applied, thanks Eric.
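
For readers following the thread, a simplified userspace model of the control flow before and after the fix (a sketch under stated assumptions, not kernel code). All demo_* names are hypothetical. The BPF program and the skb_pfmemalloc() check are not modelled; demo_sk_filter() only represents the LSM hook that sk_filter() runs regardless of whether a filter is attached.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_verdict { DEMO_QUEUE, DEMO_DROP, DEMO_CSUM_ERROR };

/* Hypothetical packet model. */
struct demo_pkt {
	bool csum_bad;    /* udp_lib_checksum_complete() would fail */
	bool lsm_denies;  /* security_sock_rcv_skb() would reject it */
};

/* Stand-in for sk_filter(): the LSM check applies with or without BPF. */
static bool demo_sk_filter(const struct demo_pkt *p)
{
	return p->lsm_denies;
}

/* Before the fix: sk_filter() is skipped when no filter is attached. */
static enum demo_verdict demo_rcv_old(const struct demo_pkt *p, bool has_filter)
{
	if (has_filter) {
		if (p->csum_bad)
			return DEMO_CSUM_ERROR;
		if (demo_sk_filter(p))
			return DEMO_DROP;
	}
	return DEMO_QUEUE;
}

/* After the fix: only the early checksum depends on the filter. */
static enum demo_verdict demo_rcv_new(const struct demo_pkt *p, bool has_filter)
{
	if (has_filter && p->csum_bad)
		return DEMO_CSUM_ERROR;
	if (demo_sk_filter(p))
		return DEMO_DROP;
	return DEMO_QUEUE;
}
```

In this model, a packet that the LSM would deny on a socket without a BPF filter is queued by the old flow and dropped by the new one, which matches the regression Paul reported.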

Re: [PATCH net-next] net: vrf: set operstate and mtu at link create

2016-06-02 Thread David Miller

From: David Ahern 
Date: Wed,  1 Jun 2016 21:16:39 -0700

> The VRF device exists to define L3 domains and guide FIB lookups. As
> such its operstate is not relevant. Seeing 'state UNKNOWN' in the
> output of 'ip link show' can be confusing, so set operstate at link
> create.
> 
> Similarly, the MTU for a VRF device is not used; any fragmentation
> of the payload is done on the output path based on the real egress
> device. An MTU of 1500 on the VRF device while enslaved devices
> have a higher MTU can lead to confusion. Since the VRF MTU is not
> relevant set to 64k similar to what is done for loopback.
> 
> Signed-off-by: David Ahern 

Applied, thanks.

Re: [PATCH net-next 2/2] net: vrf: Add l3mdev rules on first device create

2016-06-02 Thread David Miller

From: David Ahern 
Date: Wed,  1 Jun 2016 21:14:54 -0700

> Add l3mdev rule per address family when the first VRF device is
> created. Remove them when the last is deleted.
> 
> Signed-off-by: David Ahern 
> ---
>  drivers/net/vrf.c | 114 
> +-
>  1 file changed, 113 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
> index dff08842f26d..b3cb80e84ea7 100644
> --- a/drivers/net/vrf.c
> +++ b/drivers/net/vrf.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define RT_FL_TOS(oldflp4) \
>   ((oldflp4)->flowi4_tos & (IPTOS_RT_MASK | RTO_ONLINK))
> @@ -42,6 +43,11 @@
>  #define DRV_NAME "vrf"
>  #define DRV_VERSION  "1.0"
>  
> +static atomic_t num_vrfs;
> +
> +static u32 rule_pref = 1000;
> +module_param(rule_pref, uint,  S_IRUGO);
> +
>  struct net_vrf {
>   struct rtable __rcu *rth;
>   struct rt6_info __rcu   *rt6;
> @@ -723,6 +729,93 @@ static const struct ethtool_ops vrf_ethtool_ops = {
>   .get_drvinfo= vrf_get_drvinfo,
>  };
>  
> +static inline size_t vrf_fib_rule_nl_size(void)
> +{
> + size_t sz;
> +
> + sz  = NLMSG_ALIGN(sizeof(struct fib_rule_hdr));
> + sz += nla_total_size(sizeof(u8));   /* FRA_L3MDEV */
> + sz += nla_total_size(sizeof(u32));  /* FRA_PRIORITY */
> +
> + return sz;
> +}
> +
> +static int vrf_fib_rule(const struct net_device *dev, __u8 family, bool 
> add_it)
 ...
> +static void vrf_del_fib_rules(const struct net_device *dev)
> +{
> + if (vrf_fib_rule(dev, AF_INET,  0) ||
> + vrf_fib_rule(dev, AF_INET6, 0)) {
 ...
> +static int vrf_add_fib_rules(const struct net_device *dev)
> +{
> + int err;
> +
> + err = vrf_fib_rule(dev, AF_INET,  1);
> + if (err < 0)
> + goto out_err;
> +
> + err = vrf_fib_rule(dev, AF_INET6, 1);

Since the third arg to vrf_fib_rule() is a bool, pass true/false.

Re: [PATCH net-next 1/2] net: Add l3mdev rule

2016-06-02 Thread David Miller

From: David Ahern 
Date: Wed,  1 Jun 2016 21:14:53 -0700

> @@ -76,6 +76,7 @@ static int fib4_rule_action(struct fib_rule *rule, struct 
> flowi *flp,
>  {
>   int err = -EAGAIN;
>   struct fib_table *tbl;
> + u32 tb_id = fib_rule_get_table(rule, arg);

Please order local variable lines from longest to shortest.

Re: [net-next] ovs: set name assign type of internal port

2016-06-02 Thread David Miller

From: Zhang Shengju 
Date: Tue, 31 May 2016 13:41:02 +

> Set name_assign_type of internal port to NET_NAME_USER.
> 
> Signed-off-by: Zhang Shengju 

Applied, thanks.

Re: [PATCH net-next v10 2/5] openvswitch: set skb protocol and mac_len when receiving on internal device

2016-06-02 Thread pravin shelar

On Wed, Jun 1, 2016 at 11:24 PM, Simon Horman
 wrote:
> * Set skb protocol based on contents of packet. I have observed this is
>   necessary to get actual protocol of a packet when it is injected into an
>   internal device e.g. by libnet in which case skb protocol will be set to
>   ETH_ALL.
>
> * Set the mac_len which has been observed to not be set up correctly when
>   an ARP packet is generated and sent via an openvswitch bridge.
>   My test case is a scenario where there are two Open vSwitch bridges.
>   One outputs to a tunnel port which egresses on the other.
>
> The motivation for this is that support for outputting to layer 3 (non-tap)
> GRE tunnels as implemented by a subsequent patch depends on protocol and
> mac_len being set correctly on receive.
>
> Signed-off-by: Simon Horman 
>
> ---
> v10
> * Set mac_len
>
> v9
> * New patch
> ---
>  net/openvswitch/vport-internal_dev.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/openvswitch/vport-internal_dev.c 
> b/net/openvswitch/vport-internal_dev.c
> index 2ee48e447b72..f89b1efa88f1 100644
> --- a/net/openvswitch/vport-internal_dev.c
> +++ b/net/openvswitch/vport-internal_dev.c
> @@ -48,6 +48,10 @@ static int internal_dev_xmit(struct sk_buff *skb, struct 
> net_device *netdev)
>  {
> int len, err;
>
> +   skb->protocol = eth_type_trans(skb, netdev);
> +   skb_push(skb, ETH_HLEN);
> +   skb_reset_mac_len(skb);
> +
resetting mac-len breaks the assumption about mac_len for referencing
MPLS header ref: skb_mpls_header().

Re: [PATCH net-next v10 4/5] openvswitch: add layer 3 flow/port support

2016-06-02 Thread pravin shelar

On Wed, Jun 1, 2016 at 11:24 PM, Simon Horman
 wrote:
> From: Lorand Jakab 
>
> Implementation of the pop_eth and push_eth actions in the kernel, and
> layer 3 flow support.
>
> This doesn't actually do anything yet as no layer 2 tunnel ports are
> supported yet. The original patch by Lorand was against the Open vSwitch
> tree which has L2 LISP tunnels but that is not supported in mainline Linux.
> I (Simon) plan to follow up with support for non-TEB GRE ports based on
> work by Thomas Morin.
>
> Cc: Thomas Morin 
> Signed-off-by: Lorand Jakab 
> Signed-off-by: Simon Horman 
>
> ---
> v10 [Simon Horman]
> * Move outermost VLAN into skb metadata in pop_eth and
>   leave any VLAN as-is in push_eth. The effect is to allow the presence
>   of a vlan to be independent of pushing and popping ethernet headers.
> * Omit unnecessary type field from push_eth action
> * Squash with the following patches to make a more complete patch:
>   "openvswitch: add layer 3 support to ovs_packet_cmd_execute()"
>   "openvswitch: extend layer 3 support to cover non-IP packets"
>
> v9 [Simon Horman]
> * Rebase
> * Minor coding style updates
> * Prohibit push/pop MPLS on l3 packets
> * There are no layer 3 ports supported at this time so only
>   send and receive layer 2 packets: that is don't actually
>   use this new infrastructure yet
> * Expect that vports that can handle layer 3 packets will: have
>   a type other than ARPHRD_IPETHER; can also handle layer 2 packets;
>   and that packets can be differentiated by layer 2 packets having
>   skb->protocol set to htons(ETH_P_TEB)
>
> v1 - v8 [Lorand Jakub]
>
> wip: fix: openvswitch: add support to push and pop
>
> * Consistently use skb_hdr() in push_eth() by assigning
>   its value to a local variable.
> * Limit scope of hdr in push_mpls()
> * Recalculate csum for protocol change in push_mpls.
>   - Also needed for pop_mpls?
>   - Break out into a fix-patch
>
> Signed-off-by: Simon Horman 
...

> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 15f130e4c22b..5567529904fa 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -300,6 +300,51 @@ static int set_eth_addr(struct sk_buff *skb, struct 
> sw_flow_key *flow_key,
> return 0;
>  }
>
> +static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
> +{
> +   /* Pop outermost VLAN tag to skb metadata unless a VLAN tag
> +* is already present there.
> +*/
> +   if ((skb->protocol == htons(ETH_P_8021Q) ||
> +skb->protocol == htons(ETH_P_8021AD)) &&
> +   !skb_vlan_tag_present(skb)) {
> +   int err = skb_vlan_accel(skb);
> +   if (unlikely(err))
> +   return err;
> +   }
> +
I do not think we can keep just the vlan tag and pop the ethernet header.
There are multiple issues with this.
First, the networking stack can not handle such a packet. Second, even
after this patch OVS can not parse this type of packet. Third, this
patch does not allow the pop-eth action on a vlan tagged packet.
There are already separate vlan related actions in OVS, so let's keep it simple.

> +   skb_pull_rcsum(skb, ETH_HLEN);
> +   skb_reset_mac_header(skb);
> +   skb->mac_len -= ETH_HLEN;
> +
> +   invalidate_flow_key(key);
> +   return 0;
> +}
> +
...
...
> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index 0ea128eeeab2..2d9777abcfc9 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
> @@ -468,28 +468,31 @@ static int key_extract(struct sk_buff *skb, struct 
> sw_flow_key *key)
>
> skb_reset_mac_header(skb);
>
> -   /* Link layer.  We are guaranteed to have at least the 14 byte 
> Ethernet
> -* header in the linear data area.
> -*/
> -   eth = eth_hdr(skb);
> -   ether_addr_copy(key->eth.src, eth->h_source);
> -   ether_addr_copy(key->eth.dst, eth->h_dest);
> +   /* Link layer. */
> +   key->eth.tci = 0;
> +   if (key->phy.is_layer3) {
> +   if (skb_vlan_tag_present(skb))
> +   key->eth.tci = htons(skb->vlan_tci);
> +   } else {
> +   eth = eth_hdr(skb);
eth can be moved to this block.

> +   ether_addr_copy(key->eth.src, eth->h_source);
> +   ether_addr_copy(key->eth.dst, eth->h_dest);
>
> -   __skb_pull(skb, 2 * ETH_ALEN);
> -   /* We are going to push all headers that we pull, so no need to
> -* update skb->csum here.
> -*/
> +   __skb_pull(skb, 2 * ETH_ALEN);
> +   /* We are going to push all headers that we pull, so no need 
> to
> +* update skb->csum here.
> +*/
>
> -   key->eth.tci = 0;
> -   if (skb_vlan_tag_present(skb))
> -   key->eth.tci = htons(skb->vlan_tci);
> -   else if

Re: [PATCH net-next v10 3/5] openvswitch: add support to push and pop mpls for layer3 packets

2016-06-02 Thread pravin shelar

On Wed, Jun 1, 2016 at 11:24 PM, Simon Horman
 wrote:
> Allow push and pop mpls actions to act on layer 3 packets by teaching
> them not to access non-existent L2 headers of such packets.
>
> Signed-off-by: Simon Horman 
> ---
> v10
> * Limit scope of hdr in {push,pop}_mpls()
>
> v9
> * New Patch
> ---
>  net/openvswitch/actions.c | 19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 9a3eb7a0ebf4..15f130e4c22b 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -172,7 +172,8 @@ static int push_mpls(struct sk_buff *skb, struct 
> sw_flow_key *key,
>
> skb_postpush_rcsum(skb, new_mpls_lse, MPLS_HLEN);
>
> -   update_ethertype(skb, eth_hdr(skb), mpls->mpls_ethertype);
> +   if (skb->mac_len)
> +   update_ethertype(skb, eth_hdr(skb), mpls->mpls_ethertype);
We can move all ethernet related code into this if block, for example memmove().

> if (!skb->inner_protocol)
> skb_set_inner_protocol(skb, skb->protocol);
> skb->protocol = mpls->mpls_ethertype;
> @@ -184,7 +185,6 @@ static int push_mpls(struct sk_buff *skb, struct 
> sw_flow_key *key,
>  static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
> const __be16 ethertype)
>  {
> -   struct ethhdr *hdr;
> int err;
>
> err = skb_ensure_writable(skb, skb->mac_len + MPLS_HLEN);
> @@ -199,11 +199,16 @@ static int pop_mpls(struct sk_buff *skb, struct 
> sw_flow_key *key,
> __skb_pull(skb, MPLS_HLEN);
> skb_reset_mac_header(skb);
>
> -   /* skb_mpls_header() is used to locate the ethertype
> -* field correctly in the presence of VLAN tags.
> -*/
> -   hdr = (struct ethhdr *)(skb_mpls_header(skb) - ETH_HLEN);
> -   update_ethertype(skb, hdr, ethertype);
> +   if (skb->mac_len) {
> +   struct ethhdr *hdr;
> +
> +   /* skb_mpls_header() is used to locate the ethertype
> +* field correctly in the presence of VLAN tags.
> +*/
> +   hdr = (struct ethhdr *)(skb_mpls_header(skb) - ETH_HLEN);
> +   update_ethertype(skb, hdr, ethertype);
> +   }
same here.

Re: [PATCH net-next v10 1/5] net: add skb_vlan_accel helper

2016-06-02 Thread pravin shelar

On Wed, Jun 1, 2016 at 11:24 PM, Simon Horman
 wrote:
> This breaks out some of skb_vlan_pop into a separate helper.
> This new helper moves the outer-most vlan tag present in packet data
> into metadata.
>
> The motivation is to allow acceleration VLAN tags without adding a new
> one. This is in preparation for a push ethernet header support in Open
> vSwitch.
>
> Signed-off-by: Simon Horman 
>

I am not sure we need this function at this point. I will post comment
on patch 4 where it is used.

Re: Possible problem with e6afc8ac ("udp: remove headers from UDP packets before queueing")

2016-06-02 Thread Eric Dumazet

From: Eric Dumazet 

Paul Moore tracked a regression caused by a recent commit, which
mistakenly assumed that sk_filter() could be avoided if socket
had no current BPF filter.

The intent was to avoid udp_lib_checksum_complete() overhead.

But sk_filter() also checks skb_pfmemalloc() and
security_sock_rcv_skb(), so better call it.

Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet 
Reported-by: Paul Moore 
Tested-by: Paul Moore 
Tested-by: Stephen Smalley 
Cc: samanthakumar 
---
 net/ipv4/udp.c |   10 +-
 net/ipv6/udp.c |   12 ++--
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index d56c0559b477..0ff31d97d485 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1618,12 +1618,12 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff 
*skb)
}
}
 
-   if (rcu_access_pointer(sk->sk_filter)) {
-   if (udp_lib_checksum_complete(skb))
+   if (rcu_access_pointer(sk->sk_filter) &&
+   udp_lib_checksum_complete(skb))
goto csum_error;
-   if (sk_filter(sk, skb))
-   goto drop;
-   }
+
+   if (sk_filter(sk, skb))
+   goto drop;
 
udp_csum_pull_header(skb);
if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 2da1896af934..f421c9f23c5b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -653,12 +653,12 @@ int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff 
*skb)
}
}
 
-   if (rcu_access_pointer(sk->sk_filter)) {
-   if (udp_lib_checksum_complete(skb))
-   goto csum_error;
-   if (sk_filter(sk, skb))
-   goto drop;
-   }
+   if (rcu_access_pointer(sk->sk_filter) &&
+   udp_lib_checksum_complete(skb))
+   goto csum_error;
+
+   if (sk_filter(sk, skb))
+   goto drop;
 
udp_csum_pull_header(skb);
if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {

Re: Possible problem with e6afc8ac ("udp: remove headers from UDP packets before queueing")

2016-06-02 Thread Eric Dumazet

On Thu, 2016-06-02 at 17:36 -0400, Paul Moore wrote:
> On Wed, Jun 1, 2016 at 4:44 PM, Stephen Smalley  wrote:
> > On 06/01/2016 03:18 PM, Eric Dumazet wrote:
> >> On Wed, 2016-06-01 at 15:01 -0400, Paul Moore wrote:
> >>> Hello,
> >>>
> >>> I'm currently trying to debug a problem with 4.7-rc1 and labeled
> >>> networking over UDP.  I'm having some difficulty with the latest
> >>> 4.7-rc1 builds on my test system at the moment so I haven't been able
> >>> to concisely identify the problem, but looking through the commits in
> >>> 4.7-rc1 I think there may be a problem with the following:
> >>>
> >>>   commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac
> >>>   Author: samanthakumar 
> >>>   Date:   Tue Apr 5 12:41:15 2016 -0400
> >>>
> >>>udp: remove headers from UDP packets before queueing
> >>>
> >>>Remove UDP transport headers before queueing packets for reception.
> >>>This change simplifies a follow-up patch to add MSG_PEEK support.
> >>>
> >>>Signed-off-by: Sam Kumar 
> >>>Signed-off-by: Willem de Bruijn 
> >>>Signed-off-by: David S. Miller 
> >>>
> >>> ... it appears that this commit changes things so that sk_filter() is
> >>> only called when sk->sk_filter is not NULL.  While this is fine for
> >>> the traditional socket filter case, it causes problems with LSMs that
> >>> make use of security_sock_rcv_skb() to enforce per-packet access
> >>> controls.
> >>>
> >>> Hopefully I'll get 4.7-rc1 booting soon and I can do a proper
> >>> bisection test around this patch, but I wanted to mention this now in
> >>> case others are seeing the same problem.
> >>
> >> Thanks for the report. Please try following fix.
> >>
> >> sk_filter() got additional features like the skb_pfmemalloc() things and
> >> security_sock_rcv_skb()
> >
> > This resolved the SELinux regression for me.
> >
> > Tested-by: Stephen Smalley 
> 
> The patch works for me too.  Eric, are you going to send this to DaveM
> (assuming he isn't listening in on this thread and picking it up
> himself)?
> 
> Tested-by: Paul Moore 

I am going to send the official patch right away, thanks !

Re: Possible problem with e6afc8ac ("udp: remove headers from UDP packets before queueing")

2016-06-02 Thread Paul Moore

On Wed, Jun 1, 2016 at 4:44 PM, Stephen Smalley  wrote:
> On 06/01/2016 03:18 PM, Eric Dumazet wrote:
>> On Wed, 2016-06-01 at 15:01 -0400, Paul Moore wrote:
>>> Hello,
>>>
>>> I'm currently trying to debug a problem with 4.7-rc1 and labeled
>>> networking over UDP.  I'm having some difficulty with the latest
>>> 4.7-rc1 builds on my test system at the moment so I haven't been able
>>> to concisely identify the problem, but looking through the commits in
>>> 4.7-rc1 I think there may be a problem with the following:
>>>
>>>   commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac
>>>   Author: samanthakumar 
>>>   Date:   Tue Apr 5 12:41:15 2016 -0400
>>>
>>>udp: remove headers from UDP packets before queueing
>>>
>>>Remove UDP transport headers before queueing packets for reception.
>>>This change simplifies a follow-up patch to add MSG_PEEK support.
>>>
>>>Signed-off-by: Sam Kumar 
>>>Signed-off-by: Willem de Bruijn 
>>>Signed-off-by: David S. Miller 
>>>
>>> ... it appears that this commit changes things so that sk_filter() is
>>> only called when sk->sk_filter is not NULL.  While this is fine for
>>> the traditional socket filter case, it causes problems with LSMs that
>>> make use of security_sock_rcv_skb() to enforce per-packet access
>>> controls.
>>>
>>> Hopefully I'll get 4.7-rc1 booting soon and I can do a proper
>>> bisection test around this patch, but I wanted to mention this now in
>>> case others are seeing the same problem.
>>
>> Thanks for the report. Please try following fix.
>>
>> sk_filter() got additional features like the skb_pfmemalloc() things and
>> security_sock_rcv_skb()
>
> This resolved the SELinux regression for me.
>
> Tested-by: Stephen Smalley 

The patch works for me too.  Eric, are you going to send this to DaveM
(assuming he isn't listening in on this thread and picking it up
himself)?

Tested-by: Paul Moore 

>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>> index d56c0559b477..0ff31d97d485 100644
>> --- a/net/ipv4/udp.c
>> +++ b/net/ipv4/udp.c
>> @@ -1618,12 +1618,12 @@ int udp_queue_rcv_skb(struct sock *sk, struct 
>> sk_buff *skb)
>>   }
>>   }
>>
>> - if (rcu_access_pointer(sk->sk_filter)) {
>> - if (udp_lib_checksum_complete(skb))
>> + if (rcu_access_pointer(sk->sk_filter) &&
>> + udp_lib_checksum_complete(skb))
>>   goto csum_error;
>> - if (sk_filter(sk, skb))
>> - goto drop;
>> - }
>> +
>> + if (sk_filter(sk, skb))
>> + goto drop;
>>
>>   udp_csum_pull_header(skb);
>>   if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
>> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
>> index 2da1896af934..f421c9f23c5b 100644
>> --- a/net/ipv6/udp.c
>> +++ b/net/ipv6/udp.c
>> @@ -653,12 +653,12 @@ int udpv6_queue_rcv_skb(struct sock *sk, struct 
>> sk_buff *skb)
>>   }
>>   }
>>
>> - if (rcu_access_pointer(sk->sk_filter)) {
>> - if (udp_lib_checksum_complete(skb))
>> - goto csum_error;
>> - if (sk_filter(sk, skb))
>> - goto drop;
>> - }
>> + if (rcu_access_pointer(sk->sk_filter) &&
>> + udp_lib_checksum_complete(skb))
>> + goto csum_error;
>> +
>> + if (sk_filter(sk, skb))
>> + goto drop;
>>
>>   udp_csum_pull_header(skb);
>>   if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
>>
>>

-- 
paul moore
www.paul-moore.com
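
A minimal user-space sketch of the control flow discussed in this thread,
assuming simplified stand-ins for the kernel objects. The names model_sock,
model_skb, model_sk_filter() and the two model_queue_rcv_*() helpers are
hypothetical and exist only for illustration; only the shape of the if blocks
is taken from the quoted diff. It shows why a socket with no filter attached
skipped the LSM check before the fix.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel objects; not the real definitions. */
struct model_sock {
	bool has_filter;	/* models sk->sk_filter != NULL */
	bool lsm_denies;	/* models security_sock_rcv_skb() failing */
};

struct model_skb {
	bool csum_ok;		/* models udp_lib_checksum_complete() == 0 */
};

enum model_verdict { MODEL_QUEUED, MODEL_DROPPED, MODEL_CSUM_ERROR };

/* Models sk_filter(): the security hook runs even with no filter attached. */
int model_sk_filter(const struct model_sock *sk, const struct model_skb *skb)
{
	(void)skb;
	if (sk->lsm_denies)
		return -1;
	/* An attached socket filter could also reject here; the model accepts. */
	return 0;
}

/* Before the fix: everything is skipped when no filter is attached. */
enum model_verdict model_queue_rcv_before(const struct model_sock *sk,
					  const struct model_skb *skb)
{
	if (sk->has_filter) {
		if (!skb->csum_ok)
			return MODEL_CSUM_ERROR;
		if (model_sk_filter(sk, skb))
			return MODEL_DROPPED;
	}
	return MODEL_QUEUED;
}

/* After the fix: only the early checksum test depends on the filter. */
enum model_verdict model_queue_rcv_after(const struct model_sock *sk,
					 const struct model_skb *skb)
{
	if (sk->has_filter && !skb->csum_ok)
		return MODEL_CSUM_ERROR;
	if (model_sk_filter(sk, skb))
		return MODEL_DROPPED;
	return MODEL_QUEUED;
}
```

With has_filter false and lsm_denies true, the "before" helper queues the
packet while the "after" helper drops it, which is the regression the testers
reported as resolved.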

[PATCH v2 -next] virtio-net: Add initial MTU advice feature

2016-06-02 Thread Aaron Conole

This commit adds the feature bit and associated mtu device entry for the
virtio network device.  When a virtio device comes up, it checks the
feature bit for the VIRTIO_NET_F_MTU feature.  If such feature bit is
enabled, the driver will read the advised MTU and use it as the initial
value.

Signed-off-by: Aaron Conole 
---
v1->v2:
* Fixed omitted hunk from virtio_net.h
* Squashed to a single commit
* Fixed commit message.

 drivers/net/virtio_net.c| 7 +++
 include/uapi/linux/virtio_net.h | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e0638e5..ef5ee01 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1896,6 +1896,12 @@ static int virtnet_probe(struct virtio_device *vdev)
if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
vi->has_cvq = true;
 
+   if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
+   dev->mtu = virtio_cread16(vdev,
+ offsetof(struct virtio_net_config,
+  mtu));
+   }
+
if (vi->any_header_sg)
dev->needed_headroom = vi->hdr_len;
 
@@ -2067,6 +2073,7 @@ static unsigned int features[] = {
VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ,
VIRTIO_NET_F_CTRL_MAC_ADDR,
VIRTIO_F_ANY_LAYOUT,
+   VIRTIO_NET_F_MTU,
 };
 
 static struct virtio_driver virtio_net_driver = {
diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index ec32293..1ab4ea6 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -55,6 +55,7 @@
 #define VIRTIO_NET_F_MQ 22  /* Device supports Receive Flow
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
+#define VIRTIO_NET_F_MTU 25/* Initial MTU advice */
 
 #ifndef VIRTIO_NET_NO_LEGACY
 #define VIRTIO_NET_F_GSO   6   /* Host handles pkts w/ any GSO type */
@@ -73,6 +74,8 @@ struct virtio_net_config {
 * Legal values are between 1 and 0x8000
 */
__u16 max_virtqueue_pairs;
+   /* Default maximum transmit unit advice */
+   __u16 mtu;
 } __attribute__((packed));
 
 /*
-- 
2.5.5
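
The probe-time logic in this patch can be sketched in user space as follows.
This is only a model under stated assumptions: model_net_config mirrors the
tail of the config structure shown in the hunk (the mac and status fields
ahead of max_virtqueue_pairs are assumed), the config space is treated as a
little-endian byte array, and model_has_feature(), model_cread16() and
model_initial_mtu() are hypothetical helpers, not the virtio API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_F_MTU 25	/* bit number taken from the patch */

/* Assumed layout; only max_virtqueue_pairs and mtu appear in the hunk. */
struct model_net_config {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;		/* field added by the patch */
} __attribute__((packed));

/* True when the negotiated feature word has the given bit set. */
bool model_has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

/* Read a 16-bit little-endian value from raw config bytes. */
uint16_t model_cread16(const uint8_t *cfg, size_t offset)
{
	return (uint16_t)(cfg[offset] | (cfg[offset + 1] << 8));
}

/* Use the advised MTU only when the feature bit was negotiated. */
unsigned int model_initial_mtu(uint64_t features, const uint8_t *cfg,
			       unsigned int default_mtu)
{
	if (model_has_feature(features, MODEL_F_MTU))
		return model_cread16(cfg,
				     offsetof(struct model_net_config, mtu));
	return default_mtu;
}
```

Without the feature bit the model falls back to the caller's default, which
matches the patch leaving dev->mtu untouched in that case.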

Re: [PATCH v2 5/6] ethernet/intel: Use pci_(request|release)_mem_regions

2016-06-02 Thread Jeff Kirsher

On Thu, 2016-06-02 at 09:30 +0200, Johannes Thumshirn wrote:
> Now that we do have pci_request_mem_regions() and
> pci_release_mem_regions() at
> hand, use it in the Intel ethernet drivers.
> 
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Johannes Thumshirn 
> Cc: Jeff Kirsher 
> Cc: David S. Miller 
> Cc: netdev@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: intel-wired-...@lists.osuosl.org

Acked-by: Jeff Kirsher 

> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c    |  6 ++
>  drivers/net/ethernet/intel/fm10k/fm10k_pci.c  | 11 +++
>  drivers/net/ethernet/intel/i40e/i40e_main.c   |  9 +++--
>  drivers/net/ethernet/intel/igb/igb_main.c | 10 +++---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  9 +++--
>  5 files changed, 14 insertions(+), 31 deletions(-)


signature.asc
Description: This is a digitally signed message part

[PATCH net-next] net: disable fragment reassembly if high_thresh is zero

2016-06-02 Thread Michal Kubecek

Before commit 6d7b857d541e ("net: use lib/percpu_counter API for
fragmentation mem accounting"), setting the reassembly high threshold
to 0 prevented fragment reassembly as first fragment would be always
evicted before second could be added to the queue. While inefficient,
some users apparently relied on this method.

Since the commit mentioned above, a percpu counter is used for
reassembly memory accounting and high batch size avoids taking slow path
in most common scenarios. As a result, a whole full sized packet can be
reassembled without the percpu counter's main counter changing its value
so that even with high_thresh set to 0, fragmented packets can be still
reassembled and processed.

Add explicit check preventing reassembly if high threshold is zero.

Signed-off-by: Michal Kubecek 
---
 net/ipv4/inet_fragment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 3a88b0c73797..b5e9317eaf9e 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -355,7 +355,7 @@ static struct inet_frag_queue *inet_frag_alloc(struct 
netns_frags *nf,
 {
struct inet_frag_queue *q;
 
-   if (frag_mem_limit(nf) > nf->high_thresh) {
+   if (!nf->high_thresh || frag_mem_limit(nf) > nf->high_thresh) {
inet_frag_schedule_worker(f);
return NULL;
}
-- 
2.8.3
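
A rough user-space model of the effect described in the commit message,
assuming a single CPU and a batched counter whose cheap read only sees the
folded global value. The batch size, model_counter and the model_may_alloc_*()
helpers are hypothetical and chosen for illustration; the two conditions
themselves are copied from the hunk.

```c
#include <stdbool.h>

#define MODEL_BATCH 130000	/* assumed batch size, for illustration only */

struct model_counter {
	long global;	/* value seen by a cheap read */
	long local;	/* this CPU's delta, not yet folded in */
};

/* Accumulate locally; fold into the global value once the batch is hit. */
void model_counter_add(struct model_counter *c, long amount)
{
	c->local += amount;
	if (c->local >= MODEL_BATCH || c->local <= -MODEL_BATCH) {
		c->global += c->local;
		c->local = 0;
	}
}

long model_counter_read(const struct model_counter *c)
{
	return c->global;
}

/* Old check: reject only when the approximate count exceeds the limit. */
bool model_may_alloc_old(const struct model_counter *mem, long high_thresh)
{
	return !(model_counter_read(mem) > high_thresh);
}

/* New check: a zero threshold rejects unconditionally. */
bool model_may_alloc_new(const struct model_counter *mem, long high_thresh)
{
	return !(!high_thresh || model_counter_read(mem) > high_thresh);
}
```

After adding one full-size packet's worth of memory (below the batch), the
cheap read still returns 0, so the old check keeps allowing new queues with
high_thresh set to 0 while the new check refuses them.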

[PATCH net-next 2/3] net: vrf: ipv4 support for local traffic to local addresses

2016-06-02 Thread David Ahern

Add support for locally originated traffic to VRF-local addresses. If
destination device for an skb is the loopback or VRF device then set
its dst to a local version of the VRF cached dst_entry and call netif_rx
to insert the packet onto the rx queue - similar to what is done for
loopback. This patch handles IPv4 support; follow on patch handles IPv6.

With this patch, ping, tcp and udp packets to a local IPv4 address are
successfully routed:

$ ip addr show dev eth1
4: eth1:  mtu 1500 qdisc pfifo_fast master 
red state UP group default qlen 1000
link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
   valid_lft forever preferred_lft forever
inet6 2100:1::1/120 scope global
   valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
   valid_lft forever preferred_lft forever

$ ping -c1 -I red 10.100.1.1
ping: Warning: source address might be selected on device other than red.
PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms

This patch also enables use of IPv4 loopback address on the VRF device:
$ ip addr add dev red 127.0.0.1/8

$ ping -c1 -I red 127.0.0.1
PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms

Signed-off-by: David Ahern 
---
 drivers/net/vrf.c | 100 --
 1 file changed, 98 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index d678aaeba572..7df065456893 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -50,6 +50,7 @@ module_param(rule_pref, uint,  S_IRUGO);
 
 struct net_vrf {
struct rtable __rcu *rth;
+   struct rtable __rcu *rth_local;
struct rt6_info __rcu   *rt6;
u32 tb_id;
 };
@@ -60,9 +61,20 @@ struct pcpu_dstats {
u64 tx_drps;
u64 rx_pkts;
u64 rx_bytes;
+   u64 rx_drps;
struct u64_stats_sync   syncp;
 };
 
+static void vrf_rx_stats(struct net_device *dev, int len)
+{
+   struct pcpu_dstats *dstats = this_cpu_ptr(dev->dstats);
+
+   u64_stats_update_begin(&dstats->syncp);
+   dstats->rx_pkts++;
+   dstats->rx_bytes += len;
+   u64_stats_update_end(&dstats->syncp);
+}
+
 static void vrf_tx_error(struct net_device *vrf_dev, struct sk_buff *skb)
 {
vrf_dev->stats.tx_errors++;
@@ -97,6 +109,34 @@ static struct rtnl_link_stats64 *vrf_get_stats64(struct 
net_device *dev,
return stats;
 }
 
+/* Local traffic destined to local address. Reinsert the packet to rx
+ * path, similar to loopback handling.
+ */
+static int vrf_local_xmit(struct sk_buff *skb, struct net_device *dev,
+ struct dst_entry *dst)
+{
+   int len = skb->len;
+
+   skb_orphan(skb);
+
+   skb_dst_set(skb, dst);
+   skb_dst_force(skb);
+
+   /* set pkt_type to avoid skb hitting packet taps twice -
+* once on Tx and again in Rx processing
+*/
+   skb->pkt_type = PACKET_LOOPBACK;
+
+   skb->protocol = eth_type_trans(skb, dev);
+
+   if (likely(netif_rx(skb) == NET_RX_SUCCESS))
+   vrf_rx_stats(dev, len);
+   else
+   this_cpu_inc(dev->dstats->rx_drps);
+
+   return NETDEV_TX_OK;
+}
+
 #if IS_ENABLED(CONFIG_IPV6)
 static netdev_tx_t vrf_process_v6_outbound(struct sk_buff *skb,
   struct net_device *dev)
@@ -175,6 +215,34 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff 
*skb,
}
 
skb_dst_drop(skb);
+
+   /* if dst.dev is loopback or the VRF device again this is locally
+* originated traffic destined to a local address. Short circuit
+* to Rx path using our local dst
+*/
+   if (rt->dst.dev == net->loopback_dev || rt->dst.dev == vrf_dev) {
+   struct net_vrf *vrf = netdev_priv(vrf_dev);
+   struct rtable *rth_local;
+   struct dst_entry *dst = NULL;
+
+   ip_rt_put(rt);
+
+   rcu_read_lock();
+
+   rth_local = rcu_dereference(vrf->rth_local);
+   if (likely(rth_local)) {
+   dst = &rth_local->dst;
+   dst_hold(dst);
+   }
+
+   rcu_read_unlock();
+
+   if (unlikely(!dst))
+   goto err;
+
+   return vrf_local_xmit(skb, vrf_dev, dst);
+   }
+
skb_dst_set(skb, &rt->dst);
 
/* strip the ethernet header added for pass through VRF device */
@@ -381,29 +449,48 @@ static int vrf_output(struct net *net, struct sock *sk, 
struct sk_buff *skb)
 static void

[PATCH net-next 0/3] net: vrf: Add support for local traffic to local addresses

2016-06-02 Thread David Ahern

Add support for locally originated traffic to VRF-local addresses,
be it addresses on enslaved devices or addresses on the VRF device:

$ ip addr show dev red
33: red:  mtu 65536 qdisc pfifo_fast state UP group 
default qlen 1000
link/ether be:00:53:b5:e4:25 brd ff:ff:ff:ff:ff:ff
inet 1.1.1.1/32 scope global red
   valid_lft forever preferred_lft forever
inet6 :1::1/128 scope global
   valid_lft forever preferred_lft forever

$ ip addr show dev eth1
3: eth1:  mtu 1500 qdisc pfifo_fast master red 
state UP group default qlen 1000
link/ether 02:e0:f9:79:34:bd brd ff:ff:ff:ff:ff:ff
inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
   valid_lft forever preferred_lft forever
inet6 2100:1::1/120 scope global
   valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
   valid_lft forever preferred_lft forever

$ ping -c1 -I red 10.100.1.1
ping: Warning: source address might be selected on device other than red.
PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data.
64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms

$ ping -c1 -I red 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from 1.1.1.1 red: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.136 ms

--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.136/0.136/0.136/0.000 ms

$ ping6 -c1 -I red  2100:1::1
ping6: Warning: source address might be selected on device other than red.
PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.167 ms

--- 2100:1::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.167/0.167/0.167/0.000 ms

$ ping6 -c1 -I red ::1
PING ::1(::1) from :1::1 red: 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.187 ms

--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.187/0.187/0.187/0.000 ms

This change also enables use of loopback address on the VRF device:
$ ip addr add dev red 127.0.0.1/8

$ ping -c1 -I red 127.0.0.1
PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms

David Ahern (3):
  net: vrf: Minor refactoring for local address patches
  net: vrf: ipv4 support for local traffic to local addresses
  net: vrf: ipv6 support for local traffic to local addresses

 drivers/net/vrf.c | 234 ++
 1 file changed, 201 insertions(+), 33 deletions(-)

-- 
2.1.4
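
The decision made in the IPv4 patch of this series (2/3) can be modelled in a
few lines of plain C. This is a sketch under assumptions: model_netdev,
model_stats, model_is_local_dst() and model_local_xmit() are hypothetical
stand-ins, and the rx_accepted flag stands in for the result of netif_rx().
Only the loopback-or-VRF comparison and the three counters come from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct model_netdev {
	int ifindex;
};

/* Counters named after the pcpu_dstats fields touched by the patch. */
struct model_stats {
	unsigned long rx_pkts;
	unsigned long rx_bytes;
	unsigned long rx_drps;
};

/* Local traffic: the route points at loopback or back at the VRF device. */
bool model_is_local_dst(const struct model_netdev *dst_dev,
			const struct model_netdev *loopback_dev,
			const struct model_netdev *vrf_dev)
{
	return dst_dev == loopback_dev || dst_dev == vrf_dev;
}

/*
 * Account for a packet reinserted on the receive path. The length is passed
 * in by the caller, mirroring how the patch saves skb->len before handing
 * the skb to netif_rx().
 */
void model_local_xmit(struct model_stats *stats, size_t len, bool rx_accepted)
{
	if (rx_accepted) {
		stats->rx_pkts++;
		stats->rx_bytes += len;
	} else {
		stats->rx_drps++;
	}
}
```

A route whose device is an enslaved interface such as eth1 would not match,
so that packet would continue down the normal output path.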

[PATCH net-next 1/3] net: vrf: Minor refactoring for local address patches

2016-06-02 Thread David Ahern

Move the stripping of the ethernet header from is_ip_tx_frame into the
ipv4 and ipv6 outbound functions. If the packet is destined to a local
address the header is retained since the packet is sent back to netif_rx.

Collapse vrf_send_v4_prep into vrf_process_v4_outbound.

Signed-off-by: David Ahern 
---
 drivers/net/vrf.c | 45 ++---
 1 file changed, 18 insertions(+), 27 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index aaac4c779047..d678aaeba572 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -125,6 +125,9 @@ static netdev_tx_t vrf_process_v6_outbound(struct sk_buff 
*skb,
skb_dst_drop(skb);
skb_dst_set(skb, dst);
 
+   /* strip the ethernet header added for pass through VRF device */
+   __skb_pull(skb, skb_network_offset(skb));
+
ret = ip6_local_out(net, skb->sk, skb);
if (unlikely(net_xmit_eval(ret)))
dev->stats.tx_errors++;
@@ -145,29 +148,6 @@ static netdev_tx_t vrf_process_v6_outbound(struct sk_buff 
*skb,
 }
 #endif
 
-static int vrf_send_v4_prep(struct sk_buff *skb, struct flowi4 *fl4,
-   struct net_device *vrf_dev)
-{
-   struct rtable *rt;
-   int err = 1;
-
-   rt = ip_route_output_flow(dev_net(vrf_dev), fl4, NULL);
-   if (IS_ERR(rt))
-   goto out;
-
-   /* TO-DO: what about broadcast ? */
-   if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
-   ip_rt_put(rt);
-   goto out;
-   }
-
-   skb_dst_drop(skb);
-   skb_dst_set(skb, &rt->dst);
-   err = 0;
-out:
-   return err;
-}
-
 static netdev_tx_t vrf_process_v4_outbound(struct sk_buff *skb,
   struct net_device *vrf_dev)
 {
@@ -182,10 +162,24 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff 
*skb,
FLOWI_FLAG_SKIP_NH_OIF,
.daddr = ip4h->daddr,
};
+   struct net *net = dev_net(vrf_dev);
+   struct rtable *rt;
 
-   if (vrf_send_v4_prep(skb, &fl4, vrf_dev))
+   rt = ip_route_output_flow(net, &fl4, NULL);
+   if (IS_ERR(rt))
goto err;
 
+   if (rt->rt_type != RTN_UNICAST && rt->rt_type != RTN_LOCAL) {
+   ip_rt_put(rt);
+   goto err;
+   }
+
+   skb_dst_drop(skb);
+   skb_dst_set(skb, &rt->dst);
+
+   /* strip the ethernet header added for pass through VRF device */
+   __skb_pull(skb, skb_network_offset(skb));
+
if (!ip4h->saddr) {
ip4h->saddr = inet_select_addr(skb_dst(skb)->dev, 0,
   RT_SCOPE_LINK);
@@ -206,9 +200,6 @@ static netdev_tx_t vrf_process_v4_outbound(struct sk_buff 
*skb,
 
 static netdev_tx_t is_ip_tx_frame(struct sk_buff *skb, struct net_device *dev)
 {
-   /* strip the ethernet header added for pass through VRF device */
-   __skb_pull(skb, skb_network_offset(skb));
-
switch (skb->protocol) {
case htons(ETH_P_IP):
return vrf_process_v4_outbound(skb, dev);
-- 
2.1.4

[PATCH net-next 3/3] net: vrf: ipv6 support for local traffic to local addresses

2016-06-02 Thread David Ahern

Add support for locally originated traffic to VRF-local IPv6 addresses.
Similar to IPv4 a local dst is set on the skb and the packet is
reinserted with a call to netif_rx. With this patch, ping, tcp and udp
packets to a local IPv6 address are successfully routed:

$ ip addr show dev eth1
4: eth1:  mtu 1500 qdisc pfifo_fast master 
red state UP group default qlen 1000
link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff
inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1
   valid_lft forever preferred_lft forever
inet6 2100:1::1/120 scope global
   valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe1c:b974/64 scope link
   valid_lft forever preferred_lft forever

$ ping6 -c1 -I red 2100:1::1
ping6: Warning: source address might be selected on device other than red.
PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes
64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.098 ms

ip6_input is exported so the VRF driver can use it for the dst input
function. The dst_alloc function for IPv4 defaults to setting the input and
output functions; IPv6's does not. VRF does not need to duplicate the Rx path
so just export the ipv6 input function.

Signed-off-by: David Ahern 
---
 drivers/net/vrf.c | 89 ---
 1 file changed, 85 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 7df065456893..a0ca158f6ad9 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -52,6 +52,7 @@ struct net_vrf {
struct rtable __rcu *rth;
struct rtable __rcu *rth_local;
struct rt6_info __rcu   *rt6;
+   struct rt6_info __rcu   *rt6_local;
u32 tb_id;
 };
 
@@ -163,6 +164,46 @@ static netdev_tx_t vrf_process_v6_outbound(struct sk_buff 
*skb,
goto err;
 
skb_dst_drop(skb);
+
+   /* if dst.dev is loopback or the VRF device again this is locally
+* originated traffic destined to a local address. Short circuit
+* to Rx path using our local dst
+*/
+   if (dst->dev == net->loopback_dev || dst->dev == dev) {
+   struct net_vrf *vrf = netdev_priv(dev);
+   struct rt6_info *rt6_local;
+
+   /* release looked up dst and use cached local dst */
+   dst_release(dst);
+
+   rcu_read_lock();
+
+   rt6_local = rcu_dereference(vrf->rt6_local);
+   if (unlikely(!rt6_local)) {
+   rcu_read_unlock();
+   goto err;
+   }
+
+   /* Ordering issue: cached local dst is created on newlink
+* before the IPv6 initialization. Using the local dst
+* requires rt6i_idev to be set so make sure it is.
+*/
+   if (unlikely(!rt6_local->rt6i_idev)) {
+   rt6_local->rt6i_idev = in6_dev_get(dev);
+   if (!rt6_local->rt6i_idev) {
+   rcu_read_unlock();
+   goto err;
+   }
+   }
+
+   dst = &rt6_local->dst;
+   dst_hold(dst);
+
+   rcu_read_unlock();
+
+   return vrf_local_xmit(skb, dev, &rt6_local->dst);
+   }
+
skb_dst_set(skb, dst);
 
/* strip the ethernet header added for pass through VRF device */
@@ -342,27 +383,38 @@ static int vrf_output6(struct net *net, struct sock *sk, 
struct sk_buff *skb)
 static void vrf_rt6_release(struct net_vrf *vrf)
 {
struct rt6_info *rt6 = rtnl_dereference(vrf->rt6);
+   struct rt6_info *rt6_local = rtnl_dereference(vrf->rt6_local);
 
-   rcu_assign_pointer(vrf->rt6, NULL);
+   RCU_INIT_POINTER(vrf->rt6, NULL);
+   RCU_INIT_POINTER(vrf->rt6_local, NULL);
+   synchronize_rcu();
 
if (rt6)
dst_release(&rt6->dst);
+
+   if (rt6_local) {
+   if (rt6_local->rt6i_idev)
+   in6_dev_put(rt6_local->rt6i_idev);
+
+   dst_release(&rt6_local->dst);
+   }
 }
 
 static int vrf_rt6_create(struct net_device *dev)
 {
+   int flags = DST_HOST | DST_NOPOLICY | DST_NOXFRM | DST_NOCACHE;
struct net_vrf *vrf = netdev_priv(dev);
struct net *net = dev_net(dev);
struct fib6_table *rt6i_table;
-   struct rt6_info *rt6;
+   struct rt6_info *rt6, *rt6_local;
int rc = -ENOMEM;
 
rt6i_table = fib6_new_table(net, vrf->tb_id);
if (!rt6i_table)
goto out;
 
-   rt6 = ip6_dst_alloc(net, dev,
-   DST_HOST | DST_NOPOLICY | DST_NOXFRM | DST_NOCACHE);
+   /* create a dst for routing packets out a VRF device */
+   rt6 = ip6_dst_alloc(net, dev, flags);
if (!rt6)
goto out;
 
@@ -370,7 +422,25 @@ static int

Re: [net-next] ovs: set name assign type of internal port

2016-06-02 Thread pravin shelar

On Tue, May 31, 2016 at 6:41 AM, Zhang Shengju
 wrote:
> Set name_assign_type of internal port to NET_NAME_USER.
>
> Signed-off-by: Zhang Shengju 
> ---
>  net/openvswitch/vport-internal_dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/vport-internal_dev.c 
> b/net/openvswitch/vport-internal_dev.c
> index 2ee48e4..434e04c 100644
> --- a/net/openvswitch/vport-internal_dev.c
> +++ b/net/openvswitch/vport-internal_dev.c
> @@ -195,7 +195,7 @@ static struct vport *internal_dev_create(const struct 
> vport_parms *parms)
> }
>
> vport->dev = alloc_netdev(sizeof(struct internal_dev),
> - parms->name, NET_NAME_UNKNOWN, do_setup);
> + parms->name, NET_NAME_USER, do_setup);

Looks good.
Acked-by: Pravin B Shelar

Re: [PATCH v2 2/5] fsl/qe: setup clock source for TDM mode

2016-06-02 Thread David Miller

From: Zhao Qiang 
Date: Thu, 2 Jun 2016 09:44:58 +0800

> +static int ucc_get_tdm_sync_source(u32 tdm_num, enum qe_clock clock,
> +enum comm_dir mode)
> +{
> + int source = -EINVAL;
> +
> + if (mode == COMM_DIR_RX && clock == QE_RSYNC_PIN) {
> + source = 0;
> + return source;
> + }
> + if (mode == COMM_DIR_TX && clock == QE_TSYNC_PIN) {
> + source = 0;
> + return source;
> + }
> +
> + switch (tdm_num) {
> + case 0:
> + case 1:
> + switch (clock) {
> + case QE_BRG9:
> + source = 1;
> + break;
> + case QE_BRG10:
> + source = 2;
> + break;

These switch case bodies are over indented.  Same goes for the rest of
this function.
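
For reference, a sketch of the quoted logic laid out the way kernel coding
style usually expects, assuming the comment refers to the convention that
case labels sit at the same indentation level as their switch. The enum
definitions are hypothetical stand-ins (the real ones live in the QE headers),
and only the branches visible in the quoted hunk are reproduced; everything
else falls through to -EINVAL.

```c
#include <errno.h>

/* Hypothetical stand-ins for the QE enums used by the quoted function. */
enum model_qe_clock {
	MODEL_QE_BRG9,
	MODEL_QE_BRG10,
	MODEL_QE_RSYNC_PIN,
	MODEL_QE_TSYNC_PIN,
	MODEL_QE_CLK_OTHER,
};

enum model_comm_dir {
	MODEL_COMM_DIR_RX,
	MODEL_COMM_DIR_TX,
};

int model_tdm_sync_source(unsigned int tdm_num, enum model_qe_clock clock,
			  enum model_comm_dir mode)
{
	if (mode == MODEL_COMM_DIR_RX && clock == MODEL_QE_RSYNC_PIN)
		return 0;
	if (mode == MODEL_COMM_DIR_TX && clock == MODEL_QE_TSYNC_PIN)
		return 0;

	switch (tdm_num) {
	case 0:
	case 1:
		switch (clock) {
		case MODEL_QE_BRG9:
			return 1;
		case MODEL_QE_BRG10:
			return 2;
		default:
			break;
		}
		break;
	default:
		/* Other TDM numbers are not visible in the quoted hunk. */
		break;
	}

	return -EINVAL;
}
```

Returning directly from each case also removes the need for the local
"source" variable, though that is a stylistic choice and not something the
review asked for.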

RE: [PATCH] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario_Limonciello

> I have some other questions whose answers we should know:
> 
> 1) Is that AUX MAC address implemented only in customized windows Dell
> driver? Or also in "upstream" windows Realtek driver and all users of
> Realtek hw can install it (or update via next driver update)?
> 

I don't have the information on this.  Realtek will have to comment here as this
part is a black box to me.  I'm asking my internal colleagues about this too.

> 2) Can you share pseudo code or description of algorithm which decide
> MAC address for newly connected r8152 device on windows? This could help
> us to decide if something similar/same cannot be implemented also on
> linux (either in kernel or userspace). What I would like to know are
> those situations when you connect more r8152 devices (some Dell and some
> non-Dell).
> 

This is another thing I don't have the information for right now.
I can install Windows on a laptop, install the Realtek driver and experiment,
but it would be better to get this directly from Realtek if at all possible.

> > I do have a way to query if a dock is plugged in via SMM, but I doubt
> > that's what Realtek is using on the Windows side.
> 
> So there is some way to check if Dell dock is plugged, right? But what
> happen if you connect Dell dock and also non-Dell r8152 device? Can you
> distinguish which device is Dell and which non-Dell?

Yes, when querying if a Dell dock is plugged in, a "location" and "count" 
parameter is returned.  I'd have to figure out how to translate that into
what the Linux kernel sees.  Actually the information for how to do this 
is already public too.  It's in a pull request for Dock FW updating in the 
fwupd project.

https://github.com/hughsie/fwupd/pull/49/files#diff-81b55c87ce1542a18b0a4b2b228b9129R189

> 
> Anyway, I think that by SMM you mean dell smbios API call. Cannot you
> guys in Dell release documentation of all smbios calls to community?

Well, a Dell SMBIOS API call really means using the dcdbas kernel module, which
performs the SMM call.

> Time to time you release some small parts in libsmbios project which
> then we can use for implementing useful parts in kernel (e.g. LED driver
> for controlling keyboard backlight). But there are couple of
> undocumented APIs and maybe some can also help with this problem...
> 

Releasing different bits of our SMBIOS document requires approvals.
We can't just release the whole thing as there are lots of interfaces that
aren't intended for the OS to be using.  They're used only by Dell tools.

For example we just had approval for information about querying TPM
and dock information and those are present in the fwupd pull request
for dock and TPM FW updates you see above.

If you have some API's in particular you would like more information on,
I'm happy to have internal discussion to see if we can release information
on those.

> > I'd leave that as
> > a second to last resort (last resort being move back to userspace
> > again).
> >
> > > What you definitely should not do is to change the mac for some
> > > arbitrary "first" device.  Then you are better off with the
> > > userspace proposal where you and your users have some chance to
> > > implement a sensible policy based on e.g. usb port numbers.
> >
> > OK, if I can't come up with a way to key on the device being a Dell
> > dock I'll scrap this entirely kernel approach.
> 
> E.g. PCI devices have ordinary PCI device & vendor IDs, but have Dell
> specific subsystem IDs. And via subsystem IDs we can distinguish between
> Intel graphics card on Dell laptop and on non-Dell laptop.
> 
> Don't you have some special/modified firmware in those Dell Realtek
> docks (and the ability to check some registers from the OS)?

I think so.  Otherwise there would be all the same concerns you have outlined
with generic devices.  Like I said this part is currently a black box to me.  
I hope Realtek can publicly comment on this, or I can get some information 
from my colleagues.

Re: [PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Andrew Lunn

On Thu, Jun 02, 2016 at 07:04:32PM +, mario_limoncie...@dell.com wrote:
> > -Original Message-
> > From: Andrew Lunn [mailto:and...@lunn.ch]
> > Sent: Thursday, June 2, 2016 2:03 PM
> > To: Limonciello, Mario 
> > Cc: gre...@linuxfoundation.org; hayesw...@realtek.com; linux-
> > ker...@vger.kernel.org; netdev@vger.kernel.org; linux-
> > u...@vger.kernel.org; pali.ro...@gmail.com; anthony.w...@canonical.com
> > Subject: Re: [PATCH v2] r8152: Add support for setting MAC to system's
> > Auxiliary MAC address
> > 
> > > > And you want to check this for all Dell devices?  Please be model
> > > > specific, I doubt a bunch of Dell servers wants to run this code...
> > > >
> > >
> > > Tracking specific models is really going to turn into a giant,
> > > never-ending list.
> > > To drill down more specifically, I can match on chassis too.
> > 
> > Does Dell happen to use its own USB Vendor ID for the USB device in
> > the dock? You could go at this problem from the other direction if it
> > does have a unique vendor ID.
> > 
> >  Andrew
> 
> Unfortunately it's not a Dell specific VID/PID.  I'm asking around to find out
> if there is something else identifiable about this dock's NIC (maybe
> something that r8152 can query).

lsusb -v

I assume there is a USB hub in the dock, maybe that has a Dell VID?
Going one level up the USB tree hierarchy should not be too hard.

  Andrew

Re: [PATCH] net: ethernet: wiznet: Remove create_workqueue

2016-06-02 Thread David Miller

From: Bhaktipriya Shridhar 
Date: Wed, 1 Jun 2016 23:29:15 +0530

> alloc_workqueue replaces deprecated create_workqueue().
> 
> A dedicated workqueue has been used since the workitems are involved
> in normal device operation. Workitems rx_work and tx_work
> map to w5100_rx_work and w5100_tx_work respectively and are involved in
> receiving and transmitting packets. Forward progress under
> memory pressure is a requirement here.
> 
> create_workqueue has been replaced with alloc_workqueue with max_active
> as 0 since there is no need for throttling the number of active work
> items.
> 
> Since the driver may be used in memory reclaim path,
> WQ_MEM_RECLAIM has been set to guarantee forward progress.
> 
> flush_workqueue is unnecessary since destroy_workqueue() itself calls
> drain_workqueue() which flushes repeatedly till the workqueue
> becomes empty. Hence the call to flush_workqueue() has been dropped.
> 
> Signed-off-by: Bhaktipriya Shridhar 

Applied to net-next, thanks.

Re: [PATCH] stmmac: do not sleep in atomic context for mdio_reset

2016-06-02 Thread David Miller

From: Vincent Palatin 
Date: Wed,  1 Jun 2016 08:53:48 -0700

> stmmac_mdio_reset() has been updated to use msleep() rather than udelay()
> (as some PHYs require a one second delay there).
> It is called from stmmac_resume() inside the spin_lock_irqsave block,
> i.e. in atomic context, which triggers 'scheduling while atomic'.
> 
> The stmmac_priv lock usage is not fully documented, but it seems
> to protect the access to the MAC registers / DMA structures rather
> than the MDIO bus or the PHY (which have separate locking),
> so we can push the spin_lock after the stmmac_mdio_reset call.
> 
> Signed-off-by: Vincent Palatin 

Applied, thanks.

Re: [PATCH 0/2] Software workaround for i.MX6Q/DL ERR006687

2016-06-02 Thread David Miller

From: Lucas Stach 
Date: Wed,  1 Jun 2016 17:29:41 +0200

> I would prefer if this series gets merged through the imx architecture
> tree with acks for the FEC changes from the network people.

Sure, this is fine:

Acked-by: David S. Miller

[PATCH] rxrpc: Use pr_ and pr_fmt, reduce object size a few KB

2016-06-02 Thread Joe Perches

Use the more common kernel logging style and reduce object size.

The logging message prefix changes from a mixture of
"RxRPC:" and "RXRPC:" to "af_rxrpc: ".

$ size net/rxrpc/built-in.o*
   text    data     bss     dec     hex filename
  64172    1972    8304   74448   122d0 net/rxrpc/built-in.o.new
  67512    1972    8304   77788   12fdc net/rxrpc/built-in.o.old

Miscellanea:

o Consolidate the ASSERT macros to use a single pr_err call with
  decimal and hexadecimal output and a stringified #OP argument

Signed-off-by: Joe Perches 
---
 net/rxrpc/af_rxrpc.c  | 18 ++
 net/rxrpc/ar-accept.c |  2 ++
 net/rxrpc/ar-ack.c|  2 ++
 net/rxrpc/ar-call.c   | 12 ++--
 net/rxrpc/ar-connection.c |  2 ++
 net/rxrpc/ar-connevent.c  |  2 ++
 net/rxrpc/ar-input.c  |  2 ++
 net/rxrpc/ar-internal.h   | 30 --
 net/rxrpc/ar-key.c|  4 +++-
 net/rxrpc/ar-local.c  |  2 ++
 net/rxrpc/ar-output.c |  2 ++
 net/rxrpc/ar-peer.c   |  2 ++
 net/rxrpc/ar-recvmsg.c|  4 +++-
 net/rxrpc/ar-skbuff.c |  2 ++
 net/rxrpc/ar-transport.c  |  2 ++
 net/rxrpc/rxkad.c |  2 ++
 16 files changed, 56 insertions(+), 34 deletions(-)

diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index e45e94c..7840b8e 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -9,6 +9,8 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -796,49 +798,49 @@ static int __init af_rxrpc_init(void)
"rxrpc_call_jar", sizeof(struct rxrpc_call), 0,
SLAB_HWCACHE_ALIGN, NULL);
if (!rxrpc_call_jar) {
-   printk(KERN_NOTICE "RxRPC: Failed to allocate call jar\n");
+   pr_notice("Failed to allocate call jar\n");
goto error_call_jar;
}
 
rxrpc_workqueue = alloc_workqueue("krxrpcd", 0, 1);
if (!rxrpc_workqueue) {
-   printk(KERN_NOTICE "RxRPC: Failed to allocate work queue\n");
+   pr_notice("Failed to allocate work queue\n");
goto error_work_queue;
}
 
ret = rxrpc_init_security();
if (ret < 0) {
-   printk(KERN_CRIT "RxRPC: Cannot initialise security\n");
+   pr_crit("Cannot initialise security\n");
goto error_security;
}
 
ret = proto_register(&rxrpc_proto, 1);
if (ret < 0) {
-   printk(KERN_CRIT "RxRPC: Cannot register protocol\n");
+   pr_crit("Cannot register protocol\n");
goto error_proto;
}
 
ret = sock_register(&rxrpc_family_ops);
if (ret < 0) {
-   printk(KERN_CRIT "RxRPC: Cannot register socket family\n");
+   pr_crit("Cannot register socket family\n");
goto error_sock;
}
 
ret = register_key_type(&key_type_rxrpc);
if (ret < 0) {
-   printk(KERN_CRIT "RxRPC: Cannot register client key type\n");
+   pr_crit("Cannot register client key type\n");
goto error_key_type;
}
 
ret = register_key_type(&key_type_rxrpc_s);
if (ret < 0) {
-   printk(KERN_CRIT "RxRPC: Cannot register server key type\n");
+   pr_crit("Cannot register server key type\n");
goto error_key_type_s;
}
 
ret = rxrpc_sysctl_init();
if (ret < 0) {
-   printk(KERN_CRIT "RxRPC: Cannot register sysctls\n");
+   pr_crit("Cannot register sysctls\n");
goto error_sysctls;
}
 
diff --git a/net/rxrpc/ar-accept.c b/net/rxrpc/ar-accept.c
index e7a7f05..eea5f4a 100644
--- a/net/rxrpc/ar-accept.c
+++ b/net/rxrpc/ar-accept.c
@@ -9,6 +9,8 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
diff --git a/net/rxrpc/ar-ack.c b/net/rxrpc/ar-ack.c
index 374478e..1838178 100644
--- a/net/rxrpc/ar-ack.c
+++ b/net/rxrpc/ar-ack.c
@@ -9,6 +9,8 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
diff --git a/net/rxrpc/ar-call.c b/net/rxrpc/ar-call.c
index 571a41f..1fbaae1 100644
--- a/net/rxrpc/ar-call.c
+++ b/net/rxrpc/ar-call.c
@@ -9,6 +9,8 @@
  * 2 of the License, or (at your option) any later version.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -669,8 +671,7 @@ void rxrpc_release_call(struct rxrpc_call *call)
   conn->channels[3] == NULL);
break;
default:
-   printk(KERN_ERR "RxRPC: conn->avail_calls=%d\n",
-  conn->avail_calls);
+   pr_err("conn->avail_calls=%d\n", conn->avail_calls);
BUG();

Re: [PATCH V3 0/2] vhost_net polling optimization

2016-06-02 Thread David Miller

From: Jason Wang 
Date: Wed,  1 Jun 2016 01:56:32 -0400

> This series tries to optimize vhost_net polling at two points:
> 
> - Stop rx polling for reducing the unnecessary wakeups during
>   handle_rx().
> - Conditionally enable tx polling for reducing the unnecessary
>   traversing and spinlock touching.
> 
> Test shows about 17% improvement on rx pps.
> 
> Please review
> 
> Changes from V2:
> - Don't enable rx vq if we meet an err or rx vq is empty
> Changes from V1:
> - use vhost_net_disable_vq()/vhost_net_enable_vq() instead of open
>   coding.
> - Add a new patch for conditionally enable tx polling.

Michael, please review this patch series.

Thanks.

Re: [net-next] ovs: set name assign type of internal port

2016-06-02 Thread David Miller

From: Zhang Shengju 
Date: Tue, 31 May 2016 13:41:02 +

> Set name_assign_type of internal port to NET_NAME_USER.
> 
> Signed-off-by: Zhang Shengju 

Pravin or some other OVS expert, please review this.

RE: [PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario_Limonciello

> -Original Message-
> From: Andrew Lunn [mailto:and...@lunn.ch]
> Sent: Thursday, June 2, 2016 2:03 PM
> To: Limonciello, Mario 
> Cc: gre...@linuxfoundation.org; hayesw...@realtek.com; linux-
> ker...@vger.kernel.org; netdev@vger.kernel.org; linux-
> u...@vger.kernel.org; pali.ro...@gmail.com; anthony.w...@canonical.com
> Subject: Re: [PATCH v2] r8152: Add support for setting MAC to system's
> Auxiliary MAC address
> 
> > > And you want to check this for all Dell devices?  Please be model
> > > specific, I doubt a bunch of Dell servers wants to run this code...
> > >
> >
> > Tracking model specific is really going to turn into a giant
> > never ending list.
> > To drill down more specifically, I can match on chassis too.
> 
> Does Dell happen to use its own USB Vendor ID for the USB device in
> the dock? You could go at this problem from the other direction if it
> does have a unique vendor ID.
> 
>  Andrew

Unfortunately it's not a Dell specific VID/PID.  I'm asking around to find out
if there is something else identifiable about this dock's NIC (maybe that r8152 
can query).

Re: [PATCH] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Pali Rohár

On Thursday 02 June 2016 20:28:33 mario_limoncie...@dell.com wrote:
> > -Original Message-
> > From: Bjørn Mork [mailto:bj...@mork.no]
> > Sent: Thursday, June 2, 2016 1:04 PM
> > To: Limonciello, Mario 
> > Cc: gre...@linuxfoundation.org; hayesw...@realtek.com; linux-
> > ker...@vger.kernel.org; netdev@vger.kernel.org; linux-
> > u...@vger.kernel.org; pali.ro...@gmail.com;
> > anthony.w...@canonical.com Subject: Re: [PATCH] r8152: Add support
> > for setting MAC to system's Auxiliary MAC address
> > 
> >  writes:
> > >> > 2) Track whether this is the first or second USB NIC plugged
> > >> > in.  Only offer it on the first NIC detected by r8152.  When
> > >> > the second NIC is plugged in don't match from ACPI.
> > >> >
> > >> > There would be a question of what to do if the first NIC is
> > >> > removed and added back if it should get the persistent system
> > >> > MAC or not.
> > >> >
> > >> > I'd say yes, just make sure that only one NIC can have it at a
> > >> > time.
> > >>
> > >> You are going to get things very complex very quickly if you try
> > >> to do this.
> > > 
> > > It's really not that hard, track a module wide static variable
> > > whether the feature is in use.  Track in each device whether the
> > > feature was in use.  If it in use, don't assign the next device
> > > plugged in via the ACPI string.  If a device is removed that has
> > > the feature activated, change the module wide static variable.
> > 
> > Having the mac address jump around in an arbitrary way like this is
> > going to confuse the hell out of your users.  Consider what happens
> > if the user docks a laptop with an r8152 usb dongle already
> > plugged in... How are you going to explain that the dock gets some
> > other mac address in this case? How are you going to explain the
> > difference between using an r8152 based dongle and some other
> > ethernet usb dongle with your systems?
> 
> Yeah I understand the concern.  I agree that would be very confusing
> to a user.  This does need to match only on Dell docks then.
> 
> > Make it behave consistently if you're going to add this.  Which can
> > be done by specifically matching the Dell dock (doesn't it have an
> > unique Dell device ID?) and ignoring any other r8152 device.  You
> > could also choose to set the same mac for all r8152 devices. 
> > Which is fine, but will probably confuse many users.
> 
> Unfortunately there is no Dell specific VID/PID.  I checked a no-name
> dongle that used r8152 and it was the same (0bda:8153).  Maybe Hayes
> Wang can check with his Windows driver colleagues if there was
> anything else to key off when this was implemented on the Windows
> Realtek driver.  If there is something else to key off of, I'm not
> aware what it is.  I'll check with some of my colleagues too.

I have some other questions whose answers we should know:

1) Is the AUX MAC address implemented only in a customized Windows Dell
driver? Or also in the "upstream" Windows Realtek driver, so that all users of
Realtek hw can install it (or get it via the next driver update)?

2) Can you share pseudo code or a description of the algorithm which decides
the MAC address for a newly connected r8152 device on Windows? This could help
us decide whether something similar could also be implemented on
Linux (either in kernel or userspace). What I would like to know are
those situations when you connect more r8152 devices (some Dell and some
non-Dell).

> I do have a way to query if a dock is plugged in via SMM, but I doubt
> that's what Realtek is using on the Windows side.

So there is some way to check if a Dell dock is plugged in, right? But what
happens if you connect a Dell dock and also a non-Dell r8152 device? Can you
distinguish which device is Dell and which is non-Dell?

Anyway, I think that by SMM you mean a Dell SMBIOS API call. Can't you
guys at Dell release documentation of all SMBIOS calls to the community?
From time to time you release some small parts in the libsmbios project which
we can then use for implementing useful parts in the kernel (e.g. an LED driver
for controlling the keyboard backlight). But there are a couple of
undocumented APIs and maybe some can also help with this problem...

> I'd leave that as
> a second to last resort (last resort being move back to userspace
> again).
> 
> > What you definitely should not do is to change the mac for some
> > arbitrary "first" device.  Then you are better off with the
> > userspace proposal where you and your users have some chance to
> > implement a sensible policy based on e.g. usb port numbers.
> 
> OK, if I can't come up with a way to key on the device being a Dell
> dock I'll scrap this entirely kernel approach.

E.g. PCI devices have ordinary PCI device & vendor IDs, but have Dell 
specific subsystem IDs. And via subsystem IDs we can distinguish between 
Intel graphics card on Dell laptop and on non-Dell laptop.

Does not you have some

Re: [PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Andrew Lunn

> > And you want to check this for all Dell devices?  Please be model
> > specific, I doubt a bunch of Dell servers wants to run this code...
> > 
> 
> Tracking model specific is really going to turn into a giant
> never ending list.
> To drill down more specifically, I can match on chassis too.

Does Dell happen to use its own USB Vendor ID for the USB device in
the dock? You could go at this problem from the other direction if it
does have a unique vendor ID.

 Andrew

Re: [PATCH] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Pali Rohár

On Thursday 02 June 2016 20:04:02 Bjørn Mork wrote:
>  writes:
> >> > 2) Track whether this is the first or second USB NIC plugged in.
> >> > Only offer it on the first NIC detected by r8152.  When the
> >> > second NIC is plugged in don't match from ACPI.
> >> >
> >> > There would be a question of what to do if the first NIC is
> >> > removed and added back if it should get the persistent system
> >> > MAC or not.
> >> >
> >> > I'd say yes, just make sure that only one NIC can have it at a
> >> > time.
> >>
> >> You are going to get things very complex very quickly if you try
> >> to do this.
> > 
> > It's really not that hard, track a module wide static variable
> > whether the feature is in use.  Track in each device whether the
> > feature was in use.  If it in use, don't assign the next device
> > plugged in via the ACPI string.  If a device is removed that has
> > the feature activated, change the module wide static variable.
> 
> Having the mac address jump around in an arbitrary way like this is
> going to confuse the hell out of your users.  Consider what happens
> if the user docks a laptop with an r8152 usb dongle already plugged
> in... How are you going to explain that the dock gets some other mac
> address in this case? How are you going to explain the difference
> between using an r8152 based dongle and some other ethernet usb
> dongle with your systems?
> 
> Make it behave consistently if you're going to add this.  Which can
> be done by specifically matching the Dell dock (doesn't it have an
> unique Dell device ID?) and ignoring any other r8152 device.  You
> could also choose to set the same mac for all r8152 devices.  Which
> is fine, but will probably confuse many users.
> 
> What you definitely should not do is to change the mac for some
> arbitrary "first" device.  Then you are better off with the userspace
> proposal where you and your users have some chance to implement a
> sensible policy based on e.g. usb port numbers.

This is exactly what I wanted to write, but you were faster :-)

You can connect several Dell docks (with r8152 devices) and several non-Dell
r8152 devices in random order to a Dell laptop. Regardless of the
connect and disconnect order, devices must always get exactly the same MAC
addresses. Otherwise there will be problems! It confuses users and also
network admins...

So if the kernel approach is chosen then I think there are only two solutions
that satisfy the above conditions:

First one is:
* all non-Dell devices have own MAC address
* all Dell devices have (one, same) AUX MAC address

Second one is:
* all devices (Dell and also non-Dell) have own address
* AUX MAC address is never used

So what do you (netdev maintainers) think about it?

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [PATCH v5 2/2] skb_array: ring test

2016-06-02 Thread Jesper Dangaard Brouer

On Tue, 24 May 2016 23:34:14 +0300
"Michael S. Tsirkin"  wrote:

> On Tue, May 24, 2016 at 07:03:20PM +0200, Jesper Dangaard Brouer wrote:
> > 
> > On Tue, 24 May 2016 12:28:09 +0200
> > Jesper Dangaard Brouer  wrote:
> >   
> > > I do like perf, but it does not answer my questions about the
> > > performance of this queue. I will code something up in my own
> > > framework[2] to answer my own performance questions.
> > > 
> > > Like what is the minimum overhead (in cycles) achievable with this type
> > > of queue, in the most optimal situation (e.g. same CPU enq+deq cache hot)
> > > for fastpath usage.  
> > 
> > Coded it up here:
> >  https://github.com/netoptimizer/prototype-kernel/commit/b16a3332184
> >  
> > https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/skb_array_bench01.c
> > 
> > This is a really fake benchmark, but it sort of shows the  
> > overhead achievable with this type of queue, where it is the same
> > CPU enqueuing and dequeuing, and cache is guaranteed to be hot.
> > 
> > Measured on a i7-4790K CPU @ 4.00GHz, the average cost of
> > enqueue+dequeue of a single object is around 102 cycles(tsc).
> > 
> > To compare this with below, where enq and deq is measured separately:
> >  102 / 2 = 51 cycles

The alf_queue[1] baseline is 26 cycles in this minimum-overhead
benchmark with an MPMC (Multi-Producer/Multi-Consumer) queue,
which uses a locked cmpxchg.  (The SPSC variant is 5 cycles, thus most of
the cost comes from the locked cmpxchg.)

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/include/linux/alf_queue.h

> > > Then I also want to know how this performs when two CPUs are involved.
> > > As this is also a primary use-case, for you when sending packets into a
> > > guest.  
> > 
> > Coded it up here:
> >  https://github.com/netoptimizer/prototype-kernel/commit/75fe31ef62e
> >  
> > https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/skb_array_parallel01.c
> >  
> > This parallel benchmark tries to keep two (or more) CPUs busy enqueuing or
> > dequeuing on the same skb_array queue.  It prefills the queue,
> > and stops the test as soon as queue is empty or full, or
> > completes a number of "loops"/cycles.
> > 
> > For two CPUs the results are really good:
> >  enqueue: 54 cycles(tsc)
> >  dequeue: 53 cycles(tsc)

As MST points out, a scheme like the alf_queue[1] has the issue that it
"reads" the opposite cacheline of the consumer.tail/producer.tail to
determine if space-is-left/queue-is-empty.  This causes an expensive
transition for the cache coherency protocol.
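
To make the contrast concrete, here is a single-threaded sketch of the alternative, where full/empty is decided by looking at the slot itself (NULL means free) so neither side reads the other side's index. This is written in the spirit of the skb_array approach as I understand it, not copied from it: the struct and function names are made up, and locking and memory barriers are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4

/* Hypothetical ring: a NULL slot is free, a non-NULL slot is occupied. */
struct ptr_ring_sketch {
	void *slot[RING_SIZE];
	unsigned int head;	/* touched by the producer only */
	unsigned int tail;	/* touched by the consumer only */
};

/* Returns 0 on success, -1 if the ring is full. */
int ring_produce(struct ptr_ring_sketch *r, void *p)
{
	assert(r != NULL && p != NULL);
	if (r->slot[r->head])
		return -1;	/* full: decided without reading r->tail */
	r->slot[r->head] = p;
	r->head = (r->head + 1) % RING_SIZE;
	return 0;
}

/* Returns the oldest entry, or NULL if the ring is empty. */
void *ring_consume(struct ptr_ring_sketch *r)
{
	void *p;

	assert(r != NULL);
	p = r->slot[r->tail];
	if (!p)
		return NULL;	/* empty: decided without reading r->head */
	r->slot[r->tail] = NULL;
	r->tail = (r->tail + 1) % RING_SIZE;
	return p;
}
```

The slots are still shared between the two sides, so this does not remove cacheline traffic; it only avoids the extra read of the opposite index.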

Coded up similar test for alf_queue:
 https://github.com/netoptimizer/prototype-kernel/commit/b3ff2624f1
 
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/alf_queue_parallel01.c

For two CPUs, the MPMC results are significantly worse and demonstrate MST's point:
 enqueue: 227 cycles(tsc)
 dequeue: 231 cycles(tsc)

Alf_queue also has a SPSC (Single-Producer/Single-Consumer) variant:
 enqueue: 24 cycles(tsc)
 dequeue: 23 cycles(tsc)


> > Going to 4 CPUs, things break down (but it was not primary use-case?):
> >  CPU(0) 927 cycles(tsc) enqueue
> >  CPU(1) 921 cycles(tsc) dequeue
> >  CPU(2) 927 cycles(tsc) enqueue
> >  CPU(3) 898 cycles(tsc) dequeue  
> 
> It's mostly the spinlock contention I guess.
> Maybe we don't need fair spinlocks in this case.
> Try replacing spinlocks with simple cmpxchg
> and see what happens?

The alf_queue uses a cmpxchg scheme, and it does scale better when the
number of CPUs increases:

 CPUs:4 Average: 586 cycles(tsc)
 CPUs:6 Average: 744 cycles(tsc)
 CPUs:8 Average: 1578 cycles(tsc)

Notice the alf_queue was designed with the purpose of bulking, to
mitigate the effect of this cacheline bouncing, but it was not covered
in this test.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

RE: [PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario_Limonciello

> -Original Message-
> From: Greg KH [mailto:gre...@linuxfoundation.org]
> Sent: Thursday, June 2, 2016 12:48 PM
> To: Limonciello, Mario 
> Cc: hayesw...@realtek.com; LKML ; Netdev
> ; Linux USB ;
> pali.ro...@gmail.com; anthony.w...@canonical.com
> Subject: Re: [PATCH v2] r8152: Add support for setting MAC to system's
> Auxiliary MAC address
> 
> On Thu, Jun 02, 2016 at 11:58:07AM -0500, Mario Limonciello wrote:
> > Dell systems with Type-C ports have support for a persistent system
> > specific MAC address when used with Dell Type-C docks and dongles.
> > This means a dock plugged into two different systems will show different
> > (but persistent) MAC addresses.  Dell Type-C docks and dongles use the
> > r8152 driver.
> >
> > This information for the system's persistent MAC address is burned in when
> > the HW is built and available under _SB\AMAC in the DSDT at runtime.
> >
> > More information about the technology is available here:
> > http://www.dell.com/support/article/us/en/04/SLN301147
> >
> > Signed-off-by: Mario Limonciello 
> > ---
> >  drivers/net/usb/r8152.c | 53 +
> >  1 file changed, 53 insertions(+)
> >
> > diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> > index 3f9f6ed..6dea542 100644
> > --- a/drivers/net/usb/r8152.c
> > +++ b/drivers/net/usb/r8152.c
> > @@ -26,6 +26,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >
> >  /* Information for net-next */
> >  #define NETNEXT_VERSION"08"
> > @@ -500,6 +502,7 @@ enum rtl8152_flags {
> > SELECTIVE_SUSPEND,
> > PHY_RESET,
> > SCHEDULE_NAPI,
> > +   MAC_PASSTHRU = 0,
> 
> Does setting that to 0 really work?  You just did this for two enum
> values, what is the compiler supposed to do?

Very silly of me.  I was rushing to send a v2.  
I'm surprised this worked.  Shouldn't be assigned to anything.
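
For anyone wondering why it compiled at all: C allows two enumerators to share a value, so the explicit "= 0" silently aliases the first flag. A small sketch (FIRST_FLAG is a hypothetical placeholder for the elided first entry of the real enum; the FIXED_ names and set_flag() are made up too):

```c
#include <assert.h>

/* As posted: the explicit initializer restarts numbering at 0. */
enum flags_as_posted {
	FIRST_FLAG = 0,		/* hypothetical placeholder */
	SELECTIVE_SUSPEND,
	PHY_RESET,
	SCHEDULE_NAPI,
	MAC_PASSTHRU = 0,	/* legal C, but aliases FIRST_FLAG */
};

/* Without the initializer the new flag takes the next free value. */
enum flags_fixed {
	FIXED_FIRST_FLAG = 0,
	FIXED_SELECTIVE_SUSPEND,
	FIXED_PHY_RESET,
	FIXED_SCHEDULE_NAPI,
	FIXED_MAC_PASSTHRU,
};

/* Userspace stand-in for set_bit() on a single word. */
unsigned long set_flag(unsigned long word, int bit)
{
	assert(bit >= 0 && bit < 32);
	return word | (1UL << bit);
}
```

With the posted enum, setting MAC_PASSTHRU sets the same bit as the first flag, so it "works" only in the sense that nothing stops it.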

> 
> >  };
> >
> >  /* Define these values to match your device */
> > @@ -653,6 +656,7 @@ enum tx_csum_stat {
> >   */
> >  static const int multicast_filter_limit = 32;
> >  static unsigned int agg_buf_sz = 16384;
> > +static bool mac_passthru_active;
> 
> very generic name for a platform-specific feature :(

Once this is broken up into an x86 platform provided method I'll rename this 
to platform_mac_active (or something similar).

> 
> 
> >
> >  #define RTL_LIMITED_TSO_SIZE   (agg_buf_sz - sizeof(struct tx_desc) - \
> >  VLAN_ETH_HLEN - VLAN_HLEN)
> > @@ -1030,6 +1034,49 @@ out1:
> > return ret;
> >  }
> >
> > +static int get_auxiliary_addr(struct r8152 *tp, struct sockaddr *sa)
> 
> What about the platform mac address api that was pointed out?

I mentioned this in the cover letter - I haven't gotten a chance to move it
over there yet.  I sent v2 before I did so that you can see what I've been
doing as it was relevant to your other comments.

> 
> > +{
> > +   acpi_status status;
> > +   struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
> > +   union acpi_object *obj;
> > +   int ret = -1;
> > +   unsigned char buf[6];
> > +
> > +   if (!dmi_name_in_vendors("Dell Inc.") || mac_passthru_active)
> > +   return -1;
> 
> Don't make up random error values, please use "real" ones.

OK.

> 
> And you want to check this for all Dell devices?  Please be model
> specific, I doubt a bunch of Dell servers wants to run this code...
> 

Tracking model specific is really going to turn into a giant never ending
list.
To drill down more specifically, I can match on chassis too.

> > +
> > +   /* returns _AUXMAC_#AABBCCDDEEFF# */
> > +   status = acpi_evaluate_object(NULL, "\\_SB.AMAC", NULL, &buffer);
> > +   obj = (union acpi_object *)buffer.pointer;
> > +   if (ACPI_SUCCESS(status)) {
> > +   if (obj->type != ACPI_TYPE_BUFFER ||
> > +   obj->string.length != 0x17) {
> > +   pr_warn("r8152: get_auxiliary_addr: Invalid buffer");
> > +   goto amacout;
> > +   }
> > +   if (strncmp(obj->string.pointer, "_AUXMAC_#", 9) != 0) {
> > +   pr_warn("r8152: get_auxiliary_addr: Invalid header");
> > +   goto amacout;
> > +   }
> > +   ret = hex2bin(buf, obj->string.pointer + 9, 6);
> > +   if (ret < 0) {
> > +   pr_warn("r8152: get_auxiliary_addr: Invalid MAC");
> > +   goto amacout;
> > +   }
> > +   memcpy(sa->sa_data, buf, 6);
> > +   ether_addr_copy(tp->netdev->dev_addr, sa->sa_data);
> > +   netdev_info(tp->netdev, "Using system MAC address %pM\n",
> > +   sa->sa_data);
> > +   set_bit(MAC_PASSTHRU, &tp->flags);
> > +   mac_passthru_active = true;
> > +   ret = 1;
> 
> 1 is not a "all is good" return value.

OK will switch
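
To illustrate the two review points about return values, here is a userspace sketch of parsing the "_AUXMAC_#AABBCCDDEEFF#" string that returns 0 on success and -EINVAL on malformed input. parse_auxmac() and hex_val() are hypothetical helpers, not the driver code. The sketch treats the input as a 22-character NUL-terminated string; how that relates to the 0x17 length check in the patch is an assumption on my side.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define AUXMAC_PREFIX "_AUXMAC_#"
#define MAC_LEN 6

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Returns 0 and fills mac[] on success, -EINVAL on malformed input. */
int parse_auxmac(const char *s, unsigned char mac[MAC_LEN])
{
	const size_t plen = strlen(AUXMAC_PREFIX);
	unsigned char tmp[MAC_LEN];
	size_t i;

	assert(s != NULL && mac != NULL);
	if (strlen(s) != plen + 2 * MAC_LEN + 1)
		return -EINVAL;
	if (strncmp(s, AUXMAC_PREFIX, plen) != 0)
		return -EINVAL;
	if (s[plen + 2 * MAC_LEN] != '#')
		return -EINVAL;
	for (i = 0; i < MAC_LEN; i++) {
		int hi = hex_val(s[plen + 2 * i]);
		int lo = hex_val(s[plen + 2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -EINVAL;
		tmp[i] = (unsigned char)((hi << 4) | lo);
	}
	memcpy(mac, tmp, MAC_LEN);	/* only touch the output on success */
	return 0;
}
```

Callers can then test for "ret < 0" in the usual way instead of comparing against 1 or -1.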

RE: [PATCH] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario_Limonciello

> -Original Message-
> From: Bjørn Mork [mailto:bj...@mork.no]
> Sent: Thursday, June 2, 2016 1:04 PM
> To: Limonciello, Mario 
> Cc: gre...@linuxfoundation.org; hayesw...@realtek.com; linux-
> ker...@vger.kernel.org; netdev@vger.kernel.org; linux-
> u...@vger.kernel.org; pali.ro...@gmail.com; anthony.w...@canonical.com
> Subject: Re: [PATCH] r8152: Add support for setting MAC to system's
> Auxiliary MAC address
> 
>  writes:
> 
> >> > 2) Track whether this is the first or second USB NIC plugged in.  Only
> >> > offer it on the first NIC detected by r8152.  When the second NIC is
> >> > plugged in don't match from ACPI.
> >> > There would be a question of what to do if the first NIC is removed and
> >> > added back if it should get the persistent system MAC or not.
> >> > I'd say yes, just make sure that only one NIC can have it at a time.
> >>
> >> You are going to get things very complex very quickly if you try to do
> >> this.
> >
> > It's really not that hard, track a module wide static variable whether
> > the feature is in use.  Track in each device whether the feature was
> > in use.  If it in use, don't assign the next device plugged in via the
> > ACPI string.  If a device is removed that has the feature activated,
> > change the module wide static variable.
> 
> Having the mac address jump around in an arbitrary way like this is
> going to confuse the hell out of your users.  Consider what happens if
> the user docks a laptop with an r8152 usb dongle already plugged in...
> How are you going to explain that the dock gets some other mac address
> in this case? How are you going to explain the difference between using
> an r8152 based dongle and some other ethernet usb dongle with your
> systems?

Yeah I understand the concern.  I agree that would be very confusing
to a user.  This does need to match only on Dell docks then.

> 
> Make it behave consistently if you're going to add this.  Which can be
> done by specifically matching the Dell dock (doesn't it have an unique
> Dell device ID?) and ignoring any other r8152 device.  You could also
> choose to set the same mac for all r8152 devices.  Which is fine, but
> will probably confuse many users.

Unfortunately there is no Dell specific VID/PID.  I checked a no-name dongle
that used r8152 and it was the same (0bda:8153).  Maybe Hayes Wang can 
check with his Windows driver colleagues if there was anything else to key
off when this was implemented on the Windows Realtek driver.  If there 
is something else to key off of, I'm not aware what it is.  I'll check with 
some of my colleagues too.

I do have a way to query if a dock is plugged in via SMM, but I doubt that's
what Realtek is using on the Windows side.  I'd leave that as a second to
last resort (last resort being move back to userspace again).

> 
> What you definitely should not do is to change the mac for some
> arbitrary "first" device.  Then you are better off with the userspace
> proposal where you and your users have some chance to implement a
> sensible policy based on e.g. usb port numbers.

OK, if I can't come up with a way to key on the device being a Dell dock 
I'll scrap this entirely kernel approach.
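
For reference, the module-wide flag scheme quoted above boils down to something like the following userspace sketch (struct nic, nic_attach() and nic_detach() are hypothetical names, not r8152 code). It also shows the objection: which device ends up with the system MAC depends purely on plug order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Module-wide state: is the system (AUX) MAC currently handed out? */
bool aux_mac_in_use;

/* Per-device state: did this device get the system MAC? */
struct nic {
	bool has_aux_mac;
};

/* First device to attach while the MAC is free gets it. */
void nic_attach(struct nic *n)
{
	assert(n != NULL);
	n->has_aux_mac = !aux_mac_in_use;
	if (n->has_aux_mac)
		aux_mac_in_use = true;
}

/* Removing the device that holds the MAC frees it again. */
void nic_detach(struct nic *n)
{
	assert(n != NULL);
	if (n->has_aux_mac) {
		n->has_aux_mac = false;
		aux_mac_in_use = false;
	}
}
```

Attach a dongle and then the dock, and the dongle holds the system MAC; swap the order and the dock holds it. Nothing in the scheme itself tells the two apart.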

Re: [PATCH -next 2/2] virtio_net: Read the advised MTU

2016-06-02 Thread Rick Jones


On 06/02/2016 10:06 AM, Aaron Conole wrote:

Rick Jones  writes:

One of the things I've been doing has been setting-up a cluster
(OpenStack) with JumboFrames, and then setting MTUs on instance vNICs
by hand to measure different MTU sizes.  It would be a shame if such a
thing were not possible in the future.  Keeping a warning if shrinking
the MTU would be good, leave the error (perhaps) to if an attempt is
made to go beyond the advised value.


This was cut because it didn't make sense for such a warning to
be issued, but it seems like perhaps you may want such a feature?  I
agree with Michael, after thinking about it, that I don't know what sort
of use the warning would serve.  After all, if you're changing the MTU,
you must have wanted such a change to occur?


I don't need a warning, was simply willing to live with one when 
shrinking the MTU.  Didn't want an error.


happy benchmarking,

rick jones

[PATCH v3 6/7] sctp: Add GSO support

2016-06-02 Thread Marcelo Ricardo Leitner

SCTP has this peculiarity that its packets cannot be just segmented to
(P)MTU. Its chunks must be contained in IP segments, padding respected.
So we can't just generate a big skb, set gso_size to the fragmentation
point and deliver it to IP layer.

This patch takes a different approach. SCTP will now build a skb as it
would be if it was received using GRO. That is, there will be a cover
skb with protocol headers and children ones containing the actual
segments, already segmented to a way that respects SCTP RFCs.

With that, we can tell skb_segment() to just split based on frag_list,
trusting its sizes are already in accordance.

This way SCTP can benefit from GSO and instead of passing several
packets through the stack, it can pass a single large packet.
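
As a rough illustration of "already segmented in a way that respects SCTP RFCs": chunks are padded to a 4-byte boundary and must stay whole within a segment, so the split points follow chunk boundaries rather than a fixed gso_size. The sketch below is userspace C with made-up names (pad4(), pack_chunks()); it only tracks payload sizes, ignores headers, and does not handle fragmenting an oversized DATA chunk.

```c
#include <assert.h>
#include <stddef.h>

/* SCTP chunks are padded to a multiple of 4 bytes. */
size_t pad4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/*
 * Greedily assigns whole, padded chunks to segments of at most max_seg
 * bytes.  seg_len[] receives the size of each segment.  Returns the
 * number of segments, or -1 if a chunk cannot fit in any segment or
 * seg_len[] is too small.
 */
int pack_chunks(const size_t *chunk_len, size_t nchunks, size_t max_seg,
		size_t *seg_len, size_t max_segs)
{
	size_t nsegs = 0, cur = 0, i;

	assert(chunk_len != NULL && seg_len != NULL);
	for (i = 0; i < nchunks; i++) {
		size_t padded = pad4(chunk_len[i]);

		if (padded > max_seg)
			return -1;	/* would need chunk fragmentation */
		if (cur + padded > max_seg) {
			if (nsegs == max_segs)
				return -1;
			seg_len[nsegs++] = cur;	/* close current segment */
			cur = 0;
		}
		cur += padded;
	}
	if (cur) {
		if (nsegs == max_segs)
			return -1;
		seg_len[nsegs++] = cur;
	}
	return (int)nsegs;
}
```

With chunks of 10, 21, 1000 and 500 bytes and a 1024-byte limit this yields segments of 36, 1000 and 500 bytes, i.e. uneven sizes that a single gso_size could not describe.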

v2:
- Added support for receiving GSO frames, as requested by Dave Miller.
- Clear skb->cb if packet is GSO (otherwise it's not used by SCTP)
- Added heuristics similar to what we have in TCP for not generating
  single GSO packets that fills cwnd.
v3:
- consider sctphdr size in skb_gso_transport_seglen()
- rebased due to 5c7cdf339af5 ("gso: Remove arbitrary checks for
  unsupported GSO")

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 include/linux/netdev_features.h |   7 +-
 include/linux/netdevice.h   |   1 +
 include/linux/skbuff.h  |   2 +
 include/net/sctp/sctp.h |   4 +
 include/net/sctp/structs.h  |   5 +
 net/core/ethtool.c  |   1 +
 net/core/skbuff.c   |   3 +
 net/sctp/Makefile   |   3 +-
 net/sctp/input.c|  12 +-
 net/sctp/inqueue.c  |  51 +-
 net/sctp/offload.c  |  98 +++
 net/sctp/output.c   | 363 +++-
 net/sctp/protocol.c |   3 +
 net/sctp/socket.c   |   2 +
 14 files changed, 429 insertions(+), 126 deletions(-)
 create mode 100644 net/sctp/offload.c

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index aa7b2400f98c584d29e83f0eddf7bf13766cedd1..9c6c8ef2e9e704513cc4272b0a3ee2fec6809d46 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -53,8 +53,9 @@ enum {
 * headers in software.
 */
NETIF_F_GSO_TUNNEL_REMCSUM_BIT, /* ... TUNNEL with TSO & REMCSUM */
+   NETIF_F_GSO_SCTP_BIT,   /* ... SCTP fragmentation */
/**/NETIF_F_GSO_LAST =  /* last bit, see GSO_MASK */
-   NETIF_F_GSO_TUNNEL_REMCSUM_BIT,
+   NETIF_F_GSO_SCTP_BIT,
 
NETIF_F_FCOE_CRC_BIT,   /* FCoE CRC32 */
NETIF_F_SCTP_CRC_BIT,   /* SCTP checksum offload */
@@ -128,6 +129,7 @@ enum {
 #define NETIF_F_TSO_MANGLEID   __NETIF_F(TSO_MANGLEID)
 #define NETIF_F_GSO_PARTIAL __NETIF_F(GSO_PARTIAL)
 #define NETIF_F_GSO_TUNNEL_REMCSUM __NETIF_F(GSO_TUNNEL_REMCSUM)
+#define NETIF_F_GSO_SCTP   __NETIF_F(GSO_SCTP)
 #define NETIF_F_HW_VLAN_STAG_FILTER __NETIF_F(HW_VLAN_STAG_FILTER)
 #define NETIF_F_HW_VLAN_STAG_RX__NETIF_F(HW_VLAN_STAG_RX)
 #define NETIF_F_HW_VLAN_STAG_TX__NETIF_F(HW_VLAN_STAG_TX)
@@ -166,7 +168,8 @@ enum {
 NETIF_F_FSO)
 
 /* List of features with software fallbacks. */
-#define NETIF_F_GSO_SOFTWARE   (NETIF_F_ALL_TSO | NETIF_F_UFO)
+#define NETIF_F_GSO_SOFTWARE   (NETIF_F_ALL_TSO | NETIF_F_UFO | \
+NETIF_F_GSO_SCTP)
 
 /*
  * If one device supports one of these features, then enable them
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f45929ce815725d868261e9a2585ac53d0c8f128..fa6df2699532e4ad6deb37f1bdcfafc71d2580cb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4012,6 +4012,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
BUILD_BUG_ON(SKB_GSO_UDP_TUNNEL_CSUM != (NETIF_F_GSO_UDP_TUNNEL_CSUM >> NETIF_F_GSO_SHIFT));
BUILD_BUG_ON(SKB_GSO_PARTIAL != (NETIF_F_GSO_PARTIAL >> NETIF_F_GSO_SHIFT));
BUILD_BUG_ON(SKB_GSO_TUNNEL_REMCSUM != (NETIF_F_GSO_TUNNEL_REMCSUM >> NETIF_F_GSO_SHIFT));
+   BUILD_BUG_ON(SKB_GSO_SCTP != (NETIF_F_GSO_SCTP >> NETIF_F_GSO_SHIFT));
 
return (features & feature) == feature;
 }
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index aa3f9d7e8d5ca455387efa22d4a9d3a079a56f0c..dc0fca747c5e1c5b23b1e52ce3e354667eb2a994 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -487,6 +487,8 @@ enum {
SKB_GSO_PARTIAL = 1 << 13,
 
SKB_GSO_TUNNEL_REMCSUM = 1 << 14,
+
+   SKB_GSO_SCTP = 1 << 15,
 };
 
 #if BITS_PER_LONG > 32
diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index b392ac8382f2bf0be118f797acc0eb4ddeb5..632e205ca54bfe85124753e09445251056e19aa7 100644
--- a/include/net/sctp/sctp.h
+++

[PATCH v3 5/7] sctp: delay as much as possible skb_linearize

2016-06-02 Thread Marcelo Ricardo Leitner

This patch is a preparation for the GSO one. In order to successfully
handle GSO packets on rx path we must not call skb_linearize, otherwise
it defeats any gain GSO may have had.

This patch thus delays the call to skb_linearize as much as possible,
leaving it until sctp_inq_pop(). To make that possible, the sanity checks
performed before then now know how to deal with fragments.

One positive side-effect of this is that if the socket is backlogged it
will have the chance of doing it on backlog processing instead of
during softirq.

With this move, it's evident that a check for non-linearity in
sctp_inq_pop was ineffective and is now removed. Note that a similar
check is performed a bit below this one.
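
As a rough user-space illustration of the offset-based chunk walk that
sctp_rcv_ootb() switches to, here is a minimal sketch. It assumes a flat
byte buffer instead of an skb, uses memcpy() where the patch uses
skb_header_pointer(), and struct chunk_hdr, word_round() and
count_chunks() are hypothetical names, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* ntohs */

/* Hypothetical stand-in for sctp_chunkhdr_t: type, flags, 16-bit length. */
struct chunk_hdr {
	uint8_t  type;
	uint8_t  flags;
	uint16_t length;	/* network byte order, includes this header */
};

/* Round up to a multiple of 4, the role WORD_ROUND plays in the patch. */
static size_t word_round(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Count the well-formed chunks in buf[0..len). Each header is copied out
 * by offset, so the walk never reads past the end of the data even when
 * a length field is bogus.
 */
static int count_chunks(const uint8_t *buf, size_t len)
{
	size_t offset = 0, ch_end, clen;
	struct chunk_hdr ch;
	int n = 0;

	do {
		/* Make sure at least a full header is available. */
		if (offset + sizeof(ch) > len)
			break;
		memcpy(&ch, buf + offset, sizeof(ch));

		/* A chunk shorter than its own header is malformed. */
		clen = ntohs(ch.length);
		if (clen < sizeof(ch))
			break;

		ch_end = offset + word_round(clen);
		if (ch_end > len)
			break;

		n++;
		offset = ch_end;
	} while (ch_end < len);

	return n;
}
```

Because only offsets are compared against the total length, the same loop
shape works whether or not the underlying data is linear, which is the
property the patch relies on.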

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 net/sctp/input.c   | 45 +
 net/sctp/inqueue.c | 29 ++---
 2 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index 
a701527a9480faff1b8d91257e1dbf3c0f09ed68..5cff2546c3dd6d3823b5a28bac1e72880cd57756
 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -112,7 +112,6 @@ int sctp_rcv(struct sk_buff *skb)
struct sctp_ep_common *rcvr;
struct sctp_transport *transport = NULL;
struct sctp_chunk *chunk;
-   struct sctphdr *sh;
union sctp_addr src;
union sctp_addr dest;
int family;
@@ -124,15 +123,18 @@ int sctp_rcv(struct sk_buff *skb)
 
__SCTP_INC_STATS(net, SCTP_MIB_INSCTPPACKS);
 
-   if (skb_linearize(skb))
+   /* If packet is too small to contain a single chunk, let's not
+* waste time on it anymore.
+*/
+   if (skb->len < sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr) +
+  skb_transport_offset(skb))
goto discard_it;
 
-   sh = sctp_hdr(skb);
+   if (!pskb_may_pull(skb, sizeof(struct sctphdr)))
+   goto discard_it;
 
-   /* Pull up the IP and SCTP headers. */
+   /* Pull up the IP header. */
__skb_pull(skb, skb_transport_offset(skb));
-   if (skb->len < sizeof(struct sctphdr))
-   goto discard_it;
 
skb->csum_valid = 0; /* Previous value not applicable */
if (skb_csum_unnecessary(skb))
@@ -141,11 +143,7 @@ int sctp_rcv(struct sk_buff *skb)
goto discard_it;
skb->csum_valid = 1;
 
-   skb_pull(skb, sizeof(struct sctphdr));
-
-   /* Make sure we at least have chunk headers worth of data left. */
-   if (skb->len < sizeof(struct sctp_chunkhdr))
-   goto discard_it;
+   __skb_pull(skb, sizeof(struct sctphdr));
 
family = ipver2af(ip_hdr(skb)->version);
af = sctp_get_af_specific(family);
@@ -230,7 +228,7 @@ int sctp_rcv(struct sk_buff *skb)
chunk->rcvr = rcvr;
 
/* Remember the SCTP header. */
-   chunk->sctp_hdr = sh;
+   chunk->sctp_hdr = sctp_hdr(skb);
 
/* Set the source and destination addresses of the incoming chunk.  */
sctp_init_addrs(chunk, , );
@@ -660,19 +658,23 @@ out_unlock:
  */
 static int sctp_rcv_ootb(struct sk_buff *skb)
 {
-   sctp_chunkhdr_t *ch;
-   __u8 *ch_end;
-
-   ch = (sctp_chunkhdr_t *) skb->data;
+   sctp_chunkhdr_t *ch, _ch;
+   int ch_end, offset = 0;
 
/* Scan through all the chunks in the packet.  */
do {
+   /* Make sure we have at least the header there */
+   if (offset + sizeof(sctp_chunkhdr_t) > skb->len)
+   break;
+
+   ch = skb_header_pointer(skb, offset, sizeof(*ch), &_ch);
+
/* Break out if chunk length is less then minimal. */
if (ntohs(ch->length) < sizeof(sctp_chunkhdr_t))
break;
 
-   ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
-   if (ch_end > skb_tail_pointer(skb))
+   ch_end = offset + WORD_ROUND(ntohs(ch->length));
+   if (ch_end > skb->len)
break;
 
/* RFC 8.4, 2) If the OOTB packet contains an ABORT chunk, the
@@ -697,8 +699,8 @@ static int sctp_rcv_ootb(struct sk_buff *skb)
if (SCTP_CID_INIT == ch->type && (void *)ch != skb->data)
goto discard;
 
-   ch = (sctp_chunkhdr_t *) ch_end;
-   } while (ch_end < skb_tail_pointer(skb));
+   offset = ch_end;
+   } while (ch_end < skb->len);
 
return 0;
 
@@ -1173,6 +1175,9 @@ static struct sctp_association 
*__sctp_rcv_lookup_harder(struct net *net,
 {
sctp_chunkhdr_t *ch;
 
+   if (skb_linearize(skb))
+   return NULL;
+
ch = (sctp_chunkhdr_t *) skb->data;
 
/* The code below will attempt to walk the chunk and extract
diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c
index

[PATCH v3 7/7] sctp: improve debug message to also log curr pkt and new chunk size

2016-06-02 Thread Marcelo Ricardo Leitner

This is useful for debugging packet sizes.

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 net/sctp/output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 
60499a69179d255c47da1fa19b73147917a050bf..90d2e125c2f5e0e1ecb33a7eab10772e5b39567c
 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -182,7 +182,8 @@ sctp_xmit_t sctp_packet_transmit_chunk(struct sctp_packet 
*packet,
sctp_xmit_t retval;
int error = 0;
 
-   pr_debug("%s: packet:%p chunk:%p\n", __func__, packet, chunk);
+   pr_debug("%s: packet:%p size:%lu chunk:%p size:%d\n", __func__,
+packet, packet->size, chunk, chunk->skb ? chunk->skb->len : 
-1);
 
switch ((retval = (sctp_packet_append_chunk(packet, chunk {
case SCTP_XMIT_PMTU_FULL:
-- 
2.5.5

[PATCH v3 3/7] sk_buff: allow segmenting based on frag sizes

2016-06-02 Thread Marcelo Ricardo Leitner

This patch allows segmenting an skb based on the sizes of its frags
instead of on a fixed value.
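
The length selection this adds to skb_segment() can be sketched in user
space as below. This is only an illustration: SEG_BY_FRAGS and
next_seg_len() are hypothetical names, and the sentinel value is chosen
for the example rather than taken from the patch.

```c
#include <stddef.h>

/* Hypothetical sentinel playing the role of GSO_BY_FRAGS; the value
 * here is picked for illustration only.
 */
#define SEG_BY_FRAGS 0xFFFFu

/*
 * Pick the payload length of the next segment.
 *   remaining: bytes of the original packet not yet segmented
 *   mss:       fixed segment size, or SEG_BY_FRAGS
 *   frag_len:  length of the current pre-built fragment
 */
static size_t next_seg_len(size_t remaining, unsigned int mss, size_t frag_len)
{
	if (mss == SEG_BY_FRAGS)
		return frag_len;	/* keep the sender's own boundaries */

	return remaining > mss ? mss : remaining;
}
```

With a fixed mss the result is capped at mss as before; with the sentinel
each segment simply takes the size of the fragment it was built from.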

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 include/linux/skbuff.h |  5 +
 net/core/skbuff.c  | 10 +++---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 
ee38a41274759f279be1c0752a7fab63fac517c8..329a0a9ef67115cae03b7c1304de031116384148
 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -301,6 +301,11 @@ struct sk_buff;
 #endif
 extern int sysctl_max_skb_frags;
 
+/* Set skb_shinfo(skb)->gso_size to this in case you want skb_segment to
+ * segment using its current segmentation instead.
+ */
+#define GSO_BY_FRAGS   0x
+
 typedef struct skb_frag_struct skb_frag_t;
 
 struct skb_frag_struct {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 
4724bcf9b0cae1cecbe5bc2c04e308bb70b3232a..97c32c75e704af1f31b064e8f1e0475ff1505d67
 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3116,9 +3116,13 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
int hsize;
int size;
 
-   len = head_skb->len - offset;
-   if (len > mss)
-   len = mss;
+   if (unlikely(mss == GSO_BY_FRAGS)) {
+   len = list_skb->len;
+   } else {
+   len = head_skb->len - offset;
+   if (len > mss)
+   len = mss;
+   }
 
hsize = skb_headlen(head_skb) - offset;
if (hsize < 0)
-- 
2.5.5

[PATCH v3 2/7] skbuff: export skb_gro_receive

2016-06-02 Thread Marcelo Ricardo Leitner

sctp GSO requires it and sctp can be compiled as a module, so we need to
export this function.

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 net/core/skbuff.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 
f2b77e549c03a771909cd9c87c40ec2b7826cd31..4724bcf9b0cae1cecbe5bc2c04e308bb70b3232a
 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3438,6 +3438,7 @@ done:
NAPI_GRO_CB(skb)->same_flow = 1;
return 0;
 }
+EXPORT_SYMBOL_GPL(skb_gro_receive);
 
 void __init skb_init(void)
 {
-- 
2.5.5

[PATCH v3 4/7] skbuff: introduce skb_gso_validate_mtu

2016-06-02 Thread Marcelo Ricardo Leitner

skb_gso_network_seglen is not enough for checking fragment sizes if the
skb is using GSO_BY_FRAGS, as we have to check frag by frag.

This patch introduces skb_gso_validate_mtu, based on the former, which
wraps that use case, since all calls to skb_gso_network_seglen were there
to validate whether the skb fits a given MTU, and improves the check.
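
A minimal user-space sketch of the per-fragment check, assuming the
fragment lengths are available as a plain array. frags_fit_mtu() and its
parameters are hypothetical stand-ins for walking the skb's frag list:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if every fragment, once prefixed with hdr_len bytes of
 * headers, fits within mtu. An empty list trivially fits.
 */
static bool frags_fit_mtu(size_t hdr_len, const size_t *frag_len,
			  size_t nfrags, size_t mtu)
{
	size_t i;

	for (i = 0; i < nfrags; i++) {
		if (hdr_len + frag_len[i] > mtu)
			return false;
	}
	return true;
}
```

A single oversized fragment is enough to fail the check, which is why one
segment length for the whole skb cannot answer the question when
fragments have different sizes.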

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 include/linux/skbuff.h |  1 +
 net/core/skbuff.c  | 31 +++
 net/ipv4/ip_forward.c  |  2 +-
 net/ipv4/ip_output.c   |  2 +-
 net/ipv6/ip6_output.c  |  2 +-
 net/mpls/af_mpls.c |  2 +-
 6 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 
329a0a9ef67115cae03b7c1304de031116384148..aa3f9d7e8d5ca455387efa22d4a9d3a079a56f0c
 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2992,6 +2992,7 @@ void skb_split(struct sk_buff *skb, struct sk_buff *skb1, 
const u32 len);
 int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen);
 void skb_scrub_packet(struct sk_buff *skb, bool xnet);
 unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
+bool skb_gso_validate_mtu(const struct sk_buff *skb, unsigned int mtu);
 struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
 struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
 int skb_ensure_writable(struct sk_buff *skb, int write_len);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 
97c32c75e704af1f31b064e8f1e0475ff1505d67..5ca562b56ec39d39e1225d96547e242732518ffe
 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4392,6 +4392,37 @@ unsigned int skb_gso_transport_seglen(const struct 
sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(skb_gso_transport_seglen);
 
+/**
+ * skb_gso_validate_mtu - Return in case such skb fits a given MTU
+ *
+ * @skb: GSO skb
+ *
+ * skb_gso_validate_mtu validates if a given skb will fit a wanted MTU
+ * once split.
+ */
+bool skb_gso_validate_mtu(const struct sk_buff *skb, unsigned int mtu)
+{
+   const struct skb_shared_info *shinfo = skb_shinfo(skb);
+   const struct sk_buff *iter;
+   unsigned int hlen;
+
+   hlen = skb_gso_network_seglen(skb);
+
+   if (shinfo->gso_size != GSO_BY_FRAGS)
+   return hlen <= mtu;
+
+   /* Undo this so we can re-use header sizes */
+   hlen -= GSO_BY_FRAGS;
+
+   skb_walk_frags(skb, iter) {
+   if (hlen + skb_headlen(iter) > mtu)
+   return false;
+   }
+
+   return true;
+}
+EXPORT_SYMBOL_GPL(skb_gso_validate_mtu);
+
 static struct sk_buff *skb_reorder_vlan_header(struct sk_buff *skb)
 {
if (skb_cow(skb, skb_headroom(skb)) < 0) {
diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
index 
cbfb1808fcc490b94dc0bbdab6142acb8fa37815..9f0a7b96646f368021d9cd51bc3f728ba49eed0d
 100644
--- a/net/ipv4/ip_forward.c
+++ b/net/ipv4/ip_forward.c
@@ -54,7 +54,7 @@ static bool ip_exceeds_mtu(const struct sk_buff *skb, 
unsigned int mtu)
if (skb->ignore_df)
return false;
 
-   if (skb_is_gso(skb) && skb_gso_network_seglen(skb) <= mtu)
+   if (skb_is_gso(skb) && skb_gso_validate_mtu(skb, mtu))
return false;
 
return true;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 
124bf0a663283502deb03397343160d493a378b1..cbac493c913ac37b57a97314f9e7099b14b8246c
 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -225,7 +225,7 @@ static int ip_finish_output_gso(struct net *net, struct 
sock *sk,
 
/* common case: locally created skb or seglen is <= mtu */
if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
- skb_gso_network_seglen(skb) <= mtu)
+ skb_gso_validate_mtu(skb, mtu))
return ip_finish_output2(net, sk, skb);
 
/* Slowpath -  GSO segment length is exceeding the dst MTU.
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 
cbf127ae7c676650cc626cbf12cd61b6b570ea43..6b2f60a5c1de3063bb65c07b2b77c13f33890af8
 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -368,7 +368,7 @@ static bool ip6_pkt_too_big(const struct sk_buff *skb, 
unsigned int mtu)
if (skb->ignore_df)
return false;
 
-   if (skb_is_gso(skb) && skb_gso_network_seglen(skb) <= mtu)
+   if (skb_is_gso(skb) && skb_gso_validate_mtu(skb, mtu))
return false;
 
return true;
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 
0b80a7140cc494d8c39bd3efba2423272d1b8844..7a4aa3450dd71039e73516bd711ba7392493eb5e
 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -91,7 +91,7 @@ bool mpls_pkt_too_big(const struct sk_buff *skb, unsigned int 
mtu)
if (skb->len <= mtu)
return false;
 
-   if (skb_is_gso(skb) && skb_gso_network_seglen(skb) <= mtu)
+   if (skb_is_gso(skb) && skb_gso_validate_mtu(skb,

[PATCH v3 0/7] sctp: Add GSO support

2016-06-02 Thread Marcelo Ricardo Leitner

This patchset adds sctp GSO support.

Performance tests indicate that it increases throughput by 10% when using
bigger chunk sizes, especially if bigger than the MTU. For small chunks, it
doesn't help much unless heavy firewall rules are in use.

For small chunks it will probably be of more use once we get something
like MSG_MORE as David Laight had suggested.

overall changes:
v1->v2:
Added support for receiving GSO frames on SCTP stack, as requested by
Dave Miller.

v2->v3:
Consider sctphdr size in skb_gso_transport_seglen()
rebased due to 5c7cdf339af5 ("gso: Remove arbitrary checks for
unsupported GSO")

Marcelo Ricardo Leitner (7):
  loopback: make use of NETIF_F_GSO_SOFTWARE
  skbuff: export skb_gro_receive
  sk_buff: allow segmenting based on frag sizes
  skbuff: introduce skb_gso_validate_mtu
  sctp: delay as much as possible skb_linearize
  sctp: Add GSO support
  sctp: improve debug message to also log curr pkt and new chunk size

 drivers/net/loopback.c  |   5 +-
 include/linux/netdev_features.h |   7 +-
 include/linux/netdevice.h   |   1 +
 include/linux/skbuff.h  |   8 +
 include/net/sctp/sctp.h |   4 +
 include/net/sctp/structs.h  |   5 +
 net/core/ethtool.c  |   1 +
 net/core/skbuff.c   |  45 -
 net/ipv4/ip_forward.c   |   2 +-
 net/ipv4/ip_output.c|   2 +-
 net/ipv6/ip6_output.c   |   2 +-
 net/mpls/af_mpls.c  |   2 +-
 net/sctp/Makefile   |   3 +-
 net/sctp/input.c|  57 ---
 net/sctp/inqueue.c  |  78 +++--
 net/sctp/offload.c  |  98 +++
 net/sctp/output.c   | 366 +++-
 net/sctp/protocol.c |   3 +
 net/sctp/socket.c   |   2 +
 19 files changed, 524 insertions(+), 167 deletions(-)
 create mode 100644 net/sctp/offload.c

-- 
2.5.5

[PATCH v3 1/7] loopback: make use of NETIF_F_GSO_SOFTWARE

2016-06-02 Thread Marcelo Ricardo Leitner

NETIF_F_GSO_SOFTWARE was defined to list all GSO software types, so let's
make use of it in loopback code. Note that veth/vxlan/others already
use it.

Within this patch series, this patch causes lo to pick up the SCTP GSO
feature automatically (as it's added to NETIF_F_GSO_SOFTWARE) and thus
avoid segmentation if possible.
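
The effect of using the shared mask can be sketched in a few lines of
user-space C. The bit values and all F_* names below are illustrative
only and are not the kernel's real definitions:

```c
#include <stdint.h>

/* Illustrative bit values only; these are not the kernel's real ones. */
#define F_TSO       (UINT64_C(1) << 0)
#define F_UFO       (UINT64_C(1) << 1)
#define F_GSO_SCTP  (UINT64_C(1) << 2)
#define F_SG        (UINT64_C(1) << 3)

/* One shared list of features with software fallbacks. */
#define F_GSO_SOFTWARE (F_TSO | F_UFO | F_GSO_SCTP)

/* A device that names the shared mask instead of listing its members. */
static uint64_t loopback_features(void)
{
	return F_SG | F_GSO_SOFTWARE;
}
```

Any bit later added to the shared mask shows up in the device's features
without touching the device code again.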

Signed-off-by: Marcelo Ricardo Leitner 
Tested-by: Xin Long 
---
 drivers/net/loopback.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 
a400288cb37b9bfb6190f1bd7c64d02e97713956..6255973e3dda35fd41464ce51f0f9fb9f0b8364b
 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -169,10 +169,9 @@ static void loopback_setup(struct net_device *dev)
dev->flags  = IFF_LOOPBACK;
dev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_NO_QUEUE;
netif_keep_dst(dev);
-   dev->hw_features= NETIF_F_ALL_TSO | NETIF_F_UFO;
+   dev->hw_features= NETIF_F_GSO_SOFTWARE;
dev->features   = NETIF_F_SG | NETIF_F_FRAGLIST
-   | NETIF_F_ALL_TSO
-   | NETIF_F_UFO
+   | NETIF_F_GSO_SOFTWARE
| NETIF_F_HW_CSUM
| NETIF_F_RXCSUM
| NETIF_F_SCTP_CRC
-- 
2.5.5

Re: [PATCH] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Bjørn Mork

 writes:

>> > 2) Track whether this is the first or second USB NIC plugged in.  Only 
>> > offer it
>> on the first NIC detected by r8152.  When the second NIC is plugged in don't
>> match from ACPI.
>> > There would be a question of what to do if the first NIC is removed and
>> added back if it should get the persistent system MAC or not.
>> > I'd say yes, just make sure that only one NIC can have it at a time.
>> 
>> You are going to get things very complex very quickly if you try to do this.
>
> It's really not that hard: track in a module-wide static variable whether
> the feature is in use.  Track in each device whether the feature was
> in use.  If it is in use, don't assign the next device plugged in via the
> ACPI string.  If a device is removed that has the feature activated,
> change the module-wide static variable.

Having the mac address jump around in an arbitrary way like this is
going to confuse the hell out of your users.  Consider what happens if
the user docks a laptop with an r8152 usb dongle already plugged in...
How are you going to explain that the dock gets some other mac address
in this case? How are you going to explain the difference between using
an r8152 based dongle and some other ethernet usb dongle with your
systems?

Make it behave consistently if you're going to add this.  Which can be
done by specifically matching the Dell dock (doesn't it have an unique
Dell device ID?) and ignoring any other r8152 device.  You could also
choose to set the same mac for all r8152 devices.  Which is fine, but
will probably confuse many users.

What you definitely should not do is to change the mac for some
arbitrary "first" device.  Then you are better off with the userspace
proposal where you and your users have some chance to implement a
sensible policy based on e.g. usb port numbers.

Bjørn

Re: [PATCH -next 2/2] virtio_net: Read the advised MTU

2016-06-02 Thread Aaron Conole


kbuild test robot <l...@intel.com> writes:

> Hi,
>
> [auto build test ERROR on next-20160602]
>
> url:
> https://github.com/0day-ci/linux/commits/Aaron-Conole/virtio-net-Advised-MTU-feature/20160603-000714
> config: i386-allmodconfig (attached as .config)
> compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386 
>
> Note: the 
> linux-review/Aaron-Conole/virtio-net-Advised-MTU-feature/20160603-000714 HEAD 
> d909da4df3c52f78b4f5fcccd89aea5e38722d10 builds fine.
>   It only hurts bisectibility.
>
> All errors (new ones prefixed by >>):
>
>drivers/net/virtio_net.c: In function 'virtnet_probe':
>>> drivers/net/virtio_net.c:1899:31: error: 'VIRTIO_NET_F_MTU'
>>> undeclared (first use in this function)
>  if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
>   ^~~~
>drivers/net/virtio_net.c:1899:31: note: each undeclared identifier is 
> reported only once for each function it appears in
>drivers/net/virtio_net.c: At top level:
>>> drivers/net/virtio_net.c:2076:2: error: 'VIRTIO_NET_F_MTU'
>>> undeclared here (not in a function)
>  VIRTIO_NET_F_MTU,
>  ^~~~

Oops, a hunk was dropped during rebase.  Sorry for this; v2 will fix this
error, and I'll also do a boot test before submission.

Thanks kbuild robot!

> vim +/VIRTIO_NET_F_MTU +1899 drivers/net/virtio_net.c
>
>   1893virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
>   1894vi->any_header_sg = true;
>   1895
>   1896if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
>   1897vi->has_cvq = true;
>   1898
>> 1899 if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
>   1900dev->mtu = virtio_cread16(vdev,
>   1901 offsetof(struct virtio_net_config,
>   1902   mtu));
>   1903}
>   1904
>   1905if (vi->any_header_sg)
>   1906dev->needed_headroom = vi->hdr_len;
>   1907
>   1908/* Use single tx/rx queue pair as default */
>   1909vi->curr_queue_pairs = 1;
>   1910vi->max_queue_pairs = max_queue_pairs;
>   1911
>   1912/* Allocate/initialize the rx/tx queues, and invoke 
> find_vqs */
>   1913err = init_vqs(vi);
>   1914if (err)
>   1915goto free_stats;
>   1916
>   1917#ifdef CONFIG_SYSFS
>   1918if (vi->mergeable_rx_bufs)
>   1919dev->sysfs_rx_queue_group = 
> _net_mrg_rx_group;
>   1920#endif
>   1921netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
>   1922netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
>   1923
>   1924virtnet_init_settings(dev);
>   1925
>   1926err = register_netdev(dev);
>   1927if (err) {
>   1928pr_debug("virtio_net: registering device 
> failed\n");
>   1929goto free_vqs;
>   1930}
>   1931
>   1932virtio_device_ready(vdev);
>   1933
>   1934vi->nb.notifier_call = _cpu_callback;
>   1935err = register_hotcpu_notifier(>nb);
>   1936if (err) {
>   1937pr_debug("virtio_net: registering cpu notifier 
> failed\n");
>   1938goto free_unregister_netdev;
>   1939}
>   1940
>   1941/* Assume link up if device can't report link status,
>   1942   otherwise get link status from config. */
>   1943if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
>   1944netif_carrier_off(dev);
>   1945schedule_work(>config_work);
>   1946} else {
>   1947vi->status = VIRTIO_NET_S_LINK_UP;
>   1948netif_carrier_on(dev);
>   1949}
>   1950
>   1951pr_debug("virtnet: registered device %s with %d RX and 
> TX vq's\n",
>   1952 dev->name, max_queue_pairs);
>   1953
>   1954return 0;
>   1955
>   1956free_unregister_netdev:
>   1957

Re: [PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Greg KH

On Thu, Jun 02, 2016 at 11:58:07AM -0500, Mario Limonciello wrote:
> Dell systems with Type-C ports have support for a persistent system
> specific MAC address when used with Dell Type-C docks and dongles.
> This means a dock plugged into two different systems will show different
> (but persistent) MAC addresses.  Dell Type-C docks and dongles use the
> r8152 driver.
> 
> This information for the system's persistent MAC address is burned in when
> the HW is built and available under _SB\AMAC in the DSDT at runtime.
> 
> More information about the technology is available here:
> http://www.dell.com/support/article/us/en/04/SLN301147
> 
> Signed-off-by: Mario Limonciello 
> ---
>  drivers/net/usb/r8152.c | 53 
> +
>  1 file changed, 53 insertions(+)
> 
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index 3f9f6ed..6dea542 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -26,6 +26,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  /* Information for net-next */
>  #define NETNEXT_VERSION  "08"
> @@ -500,6 +502,7 @@ enum rtl8152_flags {
>   SELECTIVE_SUSPEND,
>   PHY_RESET,
>   SCHEDULE_NAPI,
> + MAC_PASSTHRU = 0,

Does setting that to 0 really work?  You just did this for two enum
values, what is the compiler supposed to do?
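
To illustrate the concern, a minimal sketch follows. It assumes the first
member of the enum is implicitly 0, which the quoted hunk does not show,
and every demo_* name is hypothetical:

```c
/* Hypothetical flags enum; only the aliasing matters here. */
enum demo_flags {
	DEMO_UNPLUG = 0,
	DEMO_SUSPEND,		/* 1 */
	DEMO_PASSTHRU = 0,	/* legal C, but aliases DEMO_UNPLUG */
};

/* Minimal stand-ins for the kernel's set_bit()/test_bit(). */
static void demo_set_bit(int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

static int demo_test_bit(int nr, const unsigned long *flags)
{
	return (int)((*flags >> nr) & 1UL);
}
```

The compiler accepts the duplicate value without complaint, so setting
one flag silently sets the other as well.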

>  };
>  
>  /* Define these values to match your device */
> @@ -653,6 +656,7 @@ enum tx_csum_stat {
>   */
>  static const int multicast_filter_limit = 32;
>  static unsigned int agg_buf_sz = 16384;
> +static bool mac_passthru_active;

very generic name for a platform-specific feature :(


>  
>  #define RTL_LIMITED_TSO_SIZE (agg_buf_sz - sizeof(struct tx_desc) - \
>VLAN_ETH_HLEN - VLAN_HLEN)
> @@ -1030,6 +1034,49 @@ out1:
>   return ret;
>  }
>  
> +static int get_auxiliary_addr(struct r8152 *tp, struct sockaddr *sa)

What about the platform mac address api that was pointed out?

> +{
> + acpi_status status;
> + struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
> + union acpi_object *obj;
> + int ret = -1;
> + unsigned char buf[6];
> +
> + if (!dmi_name_in_vendors("Dell Inc.") || mac_passthru_active)
> + return -1;

Don't make up random error values, please use "real" ones.

And you want to check this for all Dell devices?  Please be model
specific, I doubt a bunch of Dell servers wants to run this code...

> +
> + /* returns _AUXMAC_#AABBCCDDEEFF# */
> + status = acpi_evaluate_object(NULL, "\\_SB.AMAC", NULL, );
> + obj = (union acpi_object *)buffer.pointer;
> + if (ACPI_SUCCESS(status)) {
> + if (obj->type != ACPI_TYPE_BUFFER ||
> + obj->string.length != 0x17) {
> + pr_warn("r8152: get_auxiliary_addr: Invalid buffer");
> + goto amacout;
> + }
> + if (strncmp(obj->string.pointer, "_AUXMAC_#", 9) != 0) {
> + pr_warn("r8152: get_auxiliary_addr: Invalid header");
> + goto amacout;
> + }
> + ret = hex2bin(buf, obj->string.pointer + 9, 6);
> + if (ret < 0) {
> + pr_warn("r8152: get_auxiliary_addr: Invalid MAC");
> + goto amacout;
> + }
> + memcpy(sa->sa_data, buf, 6);
> + ether_addr_copy(tp->netdev->dev_addr, sa->sa_data);
> + netdev_info(tp->netdev, "Using system MAC address %pM\n",
> + sa->sa_data);
> + set_bit(MAC_PASSTHRU, >flags);
> + mac_passthru_active = true;
> + ret = 1;

1 is not an "all is good" return value.
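
As a sketch of the usual convention (0 on success, a negative errno on
failure), here is a user-space parser for the string format given in the
patch comment. It assumes the input is exactly "_AUXMAC_#" followed by 12
hex digits and a closing '#'; parse_auxmac() and hex_val() are
hypothetical names, not part of the driver.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define AUXMAC_PREFIX "_AUXMAC_#"

/* Convert one hex digit to its value, or return -1 if it is not one. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse "_AUXMAC_#AABBCCDDEEFF#" into mac[6].
 * Returns 0 on success and -EINVAL on malformed input.
 */
static int parse_auxmac(const char *s, uint8_t mac[6])
{
	size_t plen = strlen(AUXMAC_PREFIX);
	int i, hi, lo;

	/* prefix + 12 hex digits + trailing '#' */
	if (strlen(s) != plen + 12 + 1)
		return -EINVAL;
	if (strncmp(s, AUXMAC_PREFIX, plen) != 0 || s[plen + 12] != '#')
		return -EINVAL;

	for (i = 0; i < 6; i++) {
		hi = hex_val(s[plen + 2 * i]);
		lo = hex_val(s[plen + 2 * i + 1]);
		if (hi < 0 || lo < 0)
			return -EINVAL;
		mac[i] = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

A caller can then write "if (parse_auxmac(str, mac) == 0)" and pass any
other value up unchanged as an error code.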

> + }
> +
> +amacout:
> + kfree(obj);
> + return ret;
> +}
> +
>  static int set_ethernet_addr(struct r8152 *tp)
>  {
>   struct net_device *dev = tp->netdev;
> @@ -1041,6 +1088,10 @@ static int set_ethernet_addr(struct r8152 *tp)
>   else
>   ret = pla_ocp_read(tp, PLA_BACKUP, 8, sa.sa_data);
>  
> + /* if system provides auxiliary MAC address */
> + if (get_auxiliary_addr(tp, ))
> + ret = 0;

ret = my_dell_specific_function();

But again, I don't like this, but I'm not the network subsystem
maintainer, I'll defer to them as to if this is something they want in
individual drivers...

thanks,

greg k-h

Re: [PATCH -next 2/2] virtio_net: Read the advised MTU

2016-06-02 Thread kbuild test robot

Hi,

[auto build test ERROR on next-20160602]

url:
https://github.com/0day-ci/linux/commits/Aaron-Conole/virtio-net-Advised-MTU-feature/20160603-000714
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

Note: the 
linux-review/Aaron-Conole/virtio-net-Advised-MTU-feature/20160603-000714 HEAD 
d909da4df3c52f78b4f5fcccd89aea5e38722d10 builds fine.
  It only hurts bisectibility.

All errors (new ones prefixed by >>):

   drivers/net/virtio_net.c: In function 'virtnet_probe':
>> drivers/net/virtio_net.c:1899:31: error: 'VIRTIO_NET_F_MTU' undeclared 
>> (first use in this function)
 if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
  ^~~~
   drivers/net/virtio_net.c:1899:31: note: each undeclared identifier is 
reported only once for each function it appears in
   drivers/net/virtio_net.c: At top level:
>> drivers/net/virtio_net.c:2076:2: error: 'VIRTIO_NET_F_MTU' undeclared here 
>> (not in a function)
 VIRTIO_NET_F_MTU,
 ^~~~

vim +/VIRTIO_NET_F_MTU +1899 drivers/net/virtio_net.c

  1893  virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
  1894  vi->any_header_sg = true;
  1895  
  1896  if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
  1897  vi->has_cvq = true;
  1898  
> 1899  if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
  1900  dev->mtu = virtio_cread16(vdev,
  1901offsetof(struct 
virtio_net_config,
  1902 mtu));
  1903  }
  1904  
  1905  if (vi->any_header_sg)
  1906  dev->needed_headroom = vi->hdr_len;
  1907  
  1908  /* Use single tx/rx queue pair as default */
  1909  vi->curr_queue_pairs = 1;
  1910  vi->max_queue_pairs = max_queue_pairs;
  1911  
  1912  /* Allocate/initialize the rx/tx queues, and invoke find_vqs */
  1913  err = init_vqs(vi);
  1914  if (err)
  1915  goto free_stats;
  1916  
  1917  #ifdef CONFIG_SYSFS
  1918  if (vi->mergeable_rx_bufs)
  1919  dev->sysfs_rx_queue_group = _net_mrg_rx_group;
  1920  #endif
  1921  netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
  1922  netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
  1923  
  1924  virtnet_init_settings(dev);
  1925  
  1926  err = register_netdev(dev);
  1927  if (err) {
  1928  pr_debug("virtio_net: registering device failed\n");
  1929  goto free_vqs;
  1930  }
  1931  
  1932  virtio_device_ready(vdev);
  1933  
  1934  vi->nb.notifier_call = _cpu_callback;
  1935  err = register_hotcpu_notifier(>nb);
  1936  if (err) {
  1937  pr_debug("virtio_net: registering cpu notifier 
failed\n");
  1938  goto free_unregister_netdev;
  1939  }
  1940  
  1941  /* Assume link up if device can't report link status,
  1942 otherwise get link status from config. */
  1943  if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS)) {
  1944  netif_carrier_off(dev);
  1945  schedule_work(>config_work);
  1946  } else {
  1947  vi->status = VIRTIO_NET_S_LINK_UP;
  1948  netif_carrier_on(dev);
  1949  }
  1950  
  1951  pr_debug("virtnet: registered device %s with %d RX and TX 
vq's\n",
  1952   dev->name, max_queue_pairs);
  1953  
  1954  return 0;
  1955  
  1956  free_unregister_netdev:
  1957  vi->vdev->config->reset(vdev);
  1958  
  1959  unregister_netdev(dev);
  1960  free_vqs:
  1961  cancel_delayed_work_sync(>refill);
  1962  free_receive_page_frags(vi);
  1963  virtnet_del_vqs(vi);
  1964  free_stats:
  1965  free_percpu(vi->stats);
  1966  free:
  1967  free_netdev(dev);
  1968  return err;
  1969  }
  1970  
  1971  static void remove_vq_common(struct virtnet_info *vi)
  1972  {
  1973  vi->vdev->config->reset(vi->vdev);
  1974  
  1975  /* Free unused buffers in both send and recv, if any. */
  1976  free_unused_bufs(vi);
  1977  
  1978  free_receive_bufs(vi);
  1979  
  1980  free_receive_page_frags(vi);
  1981  
  1982  virtnet_del_vqs(vi);
  1983  }
  1984  
  1985  static void virtnet_remove(struct virtio_device *vdev)
  1986  {
  1987  struct virtnet_info *vi = vdev->priv;
  1988  
  1989  unregister_hotcpu_notifier(>nb);
  1990  
  1991

[PATCH net-next] hv_netvsc: Fix VF register on vlan devices

2016-06-02 Thread Haiyang Zhang

Added a condition to avoid vlan devices with same MAC registering
as VF.

Signed-off-by: Haiyang Zhang 
Reviewed-by: K. Y. Srinivasan 
---
 drivers/net/hyperv/netvsc_drv.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 6a69b5c..5ac1267 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1500,6 +1500,10 @@ static int netvsc_netdev_event(struct notifier_block 
*this,
 {
struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
 
+   /* Avoid Vlan dev with same MAC registering as VF */
+   if (event_dev->priv_flags & IFF_802_1Q_VLAN)
+   return NOTIFY_DONE;
+
switch (event) {
case NETDEV_REGISTER:
return netvsc_register_vf(event_dev);
-- 
1.7.4.1

[PATCH iputils v3] ping6: allow disabling of openssl/libgcrypt support

2016-06-02 Thread Mike Frysinger

Signed-off-by: Mike Frysinger 
---
 Makefile |  5 -
 iputils_md5dig.h |  2 +-
 ping6.c  | 28 +++-
 3 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index b6cf512f22a5..8b9e2aa232e6 100644
--- a/Makefile
+++ b/Makefile
@@ -36,7 +36,7 @@ ARPING_DEFAULT_DEVICE=
 
 # Libgcrypt (for MD5) for ping6 [yes|no|static]
 USE_GCRYPT=yes
-# Crypto library for ping6 [shared|static]
+# Crypto library for ping6 [shared|static|no]
 USE_CRYPTO=shared
 # Resolv library for ping6 [yes|static]
 USE_RESOLV=yes
@@ -66,7 +66,10 @@ ifneq ($(USE_GCRYPT),no)
LIB_CRYPTO = $(call FUNC_LIB,$(USE_GCRYPT),$(LDFLAG_GCRYPT))
DEF_CRYPTO = -DUSE_GCRYPT
 else
+ifneq ($(USE_CRYPTO),no)
LIB_CRYPTO = $(call FUNC_LIB,$(USE_CRYPTO),$(LDFLAG_CRYPTO))
+   DEF_CRYPTO = -DUSE_OPENSSL
+endif
 endif
 
 # USE_RESOLV: LIB_RESOLV
diff --git a/iputils_md5dig.h b/iputils_md5dig.h
index 4cec86699465..9f09ba0a8c60 100644
--- a/iputils_md5dig.h
+++ b/iputils_md5dig.h
@@ -5,7 +5,7 @@
 # include 
 # include 
 # define IPUTILS_MD5DIG_LEN16
-#else
+#elif defined(USE_OPENSSL)
 # include 
 #endif
 
diff --git a/ping6.c b/ping6.c
index 6d1a6db37146..95568ec4fbaf 100644
--- a/ping6.c
+++ b/ping6.c
@@ -85,6 +85,12 @@ char copyright[] =
 #include "ping6_niquery.h"
 #include "in6_flowlabel.h"
 
+#if defined(USE_GCRYPT) || defined(USE_OPENSSL)
+# define ENABLE_NIQUERY 1
+#else
+# define ENABLE_NIQUERY 0
+#endif
+
 #ifndef SOL_IPV6
 #define SOL_IPV6 IPPROTO_IPV6
 #endif
@@ -238,6 +244,8 @@ unsigned int if_name2index(const char *ifname)
return i;
 }
 
+#if ENABLE_NIQUERY
+
 struct niquery_option {
char *name;
int namelen;
@@ -669,6 +677,12 @@ int niquery_option_handler(const char *opt_arg)
return ret;
 }
 
+#else
+
+# define niquery_is_enabled() 0
+
+#endif /* ENABLE_NIQUERY */
+
 static int hextoui(const char *str)
 {
unsigned long val;
@@ -790,6 +804,7 @@ int main(int argc, char *argv[])
printf("ping6 utility, iputils-%s\n", SNAPSHOT);
exit(0);
case 'N':
+#if ENABLE_NIQUERY
if (using_ping_socket) {
fprintf(stderr, "ping: -N requires raw socket 
permissions\n");
exit(2);
@@ -798,6 +813,10 @@ int main(int argc, char *argv[])
usage();
break;
}
+#else
+   fprintf(stderr, "ping: function not available; crypto 
disabled\n");
+   exit(2);
+#endif
break;
COMMON_OPTIONS
common_options(ch);
@@ -891,6 +910,7 @@ int main(int argc, char *argv[])
}
 #endif
 
+#if ENABLE_NIQUERY
if (niquery_is_enabled()) {
niquery_init_nonce();
 
@@ -900,6 +920,7 @@ int main(int argc, char *argv[])
ni_subject_type = NI_SUBJ_IPV6;
}
}
+#endif
 
if (argc > 1) {
 #ifndef ENABLE_PING6_RTHDR
@@ -1369,7 +1390,7 @@ int build_echo(__u8 *_icmph)
return cc;
 }
 
-
+#if ENABLE_NIQUERY
 int build_niquery(__u8 *_nih)
 {
struct ni_hdr *nih;
@@ -1391,6 +1412,7 @@ int build_niquery(__u8 *_nih)
 
return cc;
 }
+#endif
 
 int send_probe(void)
 {
@@ -1398,9 +1420,11 @@ int send_probe(void)
 
rcvd_clear(ntransmitted + 1);
 
+#if ENABLE_NIQUERY
if (niquery_is_enabled())
len = build_niquery(outpack);
else
+#endif
len = build_echo(outpack);
 
if (cmsglen == 0) {
@@ -1619,6 +1643,7 @@ parse_reply(struct msghdr *msg, int cc, void *addr, struct timeval *tv)
  hops, 0, tv, pr_addr(>sin6_addr),
  pr_echo_reply))
return 0;
+#if ENABLE_NIQUERY
} else if (icmph->icmp6_type == ICMPV6_NI_REPLY) {
struct ni_hdr *nih = (struct ni_hdr *)icmph;
int seq = niquery_check_nonce(nih->ni_nonce);
@@ -1629,6 +1654,7 @@ parse_reply(struct msghdr *msg, int cc, void *addr, struct timeval *tv)
  hops, 0, tv, pr_addr(>sin6_addr),
  pr_niquery_reply))
return 0;
+#endif
} else {
int nexthdr;
struct ip6_hdr *iph1 = (struct ip6_hdr*)(icmph+1);
-- 
2.8.2

Re: [PATCH -next 2/2] virtio_net: Read the advised MTU

2016-06-02 Thread Aaron Conole

"Michael S. Tsirkin"  writes:

> On Thu, Jun 02, 2016 at 11:43:31AM -0400, Aaron Conole wrote:
>> This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it
>> exists, read the advised MTU and use it.
>> 
>> No proper error handling is provided for the case where a user changes the
>> negotiated MTU. A future commit will add proper error handling. Instead, a
>> warning is emitted if the guest changes the device MTU after previously
>> being given advice.
>
> I don't see a warning and I don't think it's needed.
> Patch is ok commit log isn't.

Okay, I'll fix it when I submit v2.

>> Signed-off-by: Aaron Conole 
>> ---
>>  drivers/net/virtio_net.c | 7 +++
>>  1 file changed, 7 insertions(+)
>> 
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index e0638e5..ef5ee01 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1896,6 +1896,12 @@ static int virtnet_probe(struct virtio_device *vdev)
>>  if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
>>  vi->has_cvq = true;
>>  
>> +if (virtio_has_feature(vdev, VIRTIO_NET_F_MTU)) {
>> +dev->mtu = virtio_cread16(vdev,
>> +  offsetof(struct virtio_net_config,
>> +   mtu));
>> +}
>> +
>>  if (vi->any_header_sg)
>>  dev->needed_headroom = vi->hdr_len;
>>  
>> @@ -2067,6 +2073,7 @@ static unsigned int features[] = {
>>  VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ,
>>  VIRTIO_NET_F_CTRL_MAC_ADDR,
>>  VIRTIO_F_ANY_LAYOUT,
>> +VIRTIO_NET_F_MTU,
>>  };
>>  
>>  static struct virtio_driver virtio_net_driver = {
>> -- 
>> 2.5.5
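
As a rough userspace sketch of what the hunk above does (not the driver
code itself): check the negotiated feature bit, and only then read the
16-bit mtu field at its offset in the config space.  The struct layout
follows patch 1/2 of this series; has_feature(), cread16() and pick_mtu()
are hypothetical stand-ins for the kernel helpers, the bit number 3 is an
assumption since the quoted hunks do not show it, and endianness handling
is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed bit number; the quoted hunks do not show the value. */
#define VIRTIO_NET_F_MTU 3

/* Config layout as proposed in patch 1/2: mtu follows max_virtqueue_pairs. */
struct virtio_net_config {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t max_virtqueue_pairs;
	uint16_t mtu;
} __attribute__((packed));

/* Hypothetical stand-in for virtio_has_feature(). */
static int has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (int)((features >> bit) & 1);
}

/* Hypothetical stand-in for virtio_cread16(): host-endian 16-bit read. */
static uint16_t cread16(const void *cfg, size_t off)
{
	uint16_t v;

	memcpy(&v, (const uint8_t *)cfg + off, sizeof(v));
	return v;
}

/* Use the advised MTU only when the feature was negotiated. */
static unsigned int pick_mtu(uint64_t features,
			     const struct virtio_net_config *cfg,
			     unsigned int default_mtu)
{
	if (has_feature(features, VIRTIO_NET_F_MTU))
		return cread16(cfg, offsetof(struct virtio_net_config, mtu));
	return default_mtu;
}
```

Without the feature bit the device-provided field is never read, so a
device that does not offer the feature keeps the default MTU.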

Re: [PATCH -next 1/2] virtio: Start feature MTU support

2016-06-02 Thread Aaron Conole

"Michael S. Tsirkin"  writes:

> On Thu, Jun 02, 2016 at 11:43:30AM -0400, Aaron Conole wrote:
>> This commit adds the feature bit and associated mtu device entry for the
>> virtio network device. Future commits will make use of these bits to
>> support negotiated MTU.
>
> why split it out? Pls squash with the next patch.

Okay.  I thought the usual method was to make a commit which introduces the
data structure changes, and then make a commit which hooks it up.  I'll
squash this in the future.

>> Signed-off-by: Aaron Conole 
>> ---
>>  include/uapi/linux/virtio_net.h | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
>> index ec32293..751ff59 100644
>> --- a/include/uapi/linux/virtio_net.h
>> +++ b/include/uapi/linux/virtio_net.h
>> @@ -73,6 +73,8 @@ struct virtio_net_config {
>>   * Legal values are between 1 and 0x8000
>>   */
>>  __u16 max_virtqueue_pairs;
>> +/* Default maximum transmit unit advice */
>> +__u16 mtu;
>>  } __attribute__((packed));
>>  
>>  /*
>> -- 
>> 2.5.5

Re: [RFC PATCH 0/4] Make inotify instance/watches be accounted per userns

2016-06-02 Thread Eric W. Biederman

Nikolay please see my question for you at the end.

Jan Kara  writes:

> On Wed 01-06-16 11:00:06, Eric W. Biederman wrote:
>> Cc'd the containers list.
>> 
>> Nikolay Borisov  writes:
>> 
>> > Currently the inotify instances/watches are being accounted in the 
>> > user_struct structure. This means that in setups where multiple 
>> > users in unprivileged containers map to the same underlying 
>> > real user (e.g. user_struct) the inotify limits are going to be 
>> > shared as well which can lead to unpleasantries. This is a problem
>> > since any user inside any of the containers can potentially exhaust 
>> > the instance/watches limit which in turn might prevent certain 
>> > services from other containers from starting.
>> 
>> On a high level this is a bit problematic as it appears to escape the
>> current limits and allows anyone creating a user namespace to have their
>> own fresh set of limits.  Given that anyone should be able to create a
>> user namespace whenever they feel like, escaping limits is a problem.
>> That however is solvable.
>> 
>> A practical question.  What kind of limits are we looking at here?
>> 
>> Are these loose limits for detecting buggy programs that have gone
>> off their rails?
>> 
>> Are these tight limits to ensure multitasking is possible?
>
> The original motivation for these limits is to limit resource usage.  There
> is an in-kernel data structure that is associated with each notification mark
> you create and we don't want users to be able to DoS the system by creating
> too many of them. Thus we limit the number of notification marks for each user.
> There is also a limit on the number of notification instances - those are
> naturally limited by the number of open file descriptors but admin may want
> to limit them more...
>
> So cgroups would be probably the best fit for this but I'm not sure whether
> it is not an overkill...

There is some level of kernel memory accounting in the memory cgroup.

That said my experience with cgroups is that while they are good for
some things the semantics that derive from the userspace API are
problematic.

In the cgroup model objects in the kernel don't belong to a cgroup they
belong to a task/process.  Those processes belong to a cgroup.
Processes under control of a sufficiently privileged parent are allowed
to switch cgroups.  This causes implementation challenges and a semantic
mismatch in a world where things are typically considered to have an
owner.

Right now fs_notify groups (upon which all of the rest of the inotify
accounting is built) belong to a user.  So there is a semantic
mismatch with cgroups right out of the gate.

Given that cgroups have not chosen to account for individual kernel
objects or give that level of control, I think it is reasonable to look to
other possible solutions, assuming the overhead can be kept under
control.

The implementation of a hierarchical counter in mm/page_counter.c
strongly suggests to me that the overhead can be kept under control.
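
To make the idea concrete, here is a minimal single-threaded sketch of a
hierarchical counter in the spirit of mm/page_counter.c.  It is not the
kernel code: the hcounter names are made up for illustration, and the real
implementation uses atomics.  The point is that a charge must fit at every
level up to the root, so creating a child level would not hand out a fresh
set of limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One counter per level of the hierarchy; parent is NULL at the root. */
struct hcounter {
	unsigned long count;
	unsigned long limit;
	struct hcounter *parent;
};

/* Release n units from c and from every ancestor. */
static void hcounter_uncharge(struct hcounter *c, unsigned long n)
{
	for (; c; c = c->parent) {
		assert(c->count >= n);
		c->count -= n;
	}
}

/*
 * Charge n units to c and to every ancestor, or to none of them.
 * Returns false if any level would exceed its limit.
 */
static bool hcounter_try_charge(struct hcounter *c, unsigned long n)
{
	struct hcounter *pos, *undo;

	for (pos = c; pos; pos = pos->parent) {
		/* Written this way to avoid overflow in count + n. */
		if (pos->count > pos->limit || n > pos->limit - pos->count)
			goto fail;
		pos->count += n;
	}
	return true;

fail:
	/* Roll back the levels charged before the failing one. */
	for (undo = c; undo != pos; undo = undo->parent)
		undo->count -= n;
	return false;
}
```

With a root limit of 4 and two children limited to 3 each, charging 3 to
the first child leaves room for only 1 more unit anywhere in the tree.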

And yes.  I am thinking of the problem space where you have a limit
based on the problem domain where if an application consumes more than
the limit, the application is likely bonkers.  Which does prevent a DOS
situation in kernel memory.  But is different from the problem I have
seen cgroups solve.

The problem I have seen cgroups solve looks like.  Hmm.  I have 8GB of
ram.  I have 3 containers.  Container A can have 4GB, Container B can
have 1GB and container C can have 3GB.  Then I know one container won't
push the other containers into swap.

Perhaps that would tend to be a top-down vs. a bottom-up approach to
coming up with limits.  DoS prevention limits like the inotify ones
are generally written from the perspective of "if you have more than X
you are crazy", while cgroup limits tend to be thought about top-down
from a total system management point of view.

So I think there is definitely something to look at.

All of that said there is definitely a practical question that needs to
be asked.  Nikolay how did you get into this situation?  A typical user
namespace configuration will set up uid and gid maps with the help of a
privileged program and not map the uid of the user who created the user
namespace.  Thus avoiding exhausting the limits of the user who created
the container.

Which makes me personally more worried about escaping the existing
limits than exhausting the limits of a particular user.

Eric

Re: [PATCH -next 2/2] virtio_net: Read the advised MTU

2016-06-02 Thread Aaron Conole

Hi Rick,

In the future, please don't cut the list.

Rick Jones  writes:

> On 06/02/2016 08:43 AM, Aaron Conole wrote:
>> This patch checks the feature bit for the VIRTIO_NET_F_MTU feature. If it
>> exists, read the advised MTU and use it.
>>
>> No proper error handling is provided for the case where a user changes the
>> negotiated MTU. A future commit will add proper error handling. Instead, a
>> warning is emitted if the guest changes the device MTU after previously
>> being given advice.
>
> One of the things I've been doing has been setting-up a cluster
> (OpenStack) with JumboFrames, and then setting MTUs on instance vNICs
> by hand to measure different MTU sizes.  It would be a shame if such a
> thing were not possible in the future.  Keeping a warning if shrinking
> the MTU would be good, leave the error (perhaps) to if an attempt is
> made to go beyond the advised value.

This was cut because it didn't make sense for such a warning to
be issued, but it seems like perhaps you may want such a feature?  I
agree with Michael, after thinking about it, that I don't know what sort
of use the warning would serve.  After all, if you're changing the MTU,
you must have wanted such a change to occur?

-Aaron

> happy benchmarking,
>
> rick jones

Re: [PATCH net-next] tcp: accept RST if SEQ matches right edge of SACK block

2016-06-02 Thread Pau Espin

On Thu, Jun 2, 2016 at 3:14 PM, Randall Stewart  wrote:
>
> Pau:
>
> Hopefully me setting the “plain text” in my Mac-Mail preferences will make
> this plain text :-)
>
>
> >>
> >>
> >> Well yes the probability is increased but definitely not assured :-)
> >> Your scenario is specific to a very high loss path. Which is why
> >> the challenge ack is lost...
> >>
> >
> > Correct. But still an improvement in this particular situation with
> > the only drawback of checking against 5 (4+1) SEQ numbers instead of
> > 1.
> >
> >
> >>
> >>
> >> Possibly another alternative is to change the client so that when
> >> sending a RST with a TCB, instead of using snd_nxt you could use
> >> snd_una. Of course that could also result in a challenge ACK if the
> >> receiver has not yet received an ACK that is in flight (that was the
> >> whole purpose of the challenge ack). I think overall you will always
> >> have this problem i.e. the sender of the RST may not know precisely
> >> the state of the receiver.
> >>
> >>
> >> Indeed, I guess there will still be same problem in other scenarios.
> >> On top of that, it seems a bit weird to me to send a RST packet using
> >> a SEQ number which was already used to send a different packet (that's
> >> what would happen in this case right?).
> >>
> >>
> >> Why the trick here is you want to RST the connection. You need to use
> >> a seq number that is valid.. the seq numbers once a RST is being sent
> >> mean nothing the app will get the same thing.
> >>
> >> In fact snd_nxt in the scenario above is *also* a re-used sequence
> >> number. You retransmitted from snd_una for one segment, snd_nxt got left
> >> 1 segment up, so when you sent the RST it would be snd_nxt which was
> >> previously a data segment being marked with RST.
> >>
> >> If you wanted to assure that no other segment had been sent with that
> >> sequence you would have to put snd_max as the value, but of course
> >> that *would* return a challenge ack for sure.
> >>
> >> The trick here is you are trying to “guess” where the peer is. The only
> >> thing you know for sure is snd_una. Anything else in flight won’t  reset
> >> the peer. In fact in your scenario if you had sent snd_una instead of
> >> snd_nxt it would have worked. If you changed the sender to use
> >> snd_una then it would be interesting to see if that also gave you
> >> similar results...
> >>
> >>
> >>
> >>
> >> Your fix happens to work since the receiver happens to have the SACK
> >> blocks in question.. this is fine and if you don’t mind *weakening the
> >> security* of the RST you could do that. I think for the stack I am
> >> working on for FreeBSD I will change it to recognize the RST going out
> >> and use snd_una.
> >>
> >>
> >> You mean here you are always going to use snd_una or that you are
> >> going to try to figure out some heuristics to use either snd_next or
> >> snd_una depending on the scenario?
> >>
> >>
> >> For this scenario I will always use snd_una; I am not sure you can
> >> reliably develop any heuristics to tell you to use snd_una/snd_nxt or
> >> some other block that has been sent :-)
> >>
> >> snd_una is actually the most accurate as to what you know at the time,
> >> not snd_nxt. I am sure using snd_nxt is a holdover from before the RFC
> >> got implemented.
> >>
> >
> > So, as far as I understand now, it could be useful to improve the
> > situation in the sender too by checking if we recently received SACK
> > blocks from the receiver and in that case sending the RST using
> > snd_una instead of snd_nxt because in that case we will almost surely
> > receive a challenge_ack if we use snd_nxt as SEQ for the RST.
> >
>
> Checking the scoreboard and using snd_una instead of snd_nxt
> might help things.
>
> > On the other hand, using  always snd_una as SEQ for the RST would
> > cause other (even more usual) cases to be discarded or answered with a
> > challenge ACK which are accepted right now. I'm thinking for instance
> > any case in which you send packets (so in flight packets
> > sender->receiver) just before the RST is sent (with the snd_una).
> > Packets are received by the receiver and RCV_NXT is updated and then
> > you receive the RST which is < than RCV_NXT just updated. Am I missing
> > something? Please correct me if I'm wrong.
> >
>
> I think if packets are in flight either way you are taking a gamble on what
> you are sending, snd_nxt/snd_una and snd_max.
>
> If you are idle and send a RST then those should all be the same and
> you will win :)
>
> snd_nxt will be questionable if you are in the middle of retransmitting
> (your case) since you really don’t know where it is.
>
> I do like your idea of using the scoreboard to tell if you need to use
> snd_una or snd_nxt. In theory if the scoreboard is empty using
> snd_nxt should be the equivalent to using snd_max.. but if they
> are not equal then you are doing a retransmit and it becomes a crap
>
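
The check under discussion can be sketched as follows.  This is a
simplified illustration, not the code from the patch:
rst_seq_exact_match() and struct sack_block are hypothetical names, and
the window check and challenge-ACK path are left out.  It compares the
RST's SEQ against rcv_nxt plus the right edge of up to four SACK blocks,
i.e. the 5 (4+1) values mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SACK_BLOCKS 4

/* A received out-of-order range [start, end) advertised via SACK. */
struct sack_block {
	uint32_t start;
	uint32_t end;
};

/*
 * Return true if a RST carrying sequence number seq would be accepted
 * outright: it matches rcv_nxt or the right edge of one of the SACK
 * blocks.  Anything else would go to the challenge-ACK path (not shown).
 */
static bool rst_seq_exact_match(uint32_t seq, uint32_t rcv_nxt,
				const struct sack_block *blocks, int nblocks)
{
	int i;

	assert(nblocks >= 0 && nblocks <= MAX_SACK_BLOCKS);
	if (seq == rcv_nxt)
		return true;
	for (i = 0; i < nblocks; i++)
		if (seq == blocks[i].end)
			return true;
	return false;
}
```

Only exact matches are accepted, so the extra cost per RST stays bounded
by the number of SACK blocks the receiver tracks.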

[ANNOUNCE] nftables 0.6 release

2016-06-02 Thread Pablo Neira Ayuso

Hi!

The Netfilter project proudly presents:

nftables 0.6

This release contains many accumulated bug fixes and new features
available up to the Linux 4.7-rc1 kernel release.

New features


* Rule replacement: You can replace any rule by its unique 64-bit
  handle. You have to retrieve the handle from the ruleset listing.

  # nft list ruleset -a
  table ip filter {
chain input {
...
ct state new tcp dport ssh accept counter packets 0 bytes 0 # handle 4
}
  }

  Then, pass this handle along with the new rule that should replace
  the existing one, eg.

  # nft replace rule filter input handle 4 ct state new \
tcp dport { 22, 80} counter accept

* Flow table support: This provides a native replacement for the
  hashlimit match in iptables. The rule below creates a 'ssh' flow table
  that declares a ratelimit of 10 packets per second for each source IP
  address:

  # nft add rule filter input tcp dport 22 ct state new \
flow table ssh { ip saddr limit rate 10/second } accept

  This is actually way more than hashlimit since you can use any selector
  and build your own tuple of selectors through concatenations, eg.

  # nft add rule filter input \
flow table acct { iif . ip saddr timeout 60s counter }

  Then, if you want to list the content of the 'acct' flow table:

  # nft list flow table acct
  table ip filter {
flow table acct {
type iface_index . ipv4_addr
flags timeout
elements = { eth0 . 218.68.110.274 expires 3m56s : counter packets 1 bytes 98, eth0 . 180.29.103.19 expires 3m57s : counter packets 2 bytes 80, eth0 . 8.8.8.8 expires 3m44s : counter packets 1 bytes 84}
}
  }

  Note that this listing format is still unstable though, so don't make
  tools to parse this output yet. Commands to empty flow tables and remove
  specific entries are still missing.

  Moreover, flow tables require a Linux kernel >= 4.3.

* New tracing infrastructure: Useful for ruleset debugging, you have
  to enable tracing via:

# nft filter input tcp dport 1 nftrace set 1
# nft filter input icmp type echo-request nftrace set 1

  Then, you can monitor traces through:

# nft -nn monitor trace

  That generates the following outputs:

trace id e1f5055f ip filter input packet: iif eth0 ether saddr
63:f6:4b:00:54:52 ether daddr c9:4b:a9:00:54:52 ip saddr 192.168.122.1
ip daddr 192.168.122.83 ip tos 0 ip ttl 64 ip id 32315 ip length 84
icmp type echo-request icmp code 0 icmp id 10087 icmp sequence 1
trace id e1f5055f ip filter input rule icmp type echo-request
nftrace set 1 (verdict continue)
trace id e1f5055f ip filter input verdict continue
trace id e1f5055f ip filter input
trace id 74e47ad2 ip filter input packet: iif vlan0 ether saddr
63:f6:4b:00:54:52 ether daddr c9:4b:a9:00:54:52 vlan pcp 0 vlan cfi 1
vlan id 1000 ip saddr 10.0.0.1 ip daddr 10.0.0.2 ip tos 0 ip ttl 64 ip
id 49030 ip length 84 icmp type echo-request icmp code 0 icmp id 10095
icmp sequence 1
trace id 74e47ad2 ip filter input rule icmp type echo-request
nftrace set 1 (verdict continue)
trace id 74e47ad2 ip filter input verdict continue
trace id 74e47ad2 ip filter input
trace id 3030de23 ip filter input packet: iif vlan0 ether saddr
63:f6:4b:00:54:52 ether daddr c9:4b:a9:00:54:52 vlan pcp 0 vlan cfi 1
vlan id 1000 ip saddr 10.0.0.1 ip daddr 10.0.0.2 ip tos 16 ip ttl 64
ip id 59062 ip length 60 tcp sport 55438 tcp dport 1 tcp flags ==
syn tcp window 29200
trace id 3030de23 ip filter input rule tcp dport 1 nftrace set
1 (verdict continue)
trace id 3030de23 ip filter input verdict continue
trace id 3030de23 ip filter input

  The trace id is unique for each packet; above you can see the path
  each of these packets takes through the nft packet classifier.

* Ratelimiting enhancements: You can now specify ratelimits in terms
  of bytes/second, eg.

  # nft add rule filter forward \
limit rate 1024 mbytes/second counter accept

  The rule above matches packets under the specified ratelimit. This
  requires a Linux kernel >= 4.3 btw.

  You can also indicate the amount of traffic that can go over the
  threshold via 'burst', eg.

  # nft add rule filter forward \
limit rate 1024 mbytes/second burst 10240 bytes counter accept

  You may also need to match based on inverted logic, eg.

  # nft add rule filter forward \
limit rate over 1024 mbytes/second log prefix "OVERLIMIT: " drop

* VLAN matching: You can match any vlan header field and combine this
  with any of the existing upper layer header selectors, eg.

  # nft add rule bridge filter prerouting vlan id 24 \
  ip saddr 192.168.1.0/24 counter accept

* Packet duplication: When used from any of the supported layer 3 families,
  this allows you to clone packets to a given destination address, eg.
  duplicate all packets whose mark is 0x:

  # nft add rule filter

[PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario Limonciello

This addresses many of the concerns that have been raised on LKML.

Changes from v1:
 * Lots of error checking around bad ACPI data
 * Only activate on Dell system vendor DMI string
 * Use hex2bin instead of inventing a wheel
 * Copy MAC to both dev_addr and sa_data
 * Track and set only for one NIC at a time
 * If MAC lookup failed (bad ACPI data or bad return) fall back
   to regular r8152 MAC setting routine.

This has been tested with TB15, WD15 docks and Dell P/N 96NP5 dongle.
It's also been tested with two devices that use r8152 simultaneously.

Remaining discussion points:
 * Greg KH had asked for this to be enabled only on machines that are
   known to have \\_SB.AMAC
 - I would rather avoid doing this because the list will just grow every
   year.  I've added lots of error checking and restricted this to Dell.
 * There was also a request to move this to an x86
   arch_get_platform_mac_address() implementation.
 - I haven't done this yet.  If this is the right approach, I would like
   to know the proper place in arch/x86 to put this code.
   My initial thought was a new file in arch/x86/platform/intel

Mario Limonciello (1):
  r8152: Add support for setting MAC to system's Auxiliary MAC address

 drivers/net/usb/r8152.c | 53 +
 1 file changed, 53 insertions(+)

-- 
2.7.4

RE: [PATCH] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario_Limonciello

Some of my comments are getting stale with what I've done in response to
all these emails.  Let me send a v2 that we can better iterate on; a few
comments below though.

> -Original Message-
> From: Greg KH [mailto:gre...@linuxfoundation.org]
> Sent: Thursday, June 2, 2016 11:09 AM
> To: Limonciello, Mario 
> Cc: hayesw...@realtek.com; linux-ker...@vger.kernel.org;
> netdev@vger.kernel.org; linux-...@vger.kernel.org; pali.ro...@gmail.com;
> anthony.w...@canonical.com
> Subject: Re: [PATCH] r8152: Add support for setting MAC to system's
> Auxiliary MAC address
> 
> On Thu, Jun 02, 2016 at 03:46:41PM +, mario_limoncie...@dell.com
> wrote:
> > > >
> > > > This isn't something part of ACPI - it's been added specifically for a
> > > > selection of Dell machines.
> > >
> > > Ah, but isn't ACPI supposed to be a "standard"?  :)
> > >
> >
> > Heh.
> > It's also possible to get this from an SMM routine.  Lesser of two evils
> > to fetch the information this way, right? :)
> 
> Yes, but again, please only do this for machines you _know_ this value
> will be present on.  Otherwise you will end up with problems.

I'm going to send a V2; I'd like to know where and how this could still
break.  I am having a hard time grasping this.

> 
> > > And please wrap your email lines, there is a "standard" for that...
> >
> > I'm unfortunately now limited to an evil mail client at my workplace
> > since our mail server migration.  My apologies, I've got it set to wrap
> > at 76 characters and I'm trying to make it as LKML friendly as possible.
> 
> It's not working as you can see here :(

Ugh, sorry.  Stupid Outlook.  It seems to only be doing it on replies.
I'll manually just chop the lines when they're around that size until I've
got a better solution.

> 
> > > > I would rather not hardcode to the specific DMI model strings of those
> > > > Dell machines as it's certainly going to be a feature that expands to
> > > > more machines.  Since it is Dell specific though, if you would rather
> > > > me just match to the sys vendor Dell Inc., that seems like a pretty
> > > > good compromise to me.
> > >
> > > You need to only do this on machines you "know" have this set to a
> > > correct value, otherwise if some other random BIOS happens to set that
> > > field to some random value, you will have problems.
> >
> > Pali had recommended in another message to check the buffer header.  I
> > was intending to do this along with checking the ACPI buffer output type
> > and output size in the next revision I submitted.  By switching to
> > hex2bin, I'll also validate that the string has correct values (0-F or
> > 0-f).  If somehow all of that fails, set_ethernet_addr checks if the
> > address is valid.  If it's invalid it will generate a random one.
> 
> Why generate a random one and not just use the one that the network
> controller already provides?

That's how the flow works in r8152 already and I'm not overriding it.
Again, I'll send V2 and you'll see what I did.

> >
> > It's really not that hard: track in a module-wide static variable
> > whether the feature is in use.  Track in each device whether the feature
> > was in use.  If it is in use, don't assign the next device plugged in via
> > the ACPI string.  If a device is removed that has the feature activated,
> > change the module-wide static variable.
> 
> Ok, let's see the code before I say anymore about this.
> 
> > > What's wrong with a "simple" script to set the mac address from
> > > userspace if the user wants something like this?  Provide it as a
> > > system package and then no kernel changes are needed at all.  Much
> > > easier to support on your end (you don't have to maintain this odd
> > > kernel code for 10+ years), the default behavior is as Linux users
> > > expect, and your limited number of people who want this crazy
> > > behaviour can install your script if they want it.
> > >
> >
> > This was my original approach.  It involved a network manager script,
> > network manager code changes to support this, and exposing this
> > somewhere in a platform module (like dell-laptop).  I was told I'm
> > better off doing it directly in the network module, so here I am.
> 
> Why not a small systemd unit file for this that sets things up when the
> device is found in the system?  Why mess with network manager and a
> platform kernel driver at all?  That seems very complex for such a
> simple operation where the kernel doesn't need to be involved at all,
> especially for such a "niche" product.
> 
> See this link:
>   https://wiki.archlinux.org/index.php/MAC_address_spoofing#Automatically
> 

The ACPI subsystem doesn't create a sysfs node for a random buffer under
_SB.  I don't think the ACPI guys would be crazy about this either.

So you need a platform kernel driver to pull this out of ACPI (or SMM) and
expose it to userspace somewhere in the first place.  I was putting it
into a random sysfs attribute when I did my first attempts

[PATCH v2] r8152: Add support for setting MAC to system's Auxiliary MAC address

2016-06-02 Thread Mario Limonciello

Dell systems with Type-C ports have support for a persistent system
specific MAC address when used with Dell Type-C docks and dongles.
This means a dock plugged into two different systems will show different
(but persistent) MAC addresses.  Dell Type-C docks and dongles use the
r8152 driver.

The system's persistent MAC address is burned in when the HW is built
and is available under \_SB.AMAC in the DSDT at runtime.

More information about the technology is available here:
http://www.dell.com/support/article/us/en/04/SLN301147

Signed-off-by: Mario Limonciello 
---
 drivers/net/usb/r8152.c | 53 +
 1 file changed, 53 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 3f9f6ed..6dea542 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -26,6 +26,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 /* Information for net-next */
 #define NETNEXT_VERSION"08"
@@ -500,6 +502,7 @@ enum rtl8152_flags {
SELECTIVE_SUSPEND,
PHY_RESET,
SCHEDULE_NAPI,
+   MAC_PASSTHRU,
 };
 
 /* Define these values to match your device */
@@ -653,6 +656,7 @@ enum tx_csum_stat {
  */
 static const int multicast_filter_limit = 32;
 static unsigned int agg_buf_sz = 16384;
+static bool mac_passthru_active;
 
 #define RTL_LIMITED_TSO_SIZE   (agg_buf_sz - sizeof(struct tx_desc) - \
 VLAN_ETH_HLEN - VLAN_HLEN)
@@ -1030,6 +1034,49 @@ out1:
return ret;
 }
 
+static int get_auxiliary_addr(struct r8152 *tp, struct sockaddr *sa)
+{
+   acpi_status status;
+   struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
+   union acpi_object *obj;
+   int ret = -1;
+   unsigned char buf[6];
+
+   if (!dmi_name_in_vendors("Dell Inc.") || mac_passthru_active)
+   return -1;
+
+   /* returns _AUXMAC_#AABBCCDDEEFF# */
+   status = acpi_evaluate_object(NULL, "\\_SB.AMAC", NULL, );
+   obj = (union acpi_object *)buffer.pointer;
+   if (ACPI_SUCCESS(status)) {
+   if (obj->type != ACPI_TYPE_BUFFER ||
+   obj->string.length != 0x17) {
+   pr_warn("r8152: get_auxiliary_addr: Invalid buffer");
+   goto amacout;
+   }
+   if (strncmp(obj->string.pointer, "_AUXMAC_#", 9) != 0) {
+   pr_warn("r8152: get_auxiliary_addr: Invalid header");
+   goto amacout;
+   }
+   ret = hex2bin(buf, obj->string.pointer + 9, 6);
+   if (ret < 0) {
+   pr_warn("r8152: get_auxiliary_addr: Invalid MAC");
+   goto amacout;
+   }
+   memcpy(sa->sa_data, buf, 6);
+   ether_addr_copy(tp->netdev->dev_addr, sa->sa_data);
+   netdev_info(tp->netdev, "Using system MAC address %pM\n",
+   sa->sa_data);
+   set_bit(MAC_PASSTHRU, >flags);
+   mac_passthru_active = true;
+   ret = 1;
+   }
+
+amacout:
+   kfree(obj);
+   return ret;
+}
+
 static int set_ethernet_addr(struct r8152 *tp)
 {
struct net_device *dev = tp->netdev;
@@ -1041,6 +1088,10 @@ static int set_ethernet_addr(struct r8152 *tp)
else
ret = pla_ocp_read(tp, PLA_BACKUP, 8, sa.sa_data);
 
+   /* if system provides auxiliary MAC address */
+   if (get_auxiliary_addr(tp, &sa) > 0)
+   ret = 0;
+
if (ret < 0) {
netif_err(tp, probe, dev, "Get ether addr fail\n");
} else if (!is_valid_ether_addr(sa.sa_data)) {
@@ -4268,6 +4319,8 @@ static void rtl8152_disconnect(struct usb_interface *intf)
if (udev->state == USB_STATE_NOTATTACHED)
set_bit(RTL8152_UNPLUG, >flags);
 
+   if (test_bit(MAC_PASSTHRU, >flags))
+   mac_passthru_active = false;
netif_napi_del(>napi);
unregister_netdev(tp->netdev);
tp->rtl_ops.unload(tp);
-- 
2.7.4
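
For reference, the string handling above can be tried out in userspace
with a sketch like the one below.  It is not the driver code:
parse_auxmac() and hex_val() are hypothetical helpers standing in for the
strncmp()/hex2bin() calls, and it assumes the 0x17 length in the patch
counts a terminating NUL after the 22 visible characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AUXMAC_PREFIX     "_AUXMAC_#"
#define AUXMAC_PREFIX_LEN 9
#define MAC_LEN           6
/* prefix + 12 hex digits + trailing '#' = 22 visible characters */
#define AUXMAC_STR_LEN    (AUXMAC_PREFIX_LEN + 2 * MAC_LEN + 1)

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse "_AUXMAC_#AABBCCDDEEFF#" into mac[6].
 * Returns 0 on success and -1 on malformed input; mac is only
 * written on success.
 */
static int parse_auxmac(const char *s, unsigned char mac[MAC_LEN])
{
	unsigned char tmp[MAC_LEN];
	size_t i;

	assert(s && mac);
	if (strlen(s) != AUXMAC_STR_LEN)
		return -1;
	if (strncmp(s, AUXMAC_PREFIX, AUXMAC_PREFIX_LEN) != 0)
		return -1;
	if (s[AUXMAC_STR_LEN - 1] != '#')
		return -1;
	for (i = 0; i < MAC_LEN; i++) {
		int hi = hex_val(s[AUXMAC_PREFIX_LEN + 2 * i]);
		int lo = hex_val(s[AUXMAC_PREFIX_LEN + 2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		tmp[i] = (unsigned char)(hi << 4 | lo);
	}
	memcpy(mac, tmp, MAC_LEN);
	return 0;
}
```

Returning 0 or -1 and leaving the output untouched on failure lets a
caller fall back to the regular MAC lookup with a plain `< 0` test.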

Re: [RFC PATCH 0/4] Make inotify instance/watches be accounted per userns

2016-06-02 Thread Eric W. Biederman

Nikolay Borisov  writes:

> On 06/01/2016 07:00 PM, Eric W. Biederman wrote:
>> Cc'd the containers list.
>> 
>> 
>> Nikolay Borisov  writes:
>> 
>>> Currently the inotify instances/watches are being accounted in the 
>>> user_struct structure. This means that in setups where multiple 
>>> users in unprivileged containers map to the same underlying 
>>> real user (e.g. user_struct) the inotify limits are going to be 
>>> shared as well which can lead to unpleasantries. This is a problem
>>> since any user inside any of the containers can potentially exhaust 
>>> the instance/watches limit which in turn might prevent certain 
>>> services from other containers from starting.
>> 
>> On a high level this is a bit problematic as it appears to escape the
>> current limits and allows anyone creating a user namespace to have their
>> own fresh set of limits.  Given that anyone should be able to create a
>> user namespace whenever they feel like, escaping limits is a problem.
>> That however is solvable.
>
> This is indeed a problem and the presented solution is rather dumb in
> that regard. I'm happy to work with you on suggestions so that I arrive
> at a solution that is upstreamable.

The one in-kernel solution to hierarchical resource limits that I am
aware of is the current include/linux/page_counter.h, which evolved from
include/linux/res_counter.h.

>> A practical question.  What kind of limits are we looking at here?
>> 
>> Are these loose limits for detecting buggy programs that have gone
>> off their rails?
>
> Loose limits.
>
>> 
>> Are these tight limits to ensure multitasking is possible?
>> 
>> 
>> 
>> For tight limits where something is actively controlling the limits you
>> probably want a cgroup base solution.
>> 
>> For loose limits that are the kind where you set a good default and
>> forget about I think a user namespace based solution is reasonable.
>
> That's exactly the use case I had in mind.
>
>> 
>>> The solution I propose is rather simple, instead of accounting the 
>>> watches/instances per user_struct, start accounting them in a hashtable, 
>>> where the index used is the hashed pointer of the userns. This way
>>> the administrator needn't set the inotify limits very high and also 
>>> the risk of one container breaching the limits and affecting every 
>>> other container is alleviated.
>> 
>> I don't think this is the right data structure for a user namespace
>> based solution, at least in part because it does not account for users
>> escaping.
>
> Admittedly this is a naive solution. What are your ideas on something
> which achieves my initial aim of having limits per user, yet does not
> allow them to just create another namespace and escape them? The
> current namespace code has a hard-coded limit of 32 for nesting user
> namespaces. So currently at the worst case one can escape the limits up
> to 32 * current_limits.

32 is the nesting depth, not the width of the tree.  But see above.
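
The difference between depth and width can be put in numbers with a small
sketch. It assumes the proposed scheme in which every user namespace gets
its own fresh limit, and it models the namespace tree as perfectly regular:
total_allowed() and its parameters are hypothetical names, and the fan-out
is an illustrative assumption rather than any kernel constant.

```c
#include <assert.h>

/*
 * Hypothetical model: every user namespace gets its own fresh limit.
 * The tree has `depth` nesting levels and each namespace creates
 * `fanout` children.  Overflow is not handled; keep the inputs small.
 */
static unsigned long long total_allowed(unsigned long long per_ns_limit,
					unsigned int depth,
					unsigned int fanout)
{
	unsigned long long namespaces = 0, level = 1;
	unsigned int d;

	assert(depth >= 1);
	for (d = 0; d < depth; d++) {
		namespaces += level; /* namespaces at this nesting level */
		level *= fanout;     /* each one creates `fanout` children */
	}
	return namespaces * per_ns_limit;
}
```

With a fan-out of 1 (a single chain) and a depth of 32 this gives the
32 * current_limits figure from the quoted mail. Any fan-out above 1 makes
the total grow with the number of siblings, which the nesting depth does
not bound.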

Eric

Re: [PATCH 0/2] Quiet noisy LSM denial when accessing net sysctl

2016-06-02 Thread Tyler Hicks

On 05/17/2016 09:13 AM, Tyler Hicks wrote:
> On 05/08/2016 10:56 PM, David Miller wrote:
>> From: Tyler Hicks 
>> Date: Fri,  6 May 2016 18:04:12 -0500
>>
>>> This pair of patches does away with what I believe is a useless denial
>>> audit message when a privileged process initially accesses a net sysctl.
>>
>> The LSM folks can apply this if they agree with you.
> 
> Hi James - Could you pick up these two bug fix patches? Thanks!

Hello - Just checking in again to see if you plan on taking these
through the security tree?

Tyler





Re: [RFC 05/12] nfp: add BPF to NFP code translator

2016-06-02 Thread John Fastabend

On 16-06-01 01:15 PM, Alexei Starovoitov wrote:
> On Wed, Jun 01, 2016 at 10:03:04PM +0200, Daniel Borkmann wrote:
>> On 06/01/2016 06:50 PM, Jakub Kicinski wrote:
>>> Add translator for JITing eBPF to operations which
>>> can be executed on NFP's programmable engines.
>>>
>>> Signed-off-by: Jakub Kicinski 
>>> Reviewed-by: Dinan Gunawardena 
>>> Reviewed-by: Simon Horman 
>> [...]
>>> +int
>>> +nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem, unsigned int prog_start,
>>> +   unsigned int tgt_out, unsigned int tgt_abort,
>>> +   unsigned int prog_sz, struct nfp_bpf_result *res)
>>> +{
>>> +   struct nfp_prog *nfp_prog;
>>> +   int ret;
>>> +
>>> +   /* TODO: maybe make this dependent on bpf_jit_enable? */
>>
>> Probably makes sense to leave it independent from this.
>>
>> Maybe that would rather be an ethtool flag/setting?
> 
> Agree that it should be independent of bpf_jit_enable,
> since that's very different JIT. The whole point of hw offload
> is that bpf is translated into something hw understands natively.
> Gating it by sysctl or another flag doesn't make much sense to me.
> In this case the user will say 'do offload tc+cls_bpf into a nic'
> and nic should either do it or not. No need for ethtool flag either.
> One can argue that bpf_jit_enable=2 was useful for debugging
> of JIT itself, but looks like it was only used by jit developers
> like us, but we would be fine with temp printk while debugging.
> At least there was never a case where jit had a bug and we would
> ask a person reporting a bug to send us back jit_enable=2 output.
> We cannot remove it now, but I wouldn't simply copy the behavior here.
> So I'm suggesting not to use bpf_jit_enable either 1 or 2 at all.
> 

In the default case (no flags to the tc command) the tc filter
tries to load itself in the hardware. The ethtool flag is there
to enable/disable this default behavior. The alternative to the
default load into hardware behavior is to specify it explicitly
via userspace using the 'do offload tc+cls_bpf' as you note. This
was the default behavior folks wanted at netdev conference so I
added it even though for many of my use cases users specify explicitly
if they want offload or not.

Thanks,
John
