Re: [PATCH net 1/4] pegasus: Use heap buffers for all register access

2017-02-07 Thread Petko Manolov
On 17-02-07 10:32:16, Steve Calfee wrote:
> On Mon, Feb 6, 2017 at 4:51 AM, Petko Manolov  wrote:
> > On 17-02-06 09:28:22, Greg KH wrote:
> >> On Mon, Feb 06, 2017 at 10:14:44AM +0200, Petko Manolov wrote:
> >> > On 17-02-05 01:30:39, Greg KH wrote:
> >> > > On Sat, Feb 04, 2017 at 04:56:03PM +, Ben Hutchings wrote:
> >> > > > Allocating USB buffers on the stack is not portable, and no longer
> >> > > > works on x86_64 (with VMAP_STACK enabled as per default).
> >> > >
> >> > > It's never worked on other platforms, so these should go to the stable
> >> > > releases please.
> >> >
> >> > As far as I know both drivers work fine on other platforms, though I
> >> > only tested them on arm and mipsel. ;)
> >>
> >> It all depends on the arm and mips platforms, the ones that can not DMA 
> >> from stack memory are the ones that would always fail here (since the 2.2 
> >> kernel days).
> >
> > Seems like most modern SOCs have decent DMA controllers.
> >
> 
> The real problem is not DMA exactly, it is cache coherency.
> 
> X86 has a coherent cache and all the cpu cores watch DMA transfers and keep 
> the cpu caches up to date.

Yep, these cpus are more user friendly.

> Most ARM and MIPS processors have incoherent caches, so DMA can change memory
> without the CPU caches being updated. The CPU's cached view of memory can then
> differ from what was DMAed in, which makes failures racy and very hard to
> detect or reproduce.

Except for a very few purposes (like framebuffer memory) one would have to be
crazy to use anything but kseg1 for accessing peripherals.  One of the real
problems here is the 512MB limit of the uncached segment.  While the DMA
controller can typically access all available RAM, you'll run into the issues
you describe once there is more memory than that.

MIPS likes doing things small and simple.  Everybody knows that software can
compensate for hardware omissions. ;)

> So all DMA buffers should always be separate allocations from the stack AND 
> not be embedded in structs. Memory allocations are always at least cache line 
> aligned, so coherency is not a problem.

I do agree.
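
As a rough user-space illustration of the rule quoted above (not driver code): the I/O buffer is its own allocation, aligned and sized to whole cache lines, and the context struct only holds a pointer to it. The names `dev_ctx`, `dev_ctx_create()` and the 64-byte `CACHE_LINE` are assumptions made up for this sketch; a kernel driver would use kmalloc() and the DMA API instead.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cache line size for this sketch; real hardware varies. */
#define CACHE_LINE 64

/* Hypothetical device context: holds a pointer to the buffer, not the bytes. */
struct dev_ctx {
	int id;
	uint8_t *io_buf;	/* separate allocation, cache-line aligned */
	size_t io_len;		/* rounded up to whole cache lines */
};

/* Round len up to a whole number of cache lines. */
static size_t round_up_line(size_t len)
{
	return (len + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
}

/* Allocate the context and its I/O buffer as two separate allocations. */
static struct dev_ctx *dev_ctx_create(int id, size_t len)
{
	struct dev_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->io_len = round_up_line(len);
	ctx->io_buf = aligned_alloc(CACHE_LINE, ctx->io_len);
	if (!ctx->io_buf) {
		free(ctx);
		return NULL;
	}
	memset(ctx->io_buf, 0, ctx->io_len);
	ctx->id = id;
	return ctx;
}

static void dev_ctx_destroy(struct dev_ctx *ctx)
{
	if (!ctx)
		return;
	free(ctx->io_buf);
	free(ctx);
}
```

Because the buffer never shares a cache line with other fields, a cache invalidate or writeback for the buffer cannot clobber unrelated data that the CPU touches at the same time.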


cheers,
Petko


RE: [PATCH net v2 1/3] net: phy: Fix PHY module checks

2017-02-07 Thread maowenan


> -----Original Message-----
> From: Florian Fainelli [mailto:f.faine...@gmail.com]
> Sent: Wednesday, February 08, 2017 3:38 PM
> To: netdev@vger.kernel.org
> Cc: da...@davemloft.net; and...@lunn.ch; rmk+ker...@armlinux.org.uk;
> maowenan; Florian Fainelli
> Subject: [PATCH net v2 1/3] net: phy: Fix PHY module checks
> 
> The Generic PHY driver gets assigned after we checked that the current PHY
> driver is NULL, so we need to check a few things before we can safely
> dereference d->driver. Update phy_attach_direct() and phy_detach() accordingly
> to be resilient to these cases.
> 
> Even though the Generic PHY driver defaults to phy_probe() which can hardly
> fail at the moment, let's fix the label so we don't call phy_detach() on a
> network device we have not attached yet.
> 
> Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
> Signed-off-by: Florian Fainelli 
> ---
>  drivers/net/phy/phy_device.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 0d8f4d3847f6..bde240bf8d7b 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -920,7 +920,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>   return -EIO;
>   }
> 
> - if (!try_module_get(d->driver->owner)) {
> + if (d->driver && !try_module_get(d->driver->owner)) {
>   dev_err(&dev->dev, "failed to get the device driver module\n");
>   return -EIO;

Hi Florian,
  Here, "return -EIO" returns without decreasing the bus->owner reference count.

Mao Wenan

>   }
> @@ -943,7 +943,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
>   err = device_bind_driver(d);
> 
>   if (err)
> - goto error;
> + goto error_put_device;
>   }
> 
>   if (phydev->attached_dev) {
> @@ -981,6 +981,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
> 
>  error:
>   phy_detach(phydev);
> +error_put_device:
>   put_device(d);
>   module_put(d->driver->owner);
>   if (ndev_owner != bus->owner)
> @@ -1065,7 +1066,8 @@ void phy_detach(struct phy_device *phydev)
>   bus = phydev->mdio.bus;
> 
>   put_device(&phydev->mdio.dev);
> - module_put(phydev->mdio.dev.driver->owner);
> + if (phydev->mdio.dev.driver)
> + module_put(phydev->mdio.dev.driver->owner);
>   if (ndev_owner != bus->owner)
>   module_put(bus->owner);
>  }
> --
> 2.9.3
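
The remark above about the early "return -EIO" can be illustrated with a small user-space sketch of goto-based unwinding: once the first reference has been taken, every later failure has to jump to a label that drops it. The `toy_module` type and the `toy_*` helpers are hypothetical stand-ins, not the kernel's try_module_get()/module_put().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module with a plain reference counter. */
struct toy_module {
	int refs;
	bool loadable;	/* when false, taking a reference fails */
};

static bool toy_module_get(struct toy_module *m)
{
	if (!m->loadable)
		return false;
	m->refs++;
	return true;
}

static void toy_module_put(struct toy_module *m)
{
	m->refs--;
}

/*
 * Take the bus reference first, then the optional driver reference.
 * If the second step fails, unwind the first before returning.
 */
static int toy_attach(struct toy_module *bus, struct toy_module *drv)
{
	int err;

	if (!toy_module_get(bus))
		return -EIO;		/* nothing taken yet, plain return is fine */

	if (drv && !toy_module_get(drv)) {
		err = -EIO;
		goto error_put_bus;	/* the bus reference must be dropped */
	}

	return 0;

error_put_bus:
	toy_module_put(bus);
	return err;
}
```

After a failed toy_attach() the bus counter is back at its previous value, which is the property the review comment asks for.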



Re: [PATCH v3 1/2] mac80211: fils_aead: Use crypto api CMAC shash rather than bare cipher

2017-02-07 Thread Johannes Berg
On Wed, 2017-02-08 at 07:45 +, Ard Biesheuvel wrote:
> On 8 February 2017 at 07:00, Johannes Berg wrote:
> > This looks strange to me:
> > 
> > > +static int aes_s2v(struct crypto_shash *tfm,
> > >  size_t num_elem, const u8 *addr[], size_t len[],
> > > u8 *v)
> > >  {
> > > - u8 d[AES_BLOCK_SIZE], tmp[AES_BLOCK_SIZE];
> > > + u8 d[AES_BLOCK_SIZE], tmp[AES_BLOCK_SIZE] = {};
> > > + SHASH_DESC_ON_STACK(desc, tfm);
> > 
> > desc declared
> > 
> > > 
> > > + crypto_shash_digest(desc, tmp, AES_BLOCK_SIZE, d);
> > 
> > used here
> > 
> 
> Each digest() call combines an init()/update()/final() sequence
> 
> > > + crypto_shash_init(desc);
> > 
> > but initialized now?
> > 
> 
> ... for the 6th time, or so. The final vector may require two
> update()s, so we cannot use digest() here. But we can use finup() for
> the last one, which combines update() and final().
> 
> Hence,
> 
> init()/finup()
> 
> or
> 
> init()/update()/finup()
> 
> depending on the length of the last vector.

Great, thanks for the explanation :)

johannes


Re: [PATCH v3 1/2] mac80211: fils_aead: Use crypto api CMAC shash rather than bare cipher

2017-02-07 Thread Ard Biesheuvel
On 8 February 2017 at 07:00, Johannes Berg  wrote:
> This looks strange to me:
>
>> +static int aes_s2v(struct crypto_shash *tfm,
>>  size_t num_elem, const u8 *addr[], size_t len[],
>> u8 *v)
>>  {
>> - u8 d[AES_BLOCK_SIZE], tmp[AES_BLOCK_SIZE];
>> + u8 d[AES_BLOCK_SIZE], tmp[AES_BLOCK_SIZE] = {};
>> + SHASH_DESC_ON_STACK(desc, tfm);
>
> desc declared
>
>>
>> + crypto_shash_digest(desc, tmp, AES_BLOCK_SIZE, d);
>
> used here
>

Each digest() call combines an init()/update()/final() sequence

>> + crypto_shash_init(desc);
>
> but initialized now?
>

... for the 6th time, or so. The final vector may require two
update()s, so we cannot use digest() here. But we can use finup() for
the last one, which combines update() and final().

Hence,

init()/finup()

or

init()/update()/finup()

depending on the length of the last vector.
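
The relationship between the calls described above can be sketched with a toy streaming hash in user space. This is only an illustration: the `toy_*` functions and the FNV-1a hash are stand-ins chosen for the sketch, not the kernel crypto API or CMAC.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy streaming hash (FNV-1a), standing in for a real shash transform. */
struct toy_desc {
	uint32_t state;
};

static void toy_init(struct toy_desc *d)
{
	d->state = 2166136261u;
}

static void toy_update(struct toy_desc *d, const uint8_t *data, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		d->state ^= data[i];
		d->state *= 16777619u;
	}
}

static uint32_t toy_final(struct toy_desc *d)
{
	return d->state;
}

/* finup() = update() + final() */
static uint32_t toy_finup(struct toy_desc *d, const uint8_t *data, size_t len)
{
	toy_update(d, data, len);
	return toy_final(d);
}

/* digest() = init() + update() + final(), so it needs no prior init() */
static uint32_t toy_digest(struct toy_desc *d, const uint8_t *data, size_t len)
{
	toy_init(d);
	return toy_finup(d, data, len);
}
```

With this toy hash, init()/finup() over a whole message and init()/update()/finup() over the same message split in two give the same result as a single digest() call.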


[PATCH net v2 3/3] net: phy: Fix PHY driver bind and unbind events

2017-02-07 Thread Florian Fainelli
The PHY library does not deal very well with bind and unbind events. The first
thing we would see is that we were not properly canceling the PHY state machine
workqueue, so we would be crashing while dereferencing phydev->drv since there
is no driver attached anymore.

Once we fix that, there are several things that did not quite work as expected:

- if the PHY state machine was running, we were not stopping it properly, and
  the state machine state would not be marked as such
- when we rebind the driver, nothing would happen, since we would not know which
  state we were before the unbind

This patch takes the following approach:

- if the PHY was attached, and the state machine was running we would stop it,
  remember where we left off, and schedule the state machine for restart upon
  driver bind
- if the PHY was attached, but HALTED, we would leave it in that state, and not
  alter the state upon driver bind
- in all other cases (detached) we would keep the PHY in DOWN state waiting for
  a network driver to show up, and set PHY_READY on driver bind

Suggested-by: Russell King 
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/phy_device.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index bde240bf8d7b..5314e764a387 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1711,6 +1711,7 @@ static int phy_probe(struct device *dev)
struct phy_device *phydev = to_phy_device(dev);
struct device_driver *drv = phydev->mdio.dev.driver;
struct phy_driver *phydrv = to_phy_driver(drv);
+   bool should_start = false;
int err = 0;
 
phydev->drv = phydrv;
@@ -1760,24 +1761,46 @@ static int phy_probe(struct device *dev)
}
 
/* Set the state to READY by default */
-   phydev->state = PHY_READY;
+   if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
+   should_start = true;
+   else
+   phydev->state = PHY_READY;
 
if (phydev->drv->probe)
err = phydev->drv->probe(phydev);
 
mutex_unlock(&phydev->lock);
 
+   if (should_start)
+   phy_start(phydev);
+
return err;
 }
 
 static int phy_remove(struct device *dev)
 {
struct phy_device *phydev = to_phy_device(dev);
+   bool should_stop = false;
+   enum phy_state state;
+
+   cancel_delayed_work_sync(&phydev->state_queue);
 
mutex_lock(&phydev->lock);
-   phydev->state = PHY_DOWN;
+   state = phydev->state;
+   if (state > PHY_UP && state != PHY_HALTED)
+   should_stop = true;
+   else
+   phydev->state = PHY_DOWN;
mutex_unlock(&phydev->lock);
 
+   /* phy_stop() sets the state to HALTED, undo that for the ->probe() function
+* to have a chance to resume where we left
+*/
+   if (should_stop) {
+   phy_stop(phydev);
+   phydev->state = state;
+   }
+
if (phydev->drv->remove)
phydev->drv->remove(phydev);
phydev->drv = NULL;
-- 
2.9.3
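
The state checks in the diff above can be read as three small decisions, sketched here in user-space C. The `toy_*` names are hypothetical, and the ordering of the enum is an assumption meant to mirror enum phy_state in include/linux/phy.h at the time; the comparison `> TOY_UP` only works under that ordering.

```c
#include <stdbool.h>

/*
 * Toy copy of the PHY states. The ordering is an assumption of this
 * sketch and matters because the check below compares against TOY_UP.
 */
enum toy_phy_state {
	TOY_DOWN = 0, TOY_STARTING, TOY_READY, TOY_PENDING, TOY_UP,
	TOY_AN, TOY_RUNNING, TOY_NOLINK, TOY_FORCING, TOY_CHANGELINK,
	TOY_HALTED, TOY_RESUMING,
};

/* True when the state machine was running and should resume on rebind. */
static bool toy_was_running(enum toy_phy_state s)
{
	return s > TOY_UP && s != TOY_HALTED;
}

/* State kept across unbind: remembered if running, otherwise set to DOWN. */
static enum toy_phy_state toy_state_after_remove(enum toy_phy_state s)
{
	return toy_was_running(s) ? s : TOY_DOWN;
}

/* State left by the probe-side check, before any restart is scheduled. */
static enum toy_phy_state toy_state_after_probe(enum toy_phy_state s)
{
	return toy_was_running(s) ? s : TOY_READY;
}
```

For example, a PHY unbound in the NOLINK state keeps NOLINK across remove and probe, so the probe side knows it has to restart the state machine, while a PHY that was only READY goes to DOWN and comes back as READY.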



[PATCH 1/3 net-next] enic: add devcmds for vxlan offload

2017-02-07 Thread Govindarajulu Varadarajan
This patch adds the devcmds needed for vxlan offload. It implements 3 new devcmds:

overlay_offload_ctrl: enable/disable offload
overlay_offload_cfg: update offload udp port number
get_supported_feature_ver: get hw supported offload version. Each
   version has different bitmap for csum_ok/encap
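
Since get_supported_feature_ver reports a bitmap of versions, picking the version to use could look like the sketch below. The function name and the convention that bit n stands for version n are assumptions made for illustration; the patch itself does not spell out the numbering.

```c
#include <stdint.h>

/*
 * Return the highest version number whose bit is set in both bitmaps,
 * or -1 if the two sides share no version. Bit n stands for version n;
 * that numbering is an assumption of this sketch.
 */
static int highest_common_version(uint64_t driver_versions,
				  uint64_t fw_versions)
{
	uint64_t common = driver_versions & fw_versions;
	int ver = -1;

	/* Shift until no bits remain; ver ends at the top set bit. */
	while (common) {
		ver++;
		common >>= 1;
	}
	return ver;
}
```

A driver supporting versions 0 to 2 (bitmap 0x7) talking to firmware that reports versions 0 and 2 (bitmap 0x5) would settle on version 2.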

Signed-off-by: Govindarajulu Varadarajan 
---
 drivers/net/ethernet/cisco/enic/vnic_dev.c| 42 ++
 drivers/net/ethernet/cisco/enic/vnic_dev.h|  5 +++
 drivers/net/ethernet/cisco/enic/vnic_devcmd.h | 51 +++
 drivers/net/ethernet/cisco/enic/vnic_enet.h   |  1 +
 4 files changed, 99 insertions(+)

diff --git a/drivers/net/ethernet/cisco/enic/vnic_dev.c b/drivers/net/ethernet/cisco/enic/vnic_dev.c
index 8f27df3207bc..9b3d670e1aa9 100644
--- a/drivers/net/ethernet/cisco/enic/vnic_dev.c
+++ b/drivers/net/ethernet/cisco/enic/vnic_dev.c
@@ -1247,3 +1247,45 @@ int vnic_dev_classifier(struct vnic_dev *vdev, u8 cmd, u16 *entry,
 
return ret;
 }
+
+int vnic_dev_overlay_offload_ctrl(struct vnic_dev *vdev, u8 overlay, u8 config)
+{
+   u64 a0;
+   u64 a1;
+   int wait = 1000;
+   int ret;
+
+   a0 = overlay;
+   a1 = config;
+
+   ret = vnic_dev_cmd(vdev, CMD_OVERLAY_OFFLOAD_CTRL, &a0, &a1, wait);
+
+   return ret;
+}
+
+int vnic_dev_overlay_offload_cfg(struct vnic_dev *vdev, u8 overlay,
+u16 vxlan_udp_port_number)
+{
+   u64 a0, a1;
+   int wait = 1000;
+
+   a0 = overlay;
+   a1 = vxlan_udp_port_number;
+
+   return vnic_dev_cmd(vdev, CMD_OVERLAY_OFFLOAD_CFG, &a0, &a1, wait);
+}
+
+int vnic_dev_get_supported_feature_ver(struct vnic_dev *vdev, u8 feature,
+  u64 *supported_versions)
+{
+   u64 a0 = feature;
+   u64  a1 = 0;
+   int wait = 1000;
+   int ret;
+
+   ret = vnic_dev_cmd(vdev, CMD_GET_SUPP_FEATURE_VER, &a0, &a1, wait);
+   if (!ret)
+   *supported_versions = a0;
+
+   return ret;
+}
diff --git a/drivers/net/ethernet/cisco/enic/vnic_dev.h b/drivers/net/ethernet/cisco/enic/vnic_dev.h
index 54156c484424..9d43d6bb9907 100644
--- a/drivers/net/ethernet/cisco/enic/vnic_dev.h
+++ b/drivers/net/ethernet/cisco/enic/vnic_dev.h
@@ -179,5 +179,10 @@ int vnic_dev_set_mac_addr(struct vnic_dev *vdev, u8 *mac_addr);
 int vnic_dev_classifier(struct vnic_dev *vdev, u8 cmd, u16 *entry,
struct filter *data);
 int vnic_devcmd_init(struct vnic_dev *vdev);
+int vnic_dev_overlay_offload_ctrl(struct vnic_dev *vdev, u8 overlay, u8 config);
+int vnic_dev_overlay_offload_cfg(struct vnic_dev *vdev, u8 overlay,
+u16 vxlan_udp_port_number);
+int vnic_dev_get_supported_feature_ver(struct vnic_dev *vdev, u8 feature,
+  u64 *supported_versions);
 
 #endif /* _VNIC_DEV_H_ */
diff --git a/drivers/net/ethernet/cisco/enic/vnic_devcmd.h b/drivers/net/ethernet/cisco/enic/vnic_devcmd.h
index 2a812880b884..d83880b0d468 100644
--- a/drivers/net/ethernet/cisco/enic/vnic_devcmd.h
+++ b/drivers/net/ethernet/cisco/enic/vnic_devcmd.h
@@ -406,6 +406,31 @@ enum vnic_devcmd_cmd {
 * in: (u32) a0=Queue Pair number
 */
CMD_QP_STATS_CLEAR = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 63),
+
+   /* Use this devcmd for agreeing on the highest common version supported
+* by both driver and fw for features who need such a facility.
+* in:  (u64) a0 = feature (driver requests for the supported versions
+*  on this feature)
+* out: (u64) a0 = bitmap of all supported versions for that feature
+*/
+   CMD_GET_SUPP_FEATURE_VER = _CMDC(_CMD_DIR_RW, _CMD_VTYPE_ENET, 69),
+
+   /* Control (Enable/Disable) overlay offloads on the given vnic
+* in: (u8) a0 = OVERLAY_FEATURE_NVGRE : NVGRE
+*  a0 = OVERLAY_FEATURE_VXLAN : VxLAN
+* in: (u8) a1 = OVERLAY_OFFLOAD_ENABLE : Enable or
+*  a1 = OVERLAY_OFFLOAD_DISABLE : Disable or
+*  a1 = OVERLAY_OFFLOAD_ENABLE_V2 : Enable with version 2
+*/
+   CMD_OVERLAY_OFFLOAD_CTRL = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 72),
+
+   /* Configuration of overlay offloads feature on a given vNIC
+* in: (u8) a0 = DEVCMD_OVERLAY_NVGRE : NVGRE
+*  a0 = DEVCMD_OVERLAY_VXLAN : VxLAN
+* in: (u8) a1 = VXLAN_PORT_UPDATE : VxLAN
+* in: (u16) a2 = unsigned short int port information
+*/
+   CMD_OVERLAY_OFFLOAD_CFG = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 73),
 };
 
 /* CMD_ENABLE2 flags */
@@ -657,4 +682,30 @@ struct devcmd2_result {
 #define DEVCMD2_RING_SIZE  32
 #define DEVCMD2_DESC_SIZE  128
 
+enum overlay_feature_t {
+   OVERLAY_FEATURE_NVGRE = 1,
+   OVERLAY_FEATURE_VXLAN,
+   OVERLAY_FEATURE_MAX,
+};
+
+enum overlay_ofld_cmd {
+   OVERLAY_OFFLOAD_ENABLE,
+   OVERLAY_OFFLOAD_DISABLE,
+   

[PATCH net v2 0/3] net: phy: Unbind/bind fixes

2017-02-07 Thread Florian Fainelli
Hi all,

This patch series addresses the inability to safely unbind and bind
PHY drivers by making the appropriate checks throughout PHYLIB where we
may be directly responding to user-space queries, as well as from within
the kernel state machine.

The last patch makes unbind -> bind work by taking care of the
PHY state machine state.

Changes in v2:

- fixed "net: phy: Fix lack of reference count on PHY driver" so that it
  copes with the Generic PHY driver, which is special

Florian Fainelli (3):
  net: phy: Fix PHY module checks
  net: phy: Check phydev->drv
  net: phy: Fix PHY driver bind and unbind events

 drivers/net/phy/phy.c| 26 ++
 drivers/net/phy/phy_device.c | 35 ++-
 include/linux/phy.h  |  3 +++
 3 files changed, 55 insertions(+), 9 deletions(-)

-- 
2.9.3



[PATCH net v2 1/3] net: phy: Fix PHY module checks

2017-02-07 Thread Florian Fainelli
The Generic PHY driver gets assigned after we checked that the current PHY
driver is NULL, so we need to check a few things before we can safely
dereference d->driver. Update phy_attach_direct() and phy_detach() accordingly
to be resilient to these cases.

Even though the Generic PHY driver defaults to phy_probe() which can hardly
fail at the moment, let's fix the label so we don't call phy_detach() on a
network device we have not attached yet.

Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/phy_device.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 0d8f4d3847f6..bde240bf8d7b 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -920,7 +920,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
return -EIO;
}
 
-   if (!try_module_get(d->driver->owner)) {
+   if (d->driver && !try_module_get(d->driver->owner)) {
dev_err(&dev->dev, "failed to get the device driver module\n");
return -EIO;
}
@@ -943,7 +943,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
err = device_bind_driver(d);
 
if (err)
-   goto error;
+   goto error_put_device;
}
 
if (phydev->attached_dev) {
@@ -981,6 +981,7 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 
 error:
phy_detach(phydev);
+error_put_device:
put_device(d);
module_put(d->driver->owner);
if (ndev_owner != bus->owner)
@@ -1065,7 +1066,8 @@ void phy_detach(struct phy_device *phydev)
bus = phydev->mdio.bus;
 
put_device(&phydev->mdio.dev);
-   module_put(phydev->mdio.dev.driver->owner);
+   if (phydev->mdio.dev.driver)
+   module_put(phydev->mdio.dev.driver->owner);
if (ndev_owner != bus->owner)
module_put(bus->owner);
 }
-- 
2.9.3



[PATCH net v2 2/3] net: phy: Check phydev->drv

2017-02-07 Thread Florian Fainelli
In preparation for supporting driver bind/unbind properly, sprinkle checks on
phydev->drv where we may call into PHYLIB from user-space or other parts of the
kernel.

Suggested-by: Russell King 
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/phy.c | 26 ++
 include/linux/phy.h   |  3 +++
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 7cc1b7dcfe05..d6f7838455dd 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -580,7 +580,7 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd)
return 0;
 
case SIOCSHWTSTAMP:
-   if (phydev->drv->hwtstamp)
+   if (phydev->drv && phydev->drv->hwtstamp)
return phydev->drv->hwtstamp(phydev, ifr);
/* fall through */
 
@@ -603,6 +603,9 @@ int phy_start_aneg(struct phy_device *phydev)
 {
int err;
 
+   if (!phydev->drv)
+   return -EIO;
+
mutex_lock(&phydev->lock);
 
if (AUTONEG_DISABLE == phydev->autoneg)
@@ -975,7 +978,7 @@ void phy_state_machine(struct work_struct *work)
 
old_state = phydev->state;
 
-   if (phydev->drv->link_change_notify)
+   if (phydev->drv && phydev->drv->link_change_notify)
phydev->drv->link_change_notify(phydev);
 
switch (phydev->state) {
@@ -1286,6 +1289,9 @@ EXPORT_SYMBOL(phy_write_mmd_indirect);
  */
 int phy_init_eee(struct phy_device *phydev, bool clk_stop_enable)
 {
+   if (!phydev->drv)
+   return -EIO;
+
/* According to 802.3az,the EEE is supported only in full duplex-mode.
 * Also EEE feature is active when core is operating with MII, GMII
 * or RGMII (all kinds). Internal PHYs are also allowed to proceed and
@@ -1363,6 +1369,9 @@ EXPORT_SYMBOL(phy_init_eee);
  */
 int phy_get_eee_err(struct phy_device *phydev)
 {
+   if (!phydev->drv)
+   return -EIO;
+
return phy_read_mmd_indirect(phydev, MDIO_PCS_EEE_WK_ERR, MDIO_MMD_PCS);
 }
 EXPORT_SYMBOL(phy_get_eee_err);
@@ -1379,6 +1388,9 @@ int phy_ethtool_get_eee(struct phy_device *phydev, struct ethtool_eee *data)
 {
int val;
 
+   if (!phydev->drv)
+   return -EIO;
+
/* Get Supported EEE */
val = phy_read_mmd_indirect(phydev, MDIO_PCS_EEE_ABLE, MDIO_MMD_PCS);
if (val < 0)
@@ -1412,6 +1424,9 @@ int phy_ethtool_set_eee(struct phy_device *phydev, struct ethtool_eee *data)
 {
int val = ethtool_adv_to_mmd_eee_adv_t(data->advertised);
 
+   if (!phydev->drv)
+   return -EIO;
+
/* Mask prohibited EEE modes */
val &= ~phydev->eee_broken_modes;
 
@@ -1423,7 +1438,7 @@ EXPORT_SYMBOL(phy_ethtool_set_eee);
 
 int phy_ethtool_set_wol(struct phy_device *phydev, struct ethtool_wolinfo *wol)
 {
-   if (phydev->drv->set_wol)
+   if (phydev->drv && phydev->drv->set_wol)
return phydev->drv->set_wol(phydev, wol);
 
return -EOPNOTSUPP;
@@ -1432,7 +1447,7 @@ EXPORT_SYMBOL(phy_ethtool_set_wol);
 
 void phy_ethtool_get_wol(struct phy_device *phydev, struct ethtool_wolinfo *wol)
 {
-   if (phydev->drv->get_wol)
+   if (phydev->drv && phydev->drv->get_wol)
phydev->drv->get_wol(phydev, wol);
 }
 EXPORT_SYMBOL(phy_ethtool_get_wol);
@@ -1468,6 +1483,9 @@ int phy_ethtool_nway_reset(struct net_device *ndev)
if (!phydev)
return -ENODEV;
 
+   if (!phydev->drv)
+   return -EIO;
+
return genphy_restart_aneg(phydev);
 }
 EXPORT_SYMBOL(phy_ethtool_nway_reset);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 7fc1105605bf..231e07bb0d76 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -802,6 +802,9 @@ int phy_stop_interrupts(struct phy_device *phydev);
 
 static inline int phy_read_status(struct phy_device *phydev)
 {
+   if (!phydev->drv)
+   return -EIO;
+
return phydev->drv->read_status(phydev);
 }
 
-- 
2.9.3
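
The two flavours of check used in the diff above can be sketched in user-space C: operations that cannot work without a driver return -EIO, while optional callbacks treat "no driver" the same as "not implemented". The `toy_phy` types and the `demo_*` callbacks are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stddef.h>

struct toy_phy;

/* Hypothetical driver ops table; set_wol may be left NULL. */
struct toy_phy_driver {
	int (*read_status)(struct toy_phy *phy);
	int (*set_wol)(struct toy_phy *phy, int enable);
};

struct toy_phy {
	const struct toy_phy_driver *drv;	/* NULL while no driver is bound */
	int wol_enabled;
};

/* Mandatory operation: fail with -EIO when no driver is bound. */
static int toy_read_status(struct toy_phy *phy)
{
	if (!phy->drv)
		return -EIO;
	return phy->drv->read_status(phy);
}

/* Optional operation: unbound or unimplemented both mean "not supported". */
static int toy_set_wol(struct toy_phy *phy, int enable)
{
	if (phy->drv && phy->drv->set_wol)
		return phy->drv->set_wol(phy, enable);
	return -EOPNOTSUPP;
}

/* Trivial callbacks used to exercise the two wrappers. */
static int demo_read_status(struct toy_phy *phy)
{
	(void)phy;
	return 0;
}

static int demo_set_wol(struct toy_phy *phy, int enable)
{
	phy->wol_enabled = enable;
	return 0;
}
```

Checking the driver pointer before the callback pointer matters: the second test dereferences the first.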



[PATCH 2/3 net-next] enic: add udp_tunnel ndo for vxlan offload

2017-02-07 Thread Govindarajulu Varadarajan
Defines enic_udp_tunnel_add/del for configuring vxlan tunnel offload.
enic supports offload of only one ipv4/udp port.

There are two modes that fw supports for vxlan offload.

mode 0: the fcoe bit is set for an encapsulated packet. fcoe_fc_crc_ok is set
if the checksum is ok. This bit is the OR of ip_csum_ok and
tcp_udp_csum_ok.

mode 2: BIT(0) in rss_hash is set if it is an encapsulated packet.
BIT(1) is set if outer_ip_csum_ok.
BIT(2) is set if outer_tcp_csum_ok.

tcp_udp_csum_ok/ipv4_csum_ok is set if inner csum is OK.
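
As a sketch of how the mode 2 bits described above might be decoded on the receive side (user-space C, with made-up `TOY_*` macro names and a hypothetical `toy_rx_info` struct; the driver's actual field handling is not shown in this message):

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_BIT(n) (1u << (n))

/* Bit positions as described for mode 2; the macro names are made up. */
#define TOY_RSS_ENCAP		TOY_BIT(0)
#define TOY_RSS_OUTER_IP_OK	TOY_BIT(1)
#define TOY_RSS_OUTER_TCP_OK	TOY_BIT(2)

struct toy_rx_info {
	bool encap;
	bool outer_ip_csum_ok;
	bool outer_tcp_csum_ok;
};

/* Decode the mode 2 flag bits carried in the rss_hash field. */
static struct toy_rx_info toy_decode_mode2(uint32_t rss_hash)
{
	struct toy_rx_info info;

	info.encap = (rss_hash & TOY_RSS_ENCAP) != 0;
	info.outer_ip_csum_ok = (rss_hash & TOY_RSS_OUTER_IP_OK) != 0;
	info.outer_tcp_csum_ok = (rss_hash & TOY_RSS_OUTER_TCP_OK) != 0;
	return info;
}
```

How the outer and inner checksum results are combined into a final verdict for the stack is left out here, since that policy belongs to the driver.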

Signed-off-by: Govindarajulu Varadarajan 
---
 drivers/net/ethernet/cisco/enic/enic.h  |   6 ++
 drivers/net/ethernet/cisco/enic/enic_main.c | 156 +++-
 2 files changed, 159 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic.h b/drivers/net/ethernet/cisco/enic/enic.h
index 9023c858715d..2b23f46b34d3 100644
--- a/drivers/net/ethernet/cisco/enic/enic.h
+++ b/drivers/net/ethernet/cisco/enic/enic.h
@@ -135,6 +135,11 @@ struct enic_rfs_flw_tbl {
struct timer_list rfs_may_expire;
 };
 
+struct vxlan_offload {
+   u16 vxlan_udp_port_number;
+   u8 patch_level;
+};
+
 /* Per-instance private data structure */
 struct enic {
struct net_device *netdev;
@@ -175,6 +180,7 @@ struct enic {
/* receive queue cache line section */
cacheline_aligned struct vnic_rq rq[ENIC_RQ_MAX];
unsigned int rq_count;
+   struct vxlan_offload vxlan;
u64 rq_truncated_pkts;
u64 rq_bad_fcs;
struct napi_struct napi[ENIC_RQ_MAX + ENIC_WQ_MAX];
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index c009f6ddabf7..ce5ce394a810 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -45,6 +45,7 @@
 #endif
 #include 
 #include 
+#include 
 
 #include "cq_enet_desc.h"
 #include "vnic_dev.h"
@@ -176,6 +177,92 @@ static void enic_unset_affinity_hint(struct enic *enic)
irq_set_affinity_hint(enic->msix_entry[i].vector, NULL);
 }
 
+static void enic_udp_tunnel_add(struct net_device *netdev,
+   struct udp_tunnel_info *ti)
+{
+   struct enic *enic = netdev_priv(netdev);
+   int err;
+   __be16 port = ti->port;
+
+   spin_lock_bh(&enic->devcmd_lock);
+
+   if (ti->type != UDP_TUNNEL_TYPE_VXLAN) {
+   netdev_info(netdev, "udp_tnl: only vxlan tunnel offload supported");
+   goto error;
+   }
+
+   if (ti->sa_family != AF_INET) {
+   netdev_info(netdev, "vxlan: only IPv4 offload supported");
+   goto error;
+   }
+
+   if (enic->vxlan.vxlan_udp_port_number) {
+   if (ntohs(port) == enic->vxlan.vxlan_udp_port_number)
+   netdev_warn(netdev, "vxlan: udp port already offloaded");
+   else
+   netdev_info(netdev, "vxlan: offload supported for only one UDP port");
+
+   goto error;
+   }
+
+   err = vnic_dev_overlay_offload_cfg(enic->vdev,
+  OVERLAY_CFG_VXLAN_PORT_UPDATE,
+  ntohs(port));
+   if (err)
+   goto error;
+
+   err = vnic_dev_overlay_offload_ctrl(enic->vdev, OVERLAY_FEATURE_VXLAN,
+   enic->vxlan.patch_level);
+   if (err)
+   goto error;
+
+   enic->vxlan.vxlan_udp_port_number = ntohs(port);
+
+   netdev_info(netdev, "vxlan fw-vers-%d: offload enabled for udp port: %d, sa_family: %d ",
+   (int)enic->vxlan.patch_level, ntohs(port), ti->sa_family);
+
+   goto unlock;
+
+error:
+   netdev_info(netdev, "failed to offload udp port: %d, sa_family: %d, type: %d",
+   ntohs(port), ti->sa_family, ti->type);
+unlock:
+   spin_unlock_bh(&enic->devcmd_lock);
+}
+
+static void enic_udp_tunnel_del(struct net_device *netdev,
+   struct udp_tunnel_info *ti)
+{
+   struct enic *enic = netdev_priv(netdev);
+   int err;
+
+   spin_lock_bh(&enic->devcmd_lock);
+
+   if ((ti->sa_family != AF_INET) ||
+   ((ntohs(ti->port) != enic->vxlan.vxlan_udp_port_number)) ||
+   (ti->type != UDP_TUNNEL_TYPE_VXLAN)) {
+   netdev_info(netdev, "udp_tnl: port:%d, sa_family: %d, type: %d not offloaded",
+   ntohs(ti->port), ti->sa_family, ti->type);
+   goto unlock;
+   }
+
+   err = vnic_dev_overlay_offload_ctrl(enic->vdev, OVERLAY_FEATURE_VXLAN,
+   OVERLAY_OFFLOAD_DISABLE);
+   if (err) {
+   netdev_err(netdev, "vxlan: del offload udp port: %d failed",
+  ntohs(ti->port));
+   goto unlock;
+   }
+
+   enic->vxlan.vxlan_udp_port_number = 0;
+
+   netdev_info(netdev, "vxlan: del 

[PATCH 3/3 net-next] enic: add vxlan offload on tx path

2017-02-07 Thread Govindarajulu Varadarajan
Define ndo_features_check. Hw supports offload only for ipv4 inner and
ipv4 outer pkt.

Code refactor for setting inner tcp pseudo csum.

Signed-off-by: Govindarajulu Varadarajan 
---
 drivers/net/ethernet/cisco/enic/enic_main.c | 126 +---
 1 file changed, 114 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index ce5ce394a810..9cfebdb22e81 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -263,6 +263,48 @@ static void enic_udp_tunnel_del(struct net_device *netdev,
spin_unlock_bh(&enic->devcmd_lock);
 }
 
+static netdev_features_t enic_features_check(struct sk_buff *skb,
+struct net_device *dev,
+netdev_features_t features)
+{
+   struct enic *enic = netdev_priv(dev);
+   struct udphdr *udph;
+   u16 proto;
+   u16 port = 0;
+   const struct ethhdr *eth = (struct ethhdr *)skb_inner_mac_header(skb);
+
+   if (!skb->encapsulation)
+   return features;
+
+   features = vxlan_features_check(skb, features);
+
+   /* hardware only supports IPv4 vxlan tunnel */
+   if (vlan_get_protocol(skb) != htons(ETH_P_IP))
+   goto out;
+
+   /* hardware does not support offload of ipv6 inner pkt */
+   if (eth->h_proto != htons(ETH_P_IP))
+   goto out;
+
+   proto = ip_hdr(skb)->protocol;
+
+   if (proto == IPPROTO_UDP) {
+   udph = udp_hdr(skb);
+   port = be16_to_cpu(udph->dest);
+   }
+
+   /* HW supports offload of only one UDP port. Remove CSUM and GSO MASK
+* for other UDP port tunnels
+*/
+   if (port  != enic->vxlan.vxlan_udp_port_number)
+   goto out;
+
+   return features;
+
+out:
+   return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
+}
+
 int enic_is_dynamic(struct enic *enic)
 {
return enic->pdev->device == PCI_DEVICE_ID_CISCO_VIC_ENET_DYN;
@@ -591,20 +633,19 @@ static int enic_queue_wq_skb_csum_l4(struct enic *enic, struct vnic_wq *wq,
return err;
 }
 
-static int enic_queue_wq_skb_tso(struct enic *enic, struct vnic_wq *wq,
-struct sk_buff *skb, unsigned int mss,
-int vlan_tag_insert, unsigned int vlan_tag,
-int loopback)
+static void enic_preload_tcp_csum_encap(struct sk_buff *skb)
 {
-   unsigned int frag_len_left = skb_headlen(skb);
-   unsigned int len_left = skb->len - frag_len_left;
-   unsigned int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
-   int eop = (len_left == 0);
-   unsigned int len;
-   dma_addr_t dma_addr;
-   unsigned int offset = 0;
-   skb_frag_t *frag;
+   if (skb->protocol == cpu_to_be16(ETH_P_IP)) {
+   inner_ip_hdr(skb)->check = 0;
+   inner_tcp_hdr(skb)->check =
+   ~csum_tcpudp_magic(inner_ip_hdr(skb)->saddr,
+  inner_ip_hdr(skb)->daddr, 0,
+  IPPROTO_TCP, 0);
+   }
+}
 
+static void enic_preload_tcp_csum(struct sk_buff *skb)
+{
/* Preload TCP csum field with IP pseudo hdr calculated
 * with IP length set to zero.  HW will later add in length
 * to each TCP segment resulting from the TSO.
@@ -618,6 +659,30 @@ static int enic_queue_wq_skb_tso(struct enic *enic, struct vnic_wq *wq,
tcp_hdr(skb)->check = ~csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
&ipv6_hdr(skb)->daddr, 0, IPPROTO_TCP, 0);
}
+}
+
+static int enic_queue_wq_skb_tso(struct enic *enic, struct vnic_wq *wq,
+struct sk_buff *skb, unsigned int mss,
+int vlan_tag_insert, unsigned int vlan_tag,
+int loopback)
+{
+   unsigned int frag_len_left = skb_headlen(skb);
+   unsigned int len_left = skb->len - frag_len_left;
+   unsigned int hdr_len;
+   int eop = (len_left == 0);
+   unsigned int len;
+   dma_addr_t dma_addr;
+   unsigned int offset = 0;
+   skb_frag_t *frag;
+
+   if (skb->encapsulation) {
+   hdr_len = skb_inner_transport_header(skb) - skb->data;
+   hdr_len += inner_tcp_hdrlen(skb);
+   enic_preload_tcp_csum_encap(skb);
+   } else {
+   hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
+   enic_preload_tcp_csum(skb);
+   }
 
/* Queue WQ_ENET_MAX_DESC_LEN length descriptors
 * for the main skb fragment
@@ -666,6 +731,38 @@ static int enic_queue_wq_skb_tso(struct enic *enic, struct vnic_wq *wq,
return 0;
 }
 
+static inline int enic_queue_wq_skb_encap(struct enic *enic, struct vnic_wq 
*wq,
+  

Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit

2017-02-07 Thread Shubham Bansal
Is anybody willing to take a swing at my comments below?

On Wed, Feb 1, 2017 at 6:31 PM, Shubham Bansal wrote:
> Hi Kees & Daniel,
>
> On Tue, Jan 31, 2017 at 09:44:56AM -0800, Kees Cook wrote:
>> >> > 1.) Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
>> >> > registers to 32 bit arm registers, which looks wrong to me. Does anybody
>> >> > have an idea about how to map eBPF->arm 32 bit registers?
>> >>
>> >> I was going to say "look at the x86 32-bit implementation." ... But
>> >> there isn't one. :( I'm going to guess that there isn't a very good
>> >> answer here. I assume you'll have to build some kind of stack scratch
>> >> space to load/save.
>> >
>> >
>> > Now I see why nobody has implemented an eBPF JIT for 32 bit systems. I
>> > think it's very difficult to implement without complications and
>> > errors.
>>
>> Yeah, that does seem to make it much more difficult.
> I was thinking of first implementing only instructions with 32 bit
> register operands. That would hugely decrease the surface area of eBPF
> instructions that I have to cover for the first patch.
>
> So, what I am thinking is something like this:
>
> - bpf_mov r0(64),r1(64) will be JITed like this :
> - ar1(32) <- r1(64). Convert/Mask 64 bit ebpf register(r1) value into 32
> bit and store it in arm register(ar1).
> - Do MOV ar0(32),ar1(32) as an ARM instruction.
> - ar0(32) -> r0(64). Zero Extend the ar0 32 bit register value
> and store it in 64 bit ebpf register r0.
>
> - Similarly, For all BPF_ALU class instructions.
> - For BPF_ADD, I will mask the addition result to 32 bits only.
>  I am not sure; overflow might be a problem.
> - For BPF_SUB, I will mask the subtraction result to 32 bits only.
>  I am not sure; underflow might be a problem.
> - For BPF_MUL, similar to BPF_ADD. Overflow problem?
> - For BPF_DIV, 32 bit masking should be fine, I guess.
> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
>  masking should be fine.
> - For BPF_NEG and BPF_ARSH, there might be a problem because of the sign bit.
> - For BPF_END, 32 bit masking should work fine.
>  Let me know if any of the above points is wrong, or if you have suggestions.
>
> - For ALU instructions, though, there is a big problem with register
>   flag manipulation. Generally, the architecture's ABI takes care of this
>   part, but as we are (kind of) emulating 64 bit instructions on a 32
>   bit machine, it needs to be done manually. Does that sound correct?
>
> - I am not JITing BPF_ALU64 class instructions as of now, as we have to
>   take care of atomic instructions and race conditions with these
>   instructions, which looks complicated to me at the moment. I will try to
>   figure out this part and implement it later. For now, I will just let them
>   be interpreted by the eBPF interpreter.
>
> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>   the address pointers on 32 bit arch like arm will be of 32 bit only.
>   So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>   should do the trick and no address will be corrupted in this way. Am I
>   correct to assume this ?
>   Also, I need to check for address getting out of the allowed memory
>   range.
>
> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>   assuming the same thing as above - All addresses and pointers are 32
>   bit - which can be taken care of just by masking the eBPF register
>   values. Does that sound correct ?
>   Also, I need to check for the address overflow, address getting out
>   of the allowed memory range and things like that.
>
>> > Do you have any code references for me to take a look? Otherwise, I think
>> > it's not possible for me to implement it without using any reference.
>>
>> I don't know anything else, no.
>
> I think, I will give it a try. Otherwise, my last 1 month which I used
> to read about eBPF, eBPF linux code and arm32 ABI would be a complete
> waste.
>
>> >>
>> >>
>> >> > 2.) Also, is my current mapping good enough to make the JIT fast enough
>> >> > ?
>> >> > because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
>> >> > its instructions with native instructions.
>> >>
>> >> I don't know -- it might be tricky with needing to deal with 64-bit
>> >> registers. But if you can make it faster than the non-JIT, it should
>> >> be a win. :) Yay assembly.
>
> Well, As I mentioned above about my thinking towards the implementation,
> I am not sure it would be faster than non-JIT or even correct for that matter.
> It might be but I don't think I have enough knowledge to benchmark the
> implementation as of now.
>
>
> -Shubham Bansal

-Shubham
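
The 32-bit-only plan above can be modelled in plain C before writing any ARM emitter. What follows is a minimal sketch, not JIT code. It assumes a 64-bit eBPF register is kept as two 32-bit halves (for example in a stack scratch area), and the names bpf_reg64, alu32_mov, alu32_add, alu32_sub, alu32_lsh and reg_value are hypothetical. It relies on the eBPF rule that BPF_ALU (32-bit) operations use only the low 32 bits of their operands and zero-extend the result, so wrap-around on ADD/SUB is simply arithmetic modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one 64-bit eBPF register held as two 32-bit halves. */
struct bpf_reg64 {
	uint32_t lo;
	uint32_t hi;
};

/* Reassemble the 64-bit value the eBPF program would observe. */
static uint64_t reg_value(struct bpf_reg64 r)
{
	return ((uint64_t)r.hi << 32) | r.lo;
}

/* 32-bit MOV: copy the low word and zero-extend. */
static struct bpf_reg64 alu32_mov(struct bpf_reg64 src)
{
	struct bpf_reg64 dst = { .lo = src.lo, .hi = 0 };

	return dst;
}

/* 32-bit ADD: unsigned addition wraps modulo 2^32, upper half is cleared. */
static struct bpf_reg64 alu32_add(struct bpf_reg64 dst, struct bpf_reg64 src)
{
	dst.lo = dst.lo + src.lo;
	dst.hi = 0;
	return dst;
}

/* 32-bit SUB: unsigned subtraction wraps modulo 2^32, upper half is cleared. */
static struct bpf_reg64 alu32_sub(struct bpf_reg64 dst, struct bpf_reg64 src)
{
	dst.lo = dst.lo - src.lo;
	dst.hi = 0;
	return dst;
}

/* 32-bit LSH: this sketch only accepts shift counts below 32. */
static struct bpf_reg64 alu32_lsh(struct bpf_reg64 dst, uint32_t shift)
{
	assert(shift < 32);
	dst.lo <<= shift;
	dst.hi = 0;
	return dst;
}
```

With this model, adding 1 to a register whose low word is 0xffffffff gives 0 with the upper half cleared, so no separate overflow handling is needed for the 32-bit class.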


[PATCH net-next v4 1/2] qed: Add infrastructure for PTP support.

2017-02-07 Thread Sudarsana Kalluru
From: Sudarsana Reddy Kalluru 

The patch adds the required qed interfaces for configuring/reading
the PTP clock on the adapter.

Signed-off-by: Sudarsana Reddy Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qed/Makefile   |   2 +-
 drivers/net/ethernet/qlogic/qed/qed.h  |   2 +
 drivers/net/ethernet/qlogic/qed/qed_l2.c   |   5 +
 drivers/net/ethernet/qlogic/qed/qed_l2.h   |   1 +
 drivers/net/ethernet/qlogic/qed/qed_main.c |  15 ++
 drivers/net/ethernet/qlogic/qed/qed_ptp.c  | 316 +
 drivers/net/ethernet/qlogic/qed/qed_ptp.h  |  47 
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h |  31 +++
 include/linux/qed/qed_eth_if.h |  22 ++
 9 files changed, 440 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_ptp.c
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_ptp.h

diff --git a/drivers/net/ethernet/qlogic/qed/Makefile 
b/drivers/net/ethernet/qlogic/qed/Makefile
index 729e437..1a7300f 100644
--- a/drivers/net/ethernet/qlogic/qed/Makefile
+++ b/drivers/net/ethernet/qlogic/qed/Makefile
@@ -2,7 +2,7 @@ obj-$(CONFIG_QED) := qed.o
 
 qed-y := qed_cxt.o qed_dev.o qed_hw.o qed_init_fw_funcs.o qed_init_ops.o \
 qed_int.o qed_main.o qed_mcp.o qed_sp_commands.o qed_spq.o qed_l2.o \
-qed_selftest.o qed_dcbx.o qed_debug.o
+qed_selftest.o qed_dcbx.o qed_debug.o qed_ptp.o
 qed-$(CONFIG_QED_SRIOV) += qed_sriov.o qed_vf.o
 qed-$(CONFIG_QED_LL2) += qed_ll2.o
 qed-$(CONFIG_QED_RDMA) += qed_roce.o
diff --git a/drivers/net/ethernet/qlogic/qed/qed.h 
b/drivers/net/ethernet/qlogic/qed/qed.h
index 1f61cf3..6557f94 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -456,6 +456,8 @@ struct qed_hwfn {
u8 dcbx_no_edpm;
u8 db_bar_no_edpm;
 
+   /* p_ptp_ptt is valid for leading HWFN only */
+   struct qed_ptt *p_ptp_ptt;
struct qed_simd_fp_handler  simd_proto_handler[64];
 
 #ifdef CONFIG_QED_SRIOV
diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c 
b/drivers/net/ethernet/qlogic/qed/qed_l2.c
index 7520eb3..df932be 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_l2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c
@@ -214,6 +214,7 @@ int qed_sp_eth_vport_start(struct qed_hwfn *p_hwfn,
p_ramrod->vport_id  = abs_vport_id;
 
p_ramrod->mtu   = cpu_to_le16(p_params->mtu);
+   p_ramrod->handle_ptp_pkts   = p_params->handle_ptp_pkts;
p_ramrod->inner_vlan_removal_en = p_params->remove_inner_vlan;
p_ramrod->drop_ttl0_en  = p_params->drop_ttl0;
p_ramrod->untagged  = p_params->only_untagged;
@@ -1886,6 +1887,7 @@ static int qed_start_vport(struct qed_dev *cdev,
start.drop_ttl0 = params->drop_ttl0;
start.opaque_fid = p_hwfn->hw_info.opaque_fid;
start.concrete_fid = p_hwfn->hw_info.concrete_fid;
+   start.handle_ptp_pkts = params->handle_ptp_pkts;
start.vport_id = params->vport_id;
start.max_buffers_per_cqe = 16;
start.mtu = params->mtu;
@@ -2328,6 +2330,8 @@ static int qed_fp_cqe_completion(struct qed_dev *dev,
 extern const struct qed_eth_dcbnl_ops qed_dcbnl_ops_pass;
 #endif
 
+extern const struct qed_eth_ptp_ops qed_ptp_ops_pass;
+
 static const struct qed_eth_ops qed_eth_ops_pass = {
.common = &qed_common_ops_pass,
 #ifdef CONFIG_QED_SRIOV
@@ -2336,6 +2340,7 @@ static int qed_fp_cqe_completion(struct qed_dev *dev,
 #ifdef CONFIG_DCB
.dcb = &qed_dcbnl_ops_pass,
 #endif
+   .ptp = &qed_ptp_ops_pass,
.fill_dev_info = &qed_fill_eth_dev_info,
.register_ops = &qed_register_eth_ops,
.check_mac = &qed_check_mac,
diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.h 
b/drivers/net/ethernet/qlogic/qed/qed_l2.h
index 93cb932..e763abd 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_l2.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_l2.h
@@ -156,6 +156,7 @@ struct qed_sp_vport_start_params {
enum qed_tpa_mode tpa_mode;
bool remove_inner_vlan;
bool tx_switching;
+   bool handle_ptp_pkts;
bool only_untagged;
bool drop_ttl0;
u8 max_buffers_per_cqe;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c 
b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 93eee83..592e104 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -902,6 +902,7 @@ static int qed_slowpath_start(struct qed_dev *cdev,
struct qed_mcp_drv_version drv_version;
const u8 *data = NULL;
struct qed_hwfn *hwfn;
+   struct qed_ptt *p_ptt;
int rc = -EINVAL;
 
if (qed_iov_wq_start(cdev))
@@ -916,6 +917,14 @@ static int qed_slowpath_start(struct qed_dev *cdev,
  QED_FW_FILE_NAME);

[PATCH net-next v4 2/2] qede: Add driver support for PTP.

2017-02-07 Thread Sudarsana Kalluru
From: Sudarsana Reddy Kalluru 

This patch adds the driver support for,
  - Registering the ptp clock functionality with the OS.
  - Timestamping the Rx/Tx PTP packets.
  - Ethtool callbacks related to PTP.

Signed-off-by: Sudarsana Reddy Kalluru 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/Kconfig |   1 +
 drivers/net/ethernet/qlogic/qede/Makefile   |   2 +-
 drivers/net/ethernet/qlogic/qede/qede.h |   4 +
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c |  10 +
 drivers/net/ethernet/qlogic/qede/qede_fp.c  |   5 +
 drivers/net/ethernet/qlogic/qede/qede_main.c|  39 ++
 drivers/net/ethernet/qlogic/qede/qede_ptp.c | 536 
 drivers/net/ethernet/qlogic/qede/qede_ptp.h |  65 +++
 8 files changed, 661 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede_ptp.c
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede_ptp.h

diff --git a/drivers/net/ethernet/qlogic/Kconfig 
b/drivers/net/ethernet/qlogic/Kconfig
index 3cfd105..aaa1e85 100644
--- a/drivers/net/ethernet/qlogic/Kconfig
+++ b/drivers/net/ethernet/qlogic/Kconfig
@@ -104,6 +104,7 @@ config QED_SRIOV
 config QEDE
tristate "QLogic QED 25/40/100Gb Ethernet NIC"
depends on QED
+   imply PTP_1588_CLOCK
---help---
  This enables the support for ...
 
diff --git a/drivers/net/ethernet/qlogic/qede/Makefile 
b/drivers/net/ethernet/qlogic/qede/Makefile
index 38fbee6..bc5f7c3 100644
--- a/drivers/net/ethernet/qlogic/qede/Makefile
+++ b/drivers/net/ethernet/qlogic/qede/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_QEDE) := qede.o
 
-qede-y := qede_main.o qede_fp.o qede_filter.o qede_ethtool.o
+qede-y := qede_main.o qede_fp.o qede_filter.o qede_ethtool.o qede_ptp.o
 qede-$(CONFIG_DCB) += qede_dcbnl.o
 qede-$(CONFIG_QED_RDMA) += qede_roce.o
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h 
b/drivers/net/ethernet/qlogic/qede/qede.h
index b423406..f2aaef2 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -137,6 +137,8 @@ struct qede_rdma_dev {
struct workqueue_struct *roce_wq;
 };
 
+struct qede_ptp;
+
 struct qede_dev {
struct qed_dev  *cdev;
struct net_device   *ndev;
@@ -148,8 +150,10 @@ struct qede_dev {
u32 flags;
 #define QEDE_FLAG_IS_VFBIT(0)
 #define IS_VF(edev)(!!((edev)->flags & QEDE_FLAG_IS_VF))
+#define QEDE_TX_TIMESTAMPING_ENBIT(1)
 
const struct qed_eth_ops*ops;
+   struct qede_ptp *ptp;
 
struct qed_dev_eth_info dev_info;
 #define QEDE_MAX_RSS_CNT(edev) ((edev)->dev_info.num_queues)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c 
b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index baf2642..c02754d 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include "qede.h"
+#include "qede_ptp.h"
 
 #define QEDE_RQSTAT_OFFSET(stat_name) \
 (offsetof(struct qede_rx_queue, stat_name))
@@ -940,6 +941,14 @@ static int qede_set_channels(struct net_device *dev,
return 0;
 }
 
+static int qede_get_ts_info(struct net_device *dev,
+   struct ethtool_ts_info *info)
+{
+   struct qede_dev *edev = netdev_priv(dev);
+
+   return qede_ptp_get_ts_info(edev, info);
+}
+
 static int qede_set_phys_id(struct net_device *dev,
enum ethtool_phys_id_state state)
 {
@@ -1586,6 +1595,7 @@ static int qede_get_tunable(struct net_device *dev,
.get_rxfh_key_size = qede_get_rxfh_key_size,
.get_rxfh = qede_get_rxfh,
.set_rxfh = qede_set_rxfh,
+   .get_ts_info = qede_get_ts_info,
.get_channels = qede_get_channels,
.set_channels = qede_set_channels,
.self_test = qede_self_test,
diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c 
b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index 26848ee..1e65038 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include "qede_ptp.h"
 
 #include 
 #include "qede.h"
@@ -1277,6 +1278,7 @@ static int qede_rx_process_cqe(struct qede_dev *edev,
qede_get_rxhash(skb, fp_cqe->bitfields, fp_cqe->rss_hash);
qede_set_skb_csum(skb, csum_flag);
skb_record_rx_queue(skb, rxq->rxq_id);
+   qede_ptp_record_rx_ts(edev, cqe, skb);
 
/* SKB is prepared - pass it to stack */
qede_skb_receive(edev, fp, rxq, skb, le16_to_cpu(fp_cqe->vlan_tag));
@@ -1451,6 +1453,9 @@ netdev_tx_t qede_start_xmit(struct sk_buff *skb, struct 
net_device *ndev)
first_bd->data.bd_flags.bitfields =
1 << ETH_TX_1ST_BD_FLAGS_START_BD_SHIFT;
 
+   if 

[PATCH] net: fix description of skb_find_text() according to removed functionality

2017-02-07 Thread Igor Pylypiv
I am not planning to add a new user of this function.
Use of skb_find_text() was a part of my Linux study and its
description informed me that I can use textsearch_next()
which I cannot. Just want to fix this.

On Tue, Feb 7, 2017 at 7:02 PM, David Miller  wrote:
>
> How about you make edits to this interface when you add an in-tree
> user as we mentioned in our responses to your previous patch?
>
> Thank you.


[PATCH net] net: dsa: Do not destroy invalid network devices

2017-02-07 Thread Florian Fainelli
dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
for the network device not being NULL before attempting to destroy it. We were
not setting the slave network device as NULL if dsa_slave_create() failed, so
we would later on be calling dsa_slave_destroy() on a now freed and
uninitialized network device, causing crashes in dsa_slave_destroy().

Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation")
Signed-off-by: Florian Fainelli 
---
 net/dsa/dsa2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 6f5f0a2ad256..737be6470c7f 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -271,6 +271,7 @@ static int dsa_user_port_apply(struct dsa_port *port, u32 
index,
if (err) {
dev_warn(ds->dev, "Failed to create slave %d: %d\n",
 index, err);
+   ds->ports[index].netdev = NULL;
return err;
}
 
-- 
2.9.3
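
The failure mode described in the commit message can be reproduced in a small user-space sketch. It is only an illustration under stated assumptions: struct port, slave_create(), port_apply() and port_unapply() are hypothetical stand-ins, not the DSA functions themselves. The create helper frees the device on failure but leaves the stale pointer behind, and the apply path clears it so that the NULL check in the teardown path stays meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a switch port that owns a network device. */
struct port {
	int *netdev;
};

/* Allocates the device; on failure it frees it but leaves a stale pointer. */
static int slave_create(struct port *p, int simulate_failure)
{
	p->netdev = malloc(sizeof(*p->netdev));
	if (!p->netdev)
		return -1;
	if (simulate_failure) {
		free(p->netdev);	/* p->netdev now dangles */
		return -1;
	}
	*p->netdev = 0;
	return 0;
}

/* Apply path: clear the pointer when creation fails. */
static int port_apply(struct port *p, int simulate_failure)
{
	int err;

	assert(p != NULL);
	err = slave_create(p, simulate_failure);
	if (err) {
		p->netdev = NULL;	/* never keep a pointer to freed memory */
		return err;
	}
	return 0;
}

/* Teardown path: returns 1 if a device was destroyed, 0 if none existed. */
static int port_unapply(struct port *p)
{
	assert(p != NULL);
	if (!p->netdev)
		return 0;
	free(p->netdev);
	p->netdev = NULL;
	return 1;
}
```

Without the assignment in port_apply(), port_unapply() would see a non-NULL pointer after a failed create and free it a second time.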



RE: [RFC v3 00/11] HFI Virtual Network Interface Controller (VNIC)

2017-02-07 Thread Weiny, Ira
> 
> On Tue, 2017-02-07 at 16:54 -0800, Vishwanathapura, Niranjana wrote:
> > On Tue, Feb 07, 2017 at 09:58:50PM +, Bart Van Assche wrote:
> > > On Tue, 2017-02-07 at 21:44 +, Hefty, Sean wrote:
> > > > This is Ethernet - not IP - encapsulation over a non-InfiniBand
> device/protocol.
> > >
> > > That's more than clear from the cover letter. In my opinion the
> > > cover letter should explain why it is considered useful to have such
> > > a driver upstream and what the use cases are of encapsulating
> > > Ethernet frames inside RDMA packets.
> >
> > We believe on our HW, HFI VNIC design gives better hardware resource
> > usage which is also scalable and hence room for better performance.
> > Also as evident in the cover letter, it gives us better manageability
> > by defining virtual Ethernet switches overlaid on the fabric and use
> > standard Ethernet support provided by Linux.
> 
> That kind of language is appropriate for a marketing brochure but not for a
> technical forum.

Well.  That is not totally true.  Perhaps more detail is needed on how we get better 
performance, but we thought this had been covered already.

> Even reading your statement twice did not make me any wiser.
> You mentioned "better hardware resource usage". Compared to what? Is that
> perhaps compared to IPoIB?  Since Ethernet frames have an extra header and
> are larger than IPoIB frames, how can larger frames result in better hardware
> resource usage? 

Yes, as compared to IPoIB.  The problem with IPoIB is it introduces a 
significant amount of Verbs overhead which is not needed for Ethernet 
encapsulation.  Especially on hardware such as ours.  As Jason has mentioned 
having a more generic "skb_send" or "skb_qp" has been discussed in the past.

As we discussed at the plumbers conference not all send/receive paths are 
"Queue Pairs".  Yes we have a send queue (multiple send queues actually) and a 
recv queue (again multiple queues) but there is no pairing of the queues at 
all.  There are no completion semantics required either.  This reduced overhead 
results in better performance on our hardware.

> And what is a virtual Ethernet switch? Is this perhaps packet
> forwarding by software? If so, why are virtual Ethernet switches needed since
> the Linux networking stack already supports packet forwarding?

Virtual Ethernet switches provide packet switching through the native OPA 
switches via OPA Virtual Fabrics (a tuple of the path information including 
lid/pkey/sl/mtu).  This is not packet forwarding within the node.  A large 
advantage here is that the virtual switches are centrally managed by the EM in 
a very scalable way.  For example, the IPoIB configuration semantics such as 
multicast group join/create, Path Record queries, etc. are all eliminated, 
further reducing overhead.

Ira



Re: [PATCH v3 1/2] mac80211: fils_aead: Use crypto api CMAC shash rather than bare cipher

2017-02-07 Thread Johannes Berg
This looks strange to me:

> +static int aes_s2v(struct crypto_shash *tfm,
>      size_t num_elem, const u8 *addr[], size_t len[],
> u8 *v)
>  {
> - u8 d[AES_BLOCK_SIZE], tmp[AES_BLOCK_SIZE];
> + u8 d[AES_BLOCK_SIZE], tmp[AES_BLOCK_SIZE] = {};
> + SHASH_DESC_ON_STACK(desc, tfm);

desc declared

> 
> + crypto_shash_digest(desc, tmp, AES_BLOCK_SIZE, d);

used here
 
> + crypto_shash_init(desc);

but initialized now?

johannes


[PATCH net-next v4 0/2] qed*: Add support for PTP

2017-02-07 Thread Sudarsana Kalluru
From: Sudarsana Reddy Kalluru 

Hi David,
The patch series adds required changes for qed/qede drivers for
supporting the IEEE Precision Time Protocol (PTP).

Please consider applying this series to "net-next".

Thanks,
Sudarsana

Changes from previous versions:
---
v4: Remove the loop iteration for value '0' in the qed_ptp_hw_adjfreq()
implementation.

v3: Use div_s64 for 64-bit divisions as do_div gives error for signed
types.
Incorporated review comments from Richard Cochran.
  - Clear timestamp registers as soon as timestamp is read.
  - Use shift operation in the place of 'divide by 16'.

v2: Use do_div for 64-bit divisions.

Sudarsana Reddy Kalluru (2):
  qed: Add infrastructure for PTP support.
  qede: Add driver support for PTP.

 drivers/net/ethernet/qlogic/Kconfig |   1 +
 drivers/net/ethernet/qlogic/qed/Makefile|   2 +-
 drivers/net/ethernet/qlogic/qed/qed.h   |   2 +
 drivers/net/ethernet/qlogic/qed/qed_l2.c|   5 +
 drivers/net/ethernet/qlogic/qed/qed_l2.h|   1 +
 drivers/net/ethernet/qlogic/qed/qed_main.c  |  15 +
 drivers/net/ethernet/qlogic/qed/qed_ptp.c   | 316 ++
 drivers/net/ethernet/qlogic/qed/qed_ptp.h   |  47 +++
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h  |  31 ++
 drivers/net/ethernet/qlogic/qede/Makefile   |   2 +-
 drivers/net/ethernet/qlogic/qede/qede.h |   4 +
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c |  10 +
 drivers/net/ethernet/qlogic/qede/qede_fp.c  |   5 +
 drivers/net/ethernet/qlogic/qede/qede_main.c|  39 ++
 drivers/net/ethernet/qlogic/qede/qede_ptp.c | 536 
 drivers/net/ethernet/qlogic/qede/qede_ptp.h |  65 +++
 include/linux/qed/qed_eth_if.h  |  22 +
 17 files changed, 1101 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_ptp.c
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_ptp.h
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede_ptp.c
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede_ptp.h

-- 
1.8.3.1



Re: [RFC v3 02/11] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2017-02-07 Thread Leon Romanovsky
On Tue, Feb 07, 2017 at 02:19:01PM -0700, Jason Gunthorpe wrote:
> On Tue, Feb 07, 2017 at 12:23:01PM -0800, Vishwanathapura, Niranjana wrote:
> > Add rdma netdev interface to ib device structure allowing rdma netdev
> > devices to be allocated by ib clients.
> > Define HFI VNIC interface between hardware independent VNIC
> > functionality and the hardware dependent VNIC functionality.
>
> This commit message could be a bit clearer.
>
> The alloc_rdma_netdev multiplexer is intended as a new general
> interface and this adds a protocol definition for ethernet VNIC on
> OPA.
>
> The hope is that ipoib can follow the same example and use the same
> alloc_rdma_netdev entry point. Hopefully Mellanox will look at this
> patch as I have talked to them in the past about doing this...

Jason,

We looked at it and it is useless for us, mainly because most of the
work is done in the net part of our driver.

So, as it looks for now, this ULP exercise will be for HFI only.

Thanks




Re: [PATCH net 0/2] net: phy: Unbind/bind fixes

2017-02-07 Thread Florian Fainelli


On 02/05/2017 02:25 PM, Florian Fainelli wrote:
> Hi all,
> 
> This patch series addresses the inability to safely unbind and bind
> PHY drivers by making the appropriate checks throughout PHYLIB where we
> may be directly responding to user-space queries, as well as from within
> the kernel state machine.
> 
> The second patch makes unbind -> bind work by taking care of the
> PHY state machine state.

I need to make another set of fixes after "net: phy: Fix lack of
reference count on PHY driver" because the Generic PHY driver is of
course behaving differently and my testing was focused on every other
one but this one which is built-in...

> 
> Florian Fainelli (2):
>   net: phy: Check phydev->drv
>   net: phy: Fix PHY driver bind and unbind events
> 
>  drivers/net/phy/phy.c| 26 ++
>  drivers/net/phy/phy_device.c | 27 +--
>  include/linux/phy.h  |  3 +++
>  3 files changed, 50 insertions(+), 6 deletions(-)
> 

-- 
Florian


Re: Extending socket timestamping API for NTP

2017-02-07 Thread Denny Page

> On Feb 07, 2017, at 06:01, Miroslav Lichvar  wrote:
> 
> 1) new rx_filter for NTP
> 
>   Some NICs can't timestamp all received packets and are currently
>   unusable for NTP with HW timestamping. The new filter would allow
>   NTP support in new NICs and adding support to existing NICs with
>   firmware/driver updates. The filter would apply to IPv4 and IPv6
>   UDP packets received from or sent to the port number 123.
> 

I think this is a good idea. Even if the hardware doesn’t support it, the 
filtering could be done in the kernel. That would save a huge number of context switches.



> 4) allow sockets to use both SW and HW TX timestamping at the same time
> 
>   When using a socket which is not bound to a specific interface, it
>   would be nice to get transmit SW timestamps when HW timestamps are
>   missing. I suspect it's difficult to predict if a HW timestamp will
>   be available. Maybe it would be acceptable to get from the error
>   queue two messages per transmission if the interface supports both
>   SW and HW timestamping?
> 

Highly agreed. The current interface pretty much forces a socket per physical 
interface, which should not be necessary.


> 5) new SO_TIMESTAMPING options to get transposed RX timestamps
> 
>   PTP uses preamble RX timestamps, but NTP works with trailer RX
>   timestamps. This means NTP implementations currently need to
>   transpose HW RX timestamps. The calculation requires the link speed
>   and the length of the packet at layer 2. It seems this can be
>   reliably done only using raw sockets. It would be very nice if the
>   kernel could transpose the timestamps automatically.
> 
>   The existing SOF_TIMESTAMPING_RX_HARDWARE flag could be aliased to
>   SOF_TIMESTAMPING_RX_HARDWARE_PREAMBLE and the new flag could be
>   SOF_TIMESTAMPING_RX_HARDWARE_TRAILER.
> 
>   PTP has a similar problem with SW RX timestamps, which are closer
>   to the trailer timestamps rather than preamble timestamps. A new
>   SOF_TIMESTAMPING_RX_SOFTWARE_PREAMBLE flag could be added for PTP
>   implementations to get transposed timestamps in order to improve
>   accuracy.
> 

Also highly agreed.

Denny
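
The transposition discussed in item 5 comes down to adding the time the frame spends on the wire. Here is a rough sketch of that arithmetic, assuming the hardware timestamp marks the start of the frame and the layer 2 length passed in already covers every byte between the two reference points. Whether preamble and FCS bytes count is device specific and not settled by this thread. The function names frame_time_ns and preamble_to_trailer_ns are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time needed to serialize l2_len_bytes at link_mbps.
 * bits / (Mbit/s) gives microseconds, so scale by 1000 for nanoseconds.
 */
static uint64_t frame_time_ns(uint32_t l2_len_bytes, uint32_t link_mbps)
{
	assert(link_mbps > 0);
	return (uint64_t)l2_len_bytes * 8u * 1000u / link_mbps;
}

/* Move a start-of-frame (preamble) timestamp to the end of the frame. */
static uint64_t preamble_to_trailer_ns(uint64_t preamble_ts_ns,
				       uint32_t l2_len_bytes,
				       uint32_t link_mbps)
{
	return preamble_ts_ns + frame_time_ns(l2_len_bytes, link_mbps);
}
```

A 125-byte frame on a 1000 Mb/s link occupies the wire for 1000 ns, so a correction of this size is far from negligible for NTP over fast links. The integer division truncates toward zero, which loses less than 1 ns.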



Re: Extending socket timestamping API for NTP

2017-02-07 Thread Denny Page
On Feb 07, 2017, at 21:27, Richard Cochran  wrote:
> 
> On Tue, Feb 07, 2017 at 05:52:52PM -0800, Denny Page wrote:
>> Most, but not all. The TI DP83630 doesn’t support timestamping for all 
>> packets, but it does support either PTP or NTP:
> 
> That is the one and only device that explicitly supports NTP. This is
> a nice idea, of course, but it just did not take off among other
> products.

I have to say I haven’t gone looking for others, and will take your word 
regarding the DP83630 being the one and only. I only learned about the DP83630 
because I have a couple of stratum 1 devices that use this phy and have been 
working with the vendor regarding integration of hardware timestamps for NTP.

Denny



Re: [PATCH net-next 7/7] openvswitch: Pack struct sw_flow_key.

2017-02-07 Thread Jarno Rajahalme

> On Feb 6, 2017, at 11:15 PM, Joe Stringer  wrote:
> 
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>> struct sw_flow_key has two 16-bit holes. Move the most matched
>> conntrack match fields there.  In some typical cases this reduces the
>> size of the key that needs to be hashed into half and into one cache
>> line.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> Looks like this misses the zeroing in ovs_nla_get_flow_metadata();
> might want to double-check for any other memset/copies of the key->ct
> field.

Good catch. Looked, there are no other places to change.

Will rebase to current net-next and repost.

 Jarno
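
The effect of filling 16-bit holes can be shown with a toy structure. This is only a sketch: the field names below are hypothetical and do not reproduce the real sw_flow_key layout, and the sizes assume an ABI where uint32_t is 4-byte aligned and uint16_t is 2-byte aligned, as on mainstream Linux targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two 16-bit fields each followed by a 16-bit padding hole. */
struct key_with_holes {
	uint32_t mark;
	uint16_t zone;	/* 2 bytes of padding follow */
	uint32_t hash;
	uint16_t state;	/* 2 bytes of tail padding follow */
};

/* Same fields, with the 16-bit members placed next to each other. */
struct key_repacked {
	uint32_t mark;
	uint32_t hash;
	uint16_t zone;
	uint16_t state;
};

/* Bytes of a structure that carry no field data. */
static size_t padding_bytes(size_t total, size_t payload)
{
	assert(total >= payload);
	return total - payload;
}
```

Both structures carry 12 bytes of fields. Under the stated ABI assumption the first occupies 16 bytes and the second 12, which is the kind of saving that shrinks the region a flow lookup has to hash.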



Re: [PATCH net-next 6/7] openvswitch: Add force commit.

2017-02-07 Thread Jarno Rajahalme

> On Feb 7, 2017, at 2:15 PM, Joe Stringer  wrote:
> 
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>> Stateful network admission policy may allow connections to one
>> direction and reject connections initiated in the other direction.
>> After policy change it is possible that for a new connection an
>> overlapping conntrack entry already exists, where the connection
>> original direction is opposed to the new connection's initial packet.
>> 
>> Most importantly, conntrack state relating to the current packet gets
>> the "reply" designation based on whether the original direction tuple
>> or the reply direction tuple matched.  If this "directionality" is
>> wrong w.r.t. the stateful network admission policy it may happen
>> that packets in neither direction are correctly admitted.
>> 
>> This patch adds a new "force commit" option to the OVS conntrack
>> action that checks the original direction of an existing conntrack
>> entry.  If that direction is opposed to the current packet, the
>> existing conntrack entry is deleted and a new one is subsequently
>> created in the correct direction.
>> 
>> Signed-off-by: Jarno Rajahalme 
>> ---
>> include/uapi/linux/openvswitch.h | 10 ++
>> net/openvswitch/conntrack.c  | 27 +--
>> 2 files changed, 35 insertions(+), 2 deletions(-)
>> 
>> diff --git a/include/uapi/linux/openvswitch.h 
>> b/include/uapi/linux/openvswitch.h
>> index 90af8b8..d5ba9a9 100644
>> --- a/include/uapi/linux/openvswitch.h
>> +++ b/include/uapi/linux/openvswitch.h
>> @@ -674,6 +674,10 @@ struct ovs_action_hash {
>>  * @OVS_CT_ATTR_HELPER: variable length string defining conntrack ALG.
>>  * @OVS_CT_ATTR_NAT: Nested OVS_NAT_ATTR_* for performing L3 network address
>>  * translation (NAT) on the packet.
>> + * @OVS_CT_ATTR_FORCE_COMMIT: Like %OVS_CT_ATTR_COMMIT, but instead of doing
>> + * nothing if the connection is already committed will check that the 
>> current
>> + * packet is in conntrack entry's original direction.  If directionality 
>> does
>> + * not match, will delete the existing conntrack entry and commit a new one.
>>  */
>> enum ovs_ct_attr {
>>OVS_CT_ATTR_UNSPEC,
>> @@ -684,6 +688,12 @@ enum ovs_ct_attr {
>>OVS_CT_ATTR_HELPER, /* netlink helper to assist detection of
>>   related connections. */
>>OVS_CT_ATTR_NAT,/* Nested OVS_NAT_ATTR_* */
>> +   OVS_CT_ATTR_FORCE_COMMIT,  /* No argument, commits connection.  If 
>> the
>> +   * conntrack entry original direction 
>> tuple
>> +   * does not match the current packet 
>> header
>> +   * values, will delete the current 
>> conntrack
>> +   * entry and create a new one.
>> +   */
> 
> We only need one copy of the explanation, keep it above the enum, then
> the inline comment can be /* No argument */.
> 

OK.

>>__OVS_CT_ATTR_MAX
>> };
>> 
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index 1afe153..1f27f44 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -65,6 +65,7 @@ struct ovs_conntrack_info {
>>struct nf_conn *ct;
>>u8 commit : 1;
>>u8 nat : 3; /* enum ovs_ct_nat */
>> +   u8 force : 1;
>>u16 family;
>>struct md_mark mark;
>>struct md_labels labels;
>> @@ -631,10 +632,13 @@ static bool skb_nfct_cached(struct net *net,
>> */
>>if (!ct && key->ct.state & OVS_CS_F_TRACKED &&
>>!(key->ct.state & OVS_CS_F_INVALID) &&
>> -   key->ct.zone == info->zone.id)
>> +   key->ct.zone == info->zone.id) {
>>ct = ovs_ct_find_existing(net, &info->zone, info->family, skb,
>>  !!(key->ct.state
>> & OVS_CS_F_NAT_MASK));
>> +   if (ct)
>> +   nf_ct_get(skb, &ctinfo);
>> +   }
> 
> If ctinfo is only used with the new call below, we can unconditionally
> fetch this just before it's used...
> 
>>if (!ct)
>>return false;
>>if (!net_eq(net, read_pnet(&ct->ct_net)))
>> @@ -648,6 +652,19 @@ static bool skb_nfct_cached(struct net *net,
>>if (help && rcu_access_pointer(help->helper) != info->helper)
>>return false;
>>}
>> +   /* Force conntrack entry direction to the current packet? */
> 
> Here.
> 

But then we would be executing nf_ct_get() twice in the common case?

>> +   if (info->force && CTINFO2DIR(ctinfo) != IP_CT_DIR_ORIGINAL) {
>> +   /* Delete the conntrack entry if confirmed, else just release
>> +* the reference.
>> +*/
>> +   if (nf_ct_is_confirmed(ct))
>> +   

Re: [PATCH net-next 4/7] openvswitch: Inherit master's labels.

2017-02-07 Thread Jarno Rajahalme

> On Feb 6, 2017, at 1:53 PM, Joe Stringer  wrote:
> 
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>> We avoid calling into nf_conntrack_in() for expected connections, as
>> that would remove the expectation that we want to stick around until
>> we are ready to commit the connection.  Instead, we do a lookup in the
>> expectation table directly.  However, after a successful expectation
>> lookup we have set the flow key label field from the master
>> connection, whereas nf_conntrack_in() does not do this.  This leads to
>> master's labels being inherited after an expectation lookup, but those
>> labels not being inherited after the corresponding conntrack action
>> with a commit flag.
>> 
>> This patch resolves the problem by changing the commit code path to
>> also inherit the master's labels to the expected connection.
>> Resolving this conflict in favor of inheriting the labels allows
>> information to be passed from the master connection to related
>> connections, which would otherwise be much harder.  Labels can still
>> be set explicitly, so this change only affects the default values of
>> the labels in presence of a master connection.
>> 
>> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
>> Signed-off-by: Jarno Rajahalme 
>> ---
>> net/openvswitch/conntrack.c | 48 
>> -
>> 1 file changed, 34 insertions(+), 14 deletions(-)
>> 
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index a163c44..738a4fa 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -265,6 +265,8 @@ static struct nf_conn_labels 
>> *ovs_ct_get_conn_labels(struct nf_conn *ct)
>>return cl;
>> }
>> 
>> +static bool labels_nonzero(const struct ovs_key_ct_labels *labels);
>> +
> 
> These declarations typically live at the top of the file, not
> somewhere in the middle.
> 

Moved.

>> /* Initialize labels for a new, to be committed conntrack entry.  Note that
>>  * since the new connection is not yet confirmed, and thus no-one else has
>>  * access to it's labels, we simply write them over.  Also, we refrain from
>> @@ -275,18 +277,35 @@ static int ovs_ct_init_labels(struct nf_conn *ct, 
>> struct sw_flow_key *key,
>>  const struct ovs_key_ct_labels *labels,
>>  const struct ovs_key_ct_labels *mask)
>> {
>> -   struct nf_conn_labels *cl;
>> -   u32 *dst;
>> -   int i;
>> +   struct nf_conn_labels *cl, *master_cl;
>> +   bool have_mask = labels_nonzero(mask);
>> +
>> +   /* Inherit master's labels to the related connection? */
>> +   master_cl = (ct->master) ? nf_ct_labels_find(ct->master) : NULL;
>> +
>> +   if (!master_cl && !have_mask)
>> +   return 0;   /* Nothing to do. */
>> 
>>cl = ovs_ct_get_conn_labels(ct);
>>if (!cl)
>>return -ENOSPC;
>> 
>> -   dst = (u32 *)cl->bits;
>> -   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
>> -   dst[i] = (dst[i] & ~mask->ct_labels_32[i]) |
>> -   (labels->ct_labels_32[i] & mask->ct_labels_32[i]);
>> +   /* Inherit the master's labels, if any. */
>> +   if (master_cl) {
>> +   size_t len = sizeof(master_cl->bits);
>> +
>> +   memcpy(&cl->bits, &master_cl->bits,
>> +  len > OVS_CT_LABELS_LEN ? OVS_CT_LABELS_LEN : len);
> 
> Looks like this is another spot where we're trying to handle differing
> label lengths, which we could simplify if there was a stronger
> guarantee they're the same.

Indeed, here the ‘cl’s are different instances of the same structure, so we do 
not need to worry about the sizes at all. I’ll change this to a simple structure 
assignment for v2.

> 
>> +   }
>> +   if (have_mask) {
>> +   u32 *dst = (u32 *)cl->bits;
>> +   int i;
>> +
>> +   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
>> +   dst[i] = (dst[i] & ~mask->ct_labels_32[i]) |
>> +   (labels->ct_labels_32[i]
>> +& mask->ct_labels_32[i]);
>> +   }
> 
> By the way, is this open-coding nf_connlabels_replace()? Can
> ovs_ct_set_labels() and this share the code?

nf_connlabels_replace() uses the compare-and-exchange function to change each
32-bit unit individually, and also triggers the nf netlink change event, the
first of which we do not need and the second of which we do not want.
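
For illustration, a minimal userspace sketch of the plain masked write that the patch uses for unconfirmed entries. The names labels_merge and LABELS_LEN_32 are made up for this example and are not kernel symbols; the 128-bit label field is assumed to be viewed as four 32-bit words, as in the quoted loop.

```c
#include <stddef.h>
#include <stdint.h>

#define LABELS_LEN_32 4	/* assumed: 128-bit labels as four 32-bit words */

/*
 * Overwrite only the bits selected by mask and keep all other bits of dst.
 * These are plain stores: no compare-and-exchange and no change event.
 * That is only safe while nobody else can see dst, as is the case for a
 * conntrack entry that has not been confirmed yet.
 */
static void labels_merge(uint32_t *dst, const uint32_t *val,
			 const uint32_t *mask)
{
	size_t i;

	for (i = 0; i < LABELS_LEN_32; i++)
		dst[i] = (dst[i] & ~mask[i]) | (val[i] & mask[i]);
}
```

A word with an all-zero mask is left untouched, which is why the patch can skip the loop entirely when the mask is zero.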

> 
>> 
>>memcpy(&key->ct.labels, cl->bits, OVS_CT_LABELS_LEN);
>> 
>> @@ -916,13 +935,14 @@ static int ovs_ct_commit(struct net *net, struct 
>> sw_flow_key *key,
>>if (err)
>>return err;
>>}
>> -   if (labels_nonzero(&info->labels.mask)) {
>> -   if (!nf_ct_is_confirmed(ct))
>> -   err = ovs_ct_init_labels(ct, key, 
>> &info->labels.value,
>> - 

Re: [PATCH net-next 3/7] openvswitch: Do not trigger events for unconfirmed connection.

2017-02-07 Thread Jarno Rajahalme
Thanks for the review! Comments below,

  Jarno

> On Feb 6, 2017, at 1:46 PM, Joe Stringer  wrote:
> 
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>> Avoid triggering change events for setting conntrack mark or labels
>> before the conntrack entry has been confirmed.  Refactoring on this
>> patch also makes changes in later patches easier to review.
>> 
>> Fixes: 182e3042e15d ("openvswitch: Allow matching on conntrack mark")
>> Fixes: c2ac66735870 ("openvswitch: Allow matching on conntrack label")
>> Signed-off-by: Jarno Rajahalme 
> 
> Functional and cosmetic changes should be in separate patches.
> 

OK, will split.

>> ---
>> net/openvswitch/conntrack.c | 87 
>> -
>> 1 file changed, 63 insertions(+), 24 deletions(-)
>> 
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index 6730f09..a163c44 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -229,23 +229,17 @@ int ovs_ct_put_key(const struct sw_flow_key *key, 
>> struct sk_buff *skb)
>>return 0;
>> }
>> 
>> -static int ovs_ct_set_mark(struct sk_buff *skb, struct sw_flow_key *key,
>> +static int ovs_ct_set_mark(struct nf_conn *ct, struct sw_flow_key *key,
>>   u32 ct_mark, u32 mask)
>> {
>> #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
>> -   enum ip_conntrack_info ctinfo;
>> -   struct nf_conn *ct;
>>u32 new_mark;
>> 
>> -   /* The connection could be invalid, in which case set_mark is no-op. 
>> */
>> -   ct = nf_ct_get(skb, &ctinfo);
>> -   if (!ct)
>> -   return 0;
>> -
>>new_mark = ct_mark | (ct->mark & ~(mask));
>>if (ct->mark != new_mark) {
>>ct->mark = new_mark;
>> -   nf_conntrack_event_cache(IPCT_MARK, ct);
>> +   if (nf_ct_is_confirmed(ct))
>> +   nf_conntrack_event_cache(IPCT_MARK, ct);
>>key->ct.mark = new_mark;
>>}
>> 
>> @@ -255,26 +249,59 @@ static int ovs_ct_set_mark(struct sk_buff *skb, struct 
>> sw_flow_key *key,
>> #endif
>> }
>> 
>> -static int ovs_ct_set_labels(struct sk_buff *skb, struct sw_flow_key *key,
>> -const struct ovs_key_ct_labels *labels,
>> -const struct ovs_key_ct_labels *mask)
>> +static struct nf_conn_labels *ovs_ct_get_conn_labels(struct nf_conn *ct)
>> {
>> -   enum ip_conntrack_info ctinfo;
>>struct nf_conn_labels *cl;
>> -   struct nf_conn *ct;
>> -   int err;
>> -
>> -   /* The connection could be invalid, in which case set_label is 
>> no-op.*/
>> -   ct = nf_ct_get(skb, &ctinfo);
>> -   if (!ct)
>> -   return 0;
>> 
>>cl = nf_ct_labels_find(ct);
>>if (!cl) {
>>nf_ct_labels_ext_add(ct);
>>cl = nf_ct_labels_find(ct);
>>}
>> +
>>if (!cl || sizeof(cl->bits) < OVS_CT_LABELS_LEN)
>> +   return NULL;
>> +
>> +   return cl;
>> +}
>> +
>> +/* Initialize labels for a new, to be committed conntrack entry.  Note that
>> + * since the new connection is not yet confirmed, and thus no-one else has
>> + * access to it's labels, we simply write them over.  Also, we refrain from
>> + * triggering events, as receiving change events before the create event 
>> would
>> + * be confusing.
>> + */
>> +static int ovs_ct_init_labels(struct nf_conn *ct, struct sw_flow_key *key,
>> + const struct ovs_key_ct_labels *labels,
>> + const struct ovs_key_ct_labels *mask)
>> +{
>> +   struct nf_conn_labels *cl;
>> +   u32 *dst;
>> +   int i;
>> +
>> +   cl = ovs_ct_get_conn_labels(ct);
>> +   if (!cl)
>> +   return -ENOSPC;
>> +
>> +   dst = (u32 *)cl->bits;
> 
> Is it worth extending the union to include unsigned long, to avoid
> casting it to u32 here?
> 

This cast is on the struct nf_conn_labels, I would not unionize it at this 
point. This type of cast is typical in conntrack code.

>> +   for (i = 0; i < OVS_CT_LABELS_LEN_32; i++)
>> +   dst[i] = (dst[i] & ~mask->ct_labels_32[i]) |
>> +   (labels->ct_labels_32[i] & mask->ct_labels_32[i]);
>> +
>> +   memcpy(&key->ct.labels, cl->bits, OVS_CT_LABELS_LEN);
>> +
>> +   return 0;
>> +}
>> +
>> +static int ovs_ct_set_labels(struct nf_conn *ct, struct sw_flow_key *key,
>> +const struct ovs_key_ct_labels *labels,
>> +const struct ovs_key_ct_labels *mask)
>> +{
>> +   struct nf_conn_labels *cl;
>> +   int err;
>> +
>> +   cl = ovs_ct_get_conn_labels(ct);
>> +   if (!cl)
>>return -ENOSPC;
>> 
>>err = nf_connlabels_replace(ct, labels->ct_labels_32,
>> @@ -283,7 +310,8 @@ static int ovs_ct_set_labels(struct sk_buff *skb, struct 
>> sw_flow_key *key,
>>if (err)
>>

Re: [PATCH net-next 5/7] openvswitch: Add original direction conntrack tuple to sw_flow_key.

2017-02-07 Thread Jarno Rajahalme

> On Feb 6, 2017, at 11:15 PM, Joe Stringer  wrote:
> 
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>> Add the fields of the conntrack original direction 5-tuple to struct
>> sw_flow_key.  The new fields are initially zeroed, and are populated
>> whenever a conntrack action is executed and either finds or generates
>> a conntrack entry.  This means that these fields exist for all packets
>> that were not rejected by conntrack as untrackable.
>> 
>> The original tuple fields in the sw_flow_key are filled from the
>> original direction tuple of the conntrack entry relating to the
>> current packet, or from the original direction tuple of the master
>> conntrack entry, if the current conntrack entry has a master.
>> Generally, expected connections of connections having an assigned
>> helper (e.g., FTP), have a master conntrack entry.
>> 
>> The main purpose of the new conntrack original tuple fields is to
>> allow matching on them for policy decision purposes, with the premise
>> that the admissibility of tracked connections reply packets (as well
>> as original direction packets), and both direction packets of any
>> related connections may be based on ACL rules applying to the master
>> connection's original direction 5-tuple.  This also makes it easier to
>> make policy decisions when the actual packet headers might have been
>> transformed by NAT, as the original direction 5-tuple represents the
>> packet headers before any such transformation.
>> 
>> When using the original direction 5-tuple the admissibility of return
>> and/or related packets need not be based on the mere existence of a
>> conntrack entry, allowing separation of admission policy from the
>> established conntrack state.  While existence of a conntrack entry is
>> required for admission of the return or related packets, policy
>> changes can render connections that were initially admitted to be
>> rejected or dropped afterwards.  If the admission of the return and
>> related packets was based on mere conntrack state (e.g., connection
>> being in an established state), a policy change that would make the
>> connection rejected or dropped would need to find and delete all
>> conntrack entries affected by such a change.  When using the original
>> direction 5-tuple matching the affected conntrack entries can be
>> allowed to time out instead, as the established state of the
>> connection would not need to be the basis for packet admission any
>> more.
>> 
>> It should be noted that the directionality of related connections may
>> be the same or different than that of the master connection, and
>> neither the original direction 5-tuple nor the conntrack state bits
>> carry this information.  If needed, the directionality of the master
>> connection can be stored in master's conntrack mark or labels, which
>> are automatically inherited by the expected related connections.
>> 
>> The fact that neither ARP nor ND packets are trackable by conntrack
>> allows mutual exclusion between ARP/ND and the new conntrack original
>> tuple fields.  Hence, the IP addresses are overlaid in union with ARP
>> and ND fields.  This allows the sw_flow_key to not grow much due to
>> this patch, but it also means that we must be careful to never use the
>> new key fields with ARP or ND packets.  ARP is easy to distinguish and
>> keep mutually exclusive based on the ethernet type, but ND being an
>> ICMPv6 protocol requires a bit more attention.
>> 
>> Signed-off-by: Jarno Rajahalme 
>> ---
> 
> OK, maybe we need to do something a bit more to handle the NATed
> related connections to address the problem in patch 1.
> 
> 
> 
>> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
>> index 738a4fa..1afe153 100644
>> --- a/net/openvswitch/conntrack.c
>> +++ b/net/openvswitch/conntrack.c
>> @@ -155,6 +155,59 @@ static void __ovs_ct_update_key(struct sw_flow_key 
>> *key, u8 state,
>>key->ct.zone = zone->id;
>>key->ct.mark = ovs_ct_get_mark(ct);
>>ovs_ct_get_labels(ct, &key->ct.labels);
>> +
>> +   /* Use the master if we have one. */
>> +   if (ct && ct->master)
>> +   ct = ct->master;
> 
> Perhaps:
> 
> if (!ct || sw_flow_key_is_nd(key) || !is_ip_any(key->eth.type)) {
>/* zero everything */
>return;
> }
> 
> One of the things this helps us to avoid is having a comment in the
> middle of an if statement.
> 
> Then afterwards,
> if (ct->master)
>ct = ct->master;
> 
>> +
>> +   key->ct.orig_proto = 0;
>> +   key->ct.orig_tp.src = 0;
>> +   key->ct.orig_tp.dst = 0;
>> +   if (key->eth.type == htons(ETH_P_IP)) {
>> +   /* IP version must match. */
>> +   if (ct && nf_ct_l3num(ct) == NFPROTO_IPV4) {
> 
> I don't quite understand how we could end up with a connection NFPROTO
> that is mismatched with an IP version that we should handle here, but
> if there are some legitimate cases perhaps we can pick them up and
> 

Re: [PATCH net-next 1/7] openvswitch: Use inverted tuple in ovs_ct_find_existing() if NATted.

2017-02-07 Thread Jarno Rajahalme

> On Feb 6, 2017, at 9:07 AM, Pravin Shelar  wrote:
> 
> On Thu, Feb 2, 2017 at 5:10 PM, Jarno Rajahalme  wrote:
>> When looking for an existing conntrack entry, the packet 5-tuple
>> must be inverted if NAT has already been applied, as the current
>> packet headers do not match any conntrack tuple.  For
>> example, if a packet from private address X to a public address B is
>> source-NATted to A, the conntrack entry will have the following tuples
>> (ignoring the protocol and port numbers) after the conntrack entry is
>> committed:
>> 
>> Original direction tuple: (X,B)
>> Reply direction tuple: (B,A)
>> 
>> Now, if a reply packet is already transformed back to the private
>> address space (e.g., with a CT(nat) action), the tuple corresponding
>> to the current packet headers is:
>> 
>> Current packet tuple: (B,X)
>> 
>> This does not match either of the conntrack tuples above.  Normally
>> this does not matter, as the conntrack lookup was already done using
>> the tuple (B,A), but if the current packet does not match any flow in
>> the OVS datapath, the packet is sent to userspace via an upcall,
>> during which the packet's skb is freed, and the conntrack entry
>> pointer in the skb is lost.  When the packet is reintroduced to the
>> datapath, any further conntrack action will need to perform a new
>> conntrack lookup to find the entry again.  Prior to this patch this
>> second lookup failed for NATted packets.  The datapath flow setup
>> corresponding to the upcall can succeed, however, allowing all further
>> packets in the reply direction to re-use the conntrack entry pointer
>> in the skb, so typically the lookup failure only causes a packet drop.
>> 
>> The solution is to invert the tuple derived from the current packet
>> headers in case the conntrack state stored in the packet metadata
>> indicates that the packet has been transformed by NAT:
>> 
>> Inverted tuple: (X,B)
>> 
>> With this the conntrack entry can be found, matching the original
>> direction tuple.
>> 
>> This same logic also works for the original direction packets:
>> 
>> Current packet tuple (after NAT): (A,B)
>> Inverted tuple: (B,A)
>> 
>> While the current packet tuple (A,B) does not match either of the
>> conntrack tuples, the inverted one (B,A) does match the reply
>> direction tuple.
>> 
>> Since the inverted tuple matches the reverse direction tuple the
>> direction of the packet must be reversed as well.
>> 
>> Fixes: 05752523e565 ("openvswitch: Interface with NAT.")
>> Signed-off-by: Jarno Rajahalme 
> 
> I could not apply this patch series to the net-next branch, but it does
> apply to net. Which branch are you targeting?

The patches were against net-next, but there likely was a merge from netfilter 
around the time of me sending the email out causing the difficulty. Will 
address all comments, rebase and post a v2 later today.

 Jarno
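
The lookup described in the quoted commit message can be sketched in userspace C as follows. This is a simplified illustration only: the tuple carries just the two addresses from the example (X, A, B as single characters), and struct tuple, tuple_invert() and entry_match() are hypothetical names, not the kernel's nf_conntrack types or helpers.

```c
#include <stdbool.h>

/* Hypothetical, simplified tuple: addresses only, ports and protocol omitted. */
struct tuple {
	char src;
	char dst;
};

enum match { MATCH_NONE, MATCH_ORIGINAL, MATCH_REPLY };

/* Swap source and destination. */
static struct tuple tuple_invert(struct tuple t)
{
	struct tuple inv = { .src = t.dst, .dst = t.src };

	return inv;
}

static bool tuple_equal(struct tuple a, struct tuple b)
{
	return a.src == b.src && a.dst == b.dst;
}

/* Which direction tuple of the entry, if any, does t match? */
static enum match entry_match(struct tuple orig, struct tuple reply,
			      struct tuple t)
{
	if (tuple_equal(t, orig))
		return MATCH_ORIGINAL;
	if (tuple_equal(t, reply))
		return MATCH_REPLY;
	return MATCH_NONE;
}
```

With orig = (X,B) and reply = (B,A), the un-NATted reply packet (B,X) matches nothing, while its inversion (X,B) matches the original direction tuple. Because the match was made on the inverted tuple, the packet's real direction is the opposite of the matched one.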




Re: [RFC v3 00/11] HFI Virtual Network Interface Controller (VNIC)

2017-02-07 Thread Leon Romanovsky
On Tue, Feb 07, 2017 at 09:00:05PM +, Hefty, Sean wrote:
> > I didn't read patches yet, and prefer to ask it in advance. Does this
> > new ULP work with all
> > drivers/infiniband/hw/* devices as it is expected from ULP?
>
> Like the way ipoib or srp work with all hw devices?  What is the real point 
> of this question?

Sorry, but I don't understand your response. Both IPoIB and SRP were
standardized and implemented years before hfi was brought into the RDMA
stack, so at the time of their introduction they clearly supported all the
devices.

Does this VNIC interface have a standard? Where can I find the HFI wire
protocol so that we can implement HFI VNIC support in our HW?

Thanks.




Re: Extending socket timestamping API for NTP

2017-02-07 Thread Richard Cochran
On Tue, Feb 07, 2017 at 05:52:52PM -0800, Denny Page wrote:
> Most, but not all. The TI DP83630 doesn’t support timestamping for all 
> packets, but it does support either PTP or NTP:

That is the one and only device that explicitly supports NTP. This is
a nice idea, of course, but it just did not take off among other
products.

Thanks,
Richard


Re: [RFC v3 00/11] HFI Virtual Network Interface Controller (VNIC)

2017-02-07 Thread Leon Romanovsky
On Tue, Feb 07, 2017 at 01:43:03PM -0800, Vishwanathapura, Niranjana wrote:
> On Tue, Feb 07, 2017 at 01:00:05PM -0800, Hefty, Sean wrote:
> > > I didn't read patches yet, and prefer to ask it in advance. Does this
> > > new ULP work with all
> > > drivers/infiniband/hw/* devices as it is expected from ULP?
> >
> > Like the way ipoib or srp work with all hw devices?  What is the real point 
> > of this question?
>
> Leon,
> It was already discussed in below threads.
>
> https://www.spinics.net/lists/linux-rdma/msg44128.html
> https://www.spinics.net/lists/linux-rdma/msg44131.html
> https://www.spinics.net/lists/linux-rdma/msg44155.html

Yes, but you still didn't answer my question.
From the first link:
--
  If that is your position then this should be a straight up IB ULP that
  works with any IB hardware.

Yes, see my comments in point #3 of my previous email...
--

Can I grab these patches and run them on one of the 14 drivers available in
drivers/infiniband/hw/* ?

>
> Niranjana
>




Re: [PATCH net-next v2 08/12] iscsi: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Nicholas A. Bellinger
Hi Florian,

On Tue, 2017-02-07 at 15:03 -0800, Florian Fainelli wrote:
> From: Russell King 
> 
> drivers/target/iscsi/iscsi_target_login.c:1135:7: error: implicit declaration 
> of function 'try_module_get' [-Werror=implicit-function-declaration]
> 
> Add linux/module.h to iscsi_target_login.c.
> 
> Signed-off-by: Russell King 
> Reviewed-by: Bart Van Assche 
> ---

Acked-by: Nicholas Bellinger 



RE: [RFC v3 09/11] IB/hfi1: HFI_VNIC RDMA netdev support

2017-02-07 Thread Parav Pandit
Hi,

> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Vishwanathapura, Niranjana
> Sent: Tuesday, February 7, 2017 2:23 PM
> To: dledf...@redhat.com
> Cc: linux-r...@vger.kernel.org; netdev@vger.kernel.org;
> dennis.dalessan...@intel.com; ira.we...@intel.com; Niranjana
> Vishwanathapura ; Andrzej
> Kacprowski 
> Subject: [RFC v3 09/11] IB/hfi1: HFI_VNIC RDMA netdev support
> 
> Add support to create and free HFI_VNIC rdma netdev devices.
> Implement netstack interface functionality including xmit_skb, receive side
> NAPI etc. Also implement rdma netdev control functions.
> 

All code in this particular patch belongs to the netdev VNIC ULP driver.
Nothing in it appears specific enough to IB/RDMA to make
drivers/infiniband/hw the better place for it.
This patch contains netdev tx, rx, napi and stats code.
If VNIC is a ULP, then shouldn't most of the VNIC-specific code reside in the
ULP directory or in drivers/net/ethernet?


Re: [PATCH v2 1/2] sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications

2017-02-07 Thread kbuild test robot
Hi Stefan,

[auto build test WARNING on net-next/master]
[also build test WARNING on v4.10-rc7]
[cannot apply to next-20170207]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Stefan-Br-ns/sierra_net-Add-support-for-IPv6-and-Dual-Stack-Link-Sense-Indications/20170207-105111
config: x86_64-randconfig-b0-02081035 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/net/usb/sierra_net.c: In function 'sierra_net_bind':
>> drivers/net/usb/sierra_net.c:687: warning: unused variable 'eth'

vim +/eth +687 drivers/net/usb/sierra_net.c

f7385ec9 Ming Lei  2012-10-24  671  if (result < 0)
eb4fd8cd Elina Pasheva 2010-04-27  672  return -EIO;
eb4fd8cd Elina Pasheva 2010-04-27  673  
f7385ec9 Ming Lei  2012-10-24  674  *datap = le16_to_cpu(attrdata);
eb4fd8cd Elina Pasheva 2010-04-27  675  return result;
eb4fd8cd Elina Pasheva 2010-04-27  676  }
eb4fd8cd Elina Pasheva 2010-04-27  677  
eb4fd8cd Elina Pasheva 2010-04-27  678  /*
eb4fd8cd Elina Pasheva 2010-04-27  679   * collects the bulk endpoints, the 
status endpoint.
eb4fd8cd Elina Pasheva 2010-04-27  680   */
eb4fd8cd Elina Pasheva 2010-04-27  681  static int sierra_net_bind(struct 
usbnet *dev, struct usb_interface *intf)
eb4fd8cd Elina Pasheva 2010-04-27  682  {
eb4fd8cd Elina Pasheva 2010-04-27  683  u8  ifacenum;
eb4fd8cd Elina Pasheva 2010-04-27  684  u8  numendpoints;
eb4fd8cd Elina Pasheva 2010-04-27  685  u16 fwattr = 0;
eb4fd8cd Elina Pasheva 2010-04-27  686  int status;
eb4fd8cd Elina Pasheva 2010-04-27 @687  struct ethhdr *eth;
eb4fd8cd Elina Pasheva 2010-04-27  688  struct sierra_net_data *priv;
eb4fd8cd Elina Pasheva 2010-04-27  689  static const u8 
sync_tmplate[sizeof(priv->sync_msg)] = {
eb4fd8cd Elina Pasheva 2010-04-27  690  0x00, 0x00, 
SIERRA_NET_HIP_MSYNC_ID, 0x00};
eb4fd8cd Elina Pasheva 2010-04-27  691  static const u8 
shdwn_tmplate[sizeof(priv->shdwn_msg)] = {
eb4fd8cd Elina Pasheva 2010-04-27  692  0x00, 0x00, 
SIERRA_NET_HIP_SHUTD_ID, 0x00};
eb4fd8cd Elina Pasheva 2010-04-27  693  
eb4fd8cd Elina Pasheva 2010-04-27  694  dev_dbg(&dev->udev->dev, "%s", 
__func__);
eb4fd8cd Elina Pasheva 2010-04-27  695  

:: The code at line 687 was first introduced by commit
:: eb4fd8cd355c8ec425a12ec6cbdac614e8a4819d net/usb: add sierra_net.c driver

:: TO: Elina Pasheva <epash...@sierrawireless.com>
:: CC: David S. Miller <da...@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: Extending socket timestamping API for NTP

2017-02-07 Thread Denny Page
[Resend without rich text]

On Feb 07, 2017, at 12:17, sdncurious  wrote:
>  If the NTP has access to the physical layer, then the timestamps are
>associated with the beginning of the symbol after the start of frame.
>Otherwise, implementations should attempt to associate the timestamp
>to the earliest accessible point in the frame.

The spec is unfortunately a bit ambiguous and probably should be clarified.

NTP is sensitive to transmission asymmetry. While using the SFD is appropriate 
for transmit timestamps, it is not appropriate for receive timestamps. A simple 
reason for this is port speed mismatch. Consider a 1Gb entity communicating 
with a 100Mb entity on a local switch: leaving aside internal switch delays, if 
SFD timestamping is used for both transmit and receive, then there is a baked 
in asymmetry of 6768ns between the forward and reverse paths; if SFD is used 
for transmit, and FCS end is used for receive, there is no asymmetry.
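
As a rough check of the 6768ns figure, here is a small sketch under two assumptions that are mine rather than stated above: the switch is store-and-forward, and the frame after the SFD is 94 bytes (Ethernet header, IPv4 header, UDP header, a 48-byte NTP packet without extension fields, and the FCS, with no VLAN tag). The function names are invented for the example.

```c
#include <stdint.h>

/*
 * Assumed frame after the SFD: MAC addresses + EtherType (14), IPv4 header
 * (20), UDP header (8), NTP packet without extensions (48), FCS (4).
 */
enum { FRAME_BYTES = 14 + 20 + 8 + 48 + 4 };

/* Time in nanoseconds to serialize `bytes` bytes at `mbps` megabits/s. */
static uint64_t wire_time_ns(uint64_t bytes, uint64_t mbps)
{
	return bytes * 8 * 1000 / mbps;
}

/*
 * With SFD timestamps on both transmit and receive, each direction only
 * includes the serialization time on the sender's link (the switch has to
 * receive the whole frame before forwarding it), so the two paths differ by
 * the difference of the two serialization times.
 */
static uint64_t sfd_asymmetry_ns(uint64_t bytes, uint64_t fast_mbps,
				 uint64_t slow_mbps)
{
	return wire_time_ns(bytes, slow_mbps) - wire_time_ns(bytes, fast_mbps);
}
```

Under these assumptions the frame takes 7520ns on the 100Mb link and 752ns on the 1Gb link, and the difference is 6768ns. If receive timestamps are taken at the end of the FCS instead, both directions include both serialization times and the difference cancels.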

There is a good explanation of this written by David Mills (NTP's author) here: 
https://www.eecis.udel.edu/~mills/stamp.html#require

Denny



Re: [PATCH] net: fix description of skb_find_text() according to removed functionality

2017-02-07 Thread David Miller

How about you make edits to this interface when you add an in-tree
user as we mentioned in our responses to your previous patch?

Thank you.


Re: [PATCH v2 1/5] bpf: Add missing header to the library

2017-02-07 Thread Wangnan (F)

Please add me to the cc list of all 5 patches.

Thank you.

On 2017/2/7 4:40, Mickaël Salaün wrote:

Include stddef.h to define size_t.

Signed-off-by: Mickaël Salaün 
Cc: Alexei Starovoitov 
Cc: Arnaldo Carvalho de Melo 
Cc: Daniel Borkmann 
Cc: Wang Nan 
---
  tools/lib/bpf/bpf.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index a2f9853dd882..df6e186da788 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -22,6 +22,7 @@
  #define __BPF_BPF_H
  
  #include <linux/bpf.h>

+#include <stddef.h>
  
  int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,

   int max_entries, __u32 map_flags);





Re: [PATCH v3 1/5] bpf: Add missing header to the library

2017-02-07 Thread Wangnan (F)



On 2017/2/8 4:56, Mickaël Salaün wrote:

Include stddef.h to define size_t.

Signed-off-by: Mickaël Salaün 
Cc: Alexei Starovoitov 
Cc: Arnaldo Carvalho de Melo 
Cc: Daniel Borkmann 
Cc: Wang Nan 
---
  tools/lib/bpf/bpf.h | 1 +
  1 file changed, 1 insertion(+)

diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index a2f9853dd882..df6e186da788 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -22,6 +22,7 @@
  #define __BPF_BPF_H
  
  #include <linux/bpf.h>

+#include <stddef.h>
  
  int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,

   int max_entries, __u32 map_flags);

Looks good to me.

Thank you.



[PATCH] net: fix description of skb_find_text() according to removed functionality

2017-02-07 Thread Igor Pylypiv
Textsearch state parameter was moved to local scope of the function.
This eliminates usage of textsearch_next() to find subsequent occurrences.

Fixes: 59a2440fd3cf ("net: Remove state argument from skb_find_text()")
Signed-off-by: Igor Pylypiv 
---
 net/core/skbuff.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9ccba86..90366c5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2919,9 +2919,8 @@ static void skb_ts_finish(struct ts_config *conf, struct 
ts_state *state)
  * @config: textsearch configuration
  *
  * Finds a pattern in the skb data according to the specified
- * textsearch configuration. Use textsearch_next() to retrieve
- * subsequent occurrences of the pattern. Returns the offset
- * to the first occurrence or UINT_MAX if no match was found.
+ * textsearch configuration. Returns the offset to the first
+ * occurrence or UINT_MAX if no match was found.
  */
 unsigned int skb_find_text(struct sk_buff *skb, unsigned int from,
   unsigned int to, struct ts_config *config)
--
2.7.4



Re: [PATCH v3 2/5] bpf: Simplify bpf_load_program() error handling in the library

2017-02-07 Thread Wangnan (F)



On 2017/2/8 4:56, Mickaël Salaün wrote:

Do not call bpf(2) a second time when a program load fails.


BPF_PROG_LOAD should succeed most of the time. Setting log_level to
0 by default and requesting the log buffer only on failure makes the
normal case faster.

Thank you.
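
The load-then-retry pattern argued for above (the behaviour the patch below removes) can be sketched as follows. This is an illustration under stated assumptions, not the libbpf code: load_fn is a hypothetical stand-in for the BPF_PROG_LOAD call that models only the log arguments, and fake_load() is a stub that exists only so the sketch can be exercised.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the BPF_PROG_LOAD call: returns an fd >= 0 on
 * success or a negative value on failure.
 */
typedef int (*load_fn)(void *ctx, char *log_buf, size_t log_buf_sz,
		       int log_level);

/* Try without the verifier log first; ask for the log only on failure. */
static int load_with_lazy_log(load_fn load, void *ctx, char *log_buf,
			      size_t log_buf_sz)
{
	int fd = load(ctx, NULL, 0, 0);

	if (fd >= 0 || !log_buf || !log_buf_sz)
		return fd;

	/* Second attempt, this time with logging enabled. */
	log_buf[0] = 0;
	return load(ctx, log_buf, log_buf_sz, 1);
}

/* Example stub: counts calls and fails unless the "program" is valid. */
struct fake_prog {
	int valid;
	int calls;
};

static int fake_load(void *ctx, char *log_buf, size_t log_buf_sz,
		     int log_level)
{
	struct fake_prog *p = ctx;

	p->calls++;
	if (p->valid)
		return 3;
	if (log_level && log_buf && log_buf_sz > 1) {
		log_buf[0] = 'E';
		log_buf[1] = 0;
	}
	return -1;
}
```

The trade-off is one extra call on the failure path in exchange for skipping log generation on every successful load.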


Signed-off-by: Mickaël Salaün 
Cc: Alexei Starovoitov 
Cc: Arnaldo Carvalho de Melo 
Cc: Daniel Borkmann 
Cc: Wang Nan 
---
  tools/lib/bpf/bpf.c | 18 ++
  1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 3ddb58a36d3c..fda3f494f1cd 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -73,7 +73,6 @@ int bpf_load_program(enum bpf_prog_type type, struct bpf_insn 
*insns,
 size_t insns_cnt, char *license,
 __u32 kern_version, char *log_buf, size_t log_buf_sz)
  {
-   int fd;
union bpf_attr attr;
  
  	bzero(&attr, sizeof(attr));

@@ -81,20 +80,15 @@ int bpf_load_program(enum bpf_prog_type type, struct 
bpf_insn *insns,
attr.insn_cnt = (__u32)insns_cnt;
attr.insns = ptr_to_u64(insns);
attr.license = ptr_to_u64(license);
-   attr.log_buf = ptr_to_u64(NULL);
-   attr.log_size = 0;
-   attr.log_level = 0;
+   attr.log_buf = ptr_to_u64(log_buf);
+   attr.log_size = log_buf_sz;
attr.kern_version = kern_version;
  
-	fd = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));

-   if (fd >= 0 || !log_buf || !log_buf_sz)
-   return fd;
+   if (log_buf && log_buf_sz > 0) {
+   attr.log_level = 1;
+   log_buf[0] = 0;
+   }
  
-	/* Try again with log */

-   attr.log_buf = ptr_to_u64(log_buf);
-   attr.log_size = log_buf_sz;
-   attr.log_level = 1;
-   log_buf[0] = 0;
return sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
  }
  





Re: [PATCH net-next 6/7] openvswitch: Add force commit.

2017-02-07 Thread Joe Stringer
On 7 February 2017 at 17:03, Jarno Rajahalme  wrote:
>
> On Feb 7, 2017, at 2:15 PM, Joe Stringer  wrote:
>
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>
> Stateful network admission policy may allow connections to one
> direction and reject connections initiated in the other direction.
> After policy change it is possible that for a new connection an
> overlapping conntrack entry already exists, where the connection
> original direction is opposed to the new connection's initial packet.
>
> Most importantly, conntrack state relating to the current packet gets
> the "reply" designation based on whether the original direction tuple
> or the reply direction tuple matched.  If this "directionality" is
> wrong w.r.t. the stateful network admission policy it may happen
> that packets in neither direction are correctly admitted.
>
> This patch adds a new "force commit" option to the OVS conntrack
> action that checks the original direction of an existing conntrack
> entry.  If that direction is opposed to the current packet, the
> existing conntrack entry is deleted and a new one is subsequently
> created in the correct direction.
>
> Signed-off-by: Jarno Rajahalme 
> ---
> include/uapi/linux/openvswitch.h | 10 ++
> net/openvswitch/conntrack.c  | 27 +--
> 2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/openvswitch.h
> b/include/uapi/linux/openvswitch.h
> index 90af8b8..d5ba9a9 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -674,6 +674,10 @@ struct ovs_action_hash {
>  * @OVS_CT_ATTR_HELPER: variable length string defining conntrack ALG.
>  * @OVS_CT_ATTR_NAT: Nested OVS_NAT_ATTR_* for performing L3 network address
>  * translation (NAT) on the packet.
> + * @OVS_CT_ATTR_FORCE_COMMIT: Like %OVS_CT_ATTR_COMMIT, but instead of
> doing
> + * nothing if the connection is already committed will check that the
> current
> + * packet is in conntrack entry's original direction.  If directionality
> does
> + * not match, will delete the existing conntrack entry and commit a new
> one.
>  */
> enum ovs_ct_attr {
>OVS_CT_ATTR_UNSPEC,
> @@ -684,6 +688,12 @@ enum ovs_ct_attr {
>OVS_CT_ATTR_HELPER, /* netlink helper to assist detection of
>   related connections. */
>OVS_CT_ATTR_NAT,/* Nested OVS_NAT_ATTR_* */
> +   OVS_CT_ATTR_FORCE_COMMIT,  /* No argument, commits connection.  If
> the
> +   * conntrack entry original direction
> tuple
> +   * does not match the current packet
> header
> +   * values, will delete the current
> conntrack
> +   * entry and create a new one.
> +   */
>
>
> We only need one copy of the explanation, keep it above the enum, then
> the inline comment can be /* No argument */.
>
>
> OK.
>
>__OVS_CT_ATTR_MAX
> };
>
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 1afe153..1f27f44 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -65,6 +65,7 @@ struct ovs_conntrack_info {
>struct nf_conn *ct;
>u8 commit : 1;
>u8 nat : 3; /* enum ovs_ct_nat */
> +   u8 force : 1;
>u16 family;
>struct md_mark mark;
>struct md_labels labels;
> @@ -631,10 +632,13 @@ static bool skb_nfct_cached(struct net *net,
> */
>if (!ct && key->ct.state & OVS_CS_F_TRACKED &&
>!(key->ct.state & OVS_CS_F_INVALID) &&
> -   key->ct.zone == info->zone.id)
> +   key->ct.zone == info->zone.id) {
>ct = ovs_ct_find_existing(net, &info->zone, info->family,
> skb,
>  !!(key->ct.state
> & OVS_CS_F_NAT_MASK));
> +   if (ct)
> +   nf_ct_get(skb, &ctinfo);
> +   }
>
>
> If ctinfo is only used with the new call below, we can unconditionally
> fetch this just before it's used...
>
>if (!ct)
>return false;
>if (!net_eq(net, read_pnet(&ct->ct_net)))
> @@ -648,6 +652,19 @@ static bool skb_nfct_cached(struct net *net,
>if (help && rcu_access_pointer(help->helper) != info->helper)
>return false;
>}
> +   /* Force conntrack entry direction to the current packet? */
>
>
> Here.
>
>
> But then we would be executing nf_ct_get() twice in the common case?

Ah, fair enough. It's fine here.
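
For readers skimming the thread, the force commit semantics from the quoted commit message can be summarised as a small decision function. This is only a sketch of the described behaviour, not the patch's code; commit_decision() and its enum are names invented here.

```c
#include <stdbool.h>

enum commit_action {
	COMMIT_NEW,		/* no entry yet: create one */
	KEEP_EXISTING,		/* entry exists and is kept */
	DELETE_AND_RECOMMIT,	/* entry points the wrong way: replace it */
};

/*
 * have_entry:   a conntrack entry was found for the packet
 * pkt_is_reply: the packet matched the entry's reply direction tuple
 * force:        the force commit option is set on the action
 */
static enum commit_action commit_decision(bool have_entry, bool pkt_is_reply,
					  bool force)
{
	if (!have_entry)
		return COMMIT_NEW;
	if (force && pkt_is_reply)
		return DELETE_AND_RECOMMIT;
	return KEEP_EXISTING;
}
```

Without the force option an existing entry is always kept, which matches the plain commit behaviour described in the attribute documentation.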


[PATCH v3 0/2] Fixes for sierra_net driver

2017-02-07 Thread Stefan Brüns
When trying to initiate a dual-stack (ipv4v6) connection, a MC7710, FW
version SWI9200X_03.05.24.00ap answers with an unsupported LSI. Add support
for this LSI.
Also the link_type should be ignored when going idle, otherwise the modem
is stuck in a bad link state.
Tested on MC7710, T-Mobile DE, APN internet.telekom, IPv4v6 PDP type. Both
IPv4 and IPv6 connections work.

v2: Do not overwrite protocol field in rx_fixup
v3: Remove leftover struct ethhdr *eth declaration

Stefan Brüns (2):
  sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
  sierra_net: Skip validating irrelevant fields for IDLE LSIs

 drivers/net/usb/sierra_net.c | 111 +++
 1 file changed, 71 insertions(+), 40 deletions(-)

-- 
2.11.0



[PATCH v3 1/2] sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications

2017-02-07 Thread Stefan Brüns
If a context is configured as dualstack ("IPv4v6"), the modem indicates
the context activation with a slightly different indication message.
The dual-stack indication omits the link_type (IPv4/v6) and adds
additional address fields.
IPv6 LSIs are identical to IPv4 LSIs, but have a different link type.

Signed-off-by: Stefan Brüns 
---
v2: Do not overwrite protocol field in rx_fixup
v3: Remove leftover struct ethhdr *eth declaration

Example LSI LINK UP indication:

0000   00 ed 78 00 04 01 00 e9 0a 14 00 54 00 65 00 6c  ..x........T.e.l
0010   00 65 00 6b 00 6f 00 6d 00 2e 00 64 00 65 48 03  .e.k.o.m...d.eH.
0020   c8 be d1 00 62 00 00 00 2c 80 f0 01 00 00 00 00  ....b...,.......
0030   30 cb 04 4c 49 4e 4b 20 55 50 00 00 00 00 00 00  0..LINK UP......
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0050   00 00 00 00 04 0a 23 38 db 10 2a 01 05 98 88 c0  ......#8..*.....
0060   1f da 00 01 00 01 91 23 a8 f9 00 00 00 00 00 00  .......#........
0070   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0080   00 04 0a 4a d2 d2 10 2a 01 05 98 07 ff 00 00 00  ...J...*........
0090   10 00 74 02 10 02 10 04 0a 4a d2 d3 10 2a 01 05  ..t......J...*..
00a0   98 07 ff 00 00 00 10 00 74 02 10 02 11 00 00 00  ........t.......
00b0   00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00  ................
00c0   00 00 00 00 00 c3 50 04 00 00 00 00 10 fe 80 00  ......P.........
00d0   00 00 00 00 00 00 00 00 00 00 00 00 05 00 00 00  ................
00e0   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00f0   00                                               .
---
 drivers/net/usb/sierra_net.c | 101 ---
 1 file changed, 66 insertions(+), 35 deletions(-)

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index a251588762ec..6300a4454ae5 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -73,8 +73,6 @@ static atomic_t iface_counter = ATOMIC_INIT(0);
 /* Private data structure */
 struct sierra_net_data {
 
-   u8 ethr_hdr_tmpl[ETH_HLEN]; /* ethernet header template for rx'd pkts */
-
u16 link_up;/* air link up or down */
u8 tx_hdr_template[4];  /* part of HIP hdr for tx'd packets */
 
@@ -122,6 +120,7 @@ struct param {
 
 /* LSI Protocol types */
 #define SIERRA_NET_PROTOCOL_UMTS  0x01
+#define SIERRA_NET_PROTOCOL_UMTS_DS   0x04
 /* LSI Coverage */
 #define SIERRA_NET_COVERAGE_NONE  0x00
 #define SIERRA_NET_COVERAGE_NOPACKET  0x01
@@ -129,7 +128,8 @@ struct param {
 /* LSI Session */
 #define SIERRA_NET_SESSION_IDLE   0x00
 /* LSI Link types */
-#define SIERRA_NET_AS_LINK_TYPE_IPv4  0x00
+#define SIERRA_NET_AS_LINK_TYPE_IPV4  0x00
+#define SIERRA_NET_AS_LINK_TYPE_IPV6  0x02
 
 struct lsi_umts {
u8 protocol;
@@ -137,9 +137,14 @@ struct lsi_umts {
__be16 length;
/* eventually use a union for the rest - assume umts for now */
u8 coverage;
-   u8 unused2[41];
+   u8 network_len; /* network name len */
+   u8 network[40]; /* network name (UCS2, bigendian) */
u8 session_state;
u8 unused3[33];
+} __packed;
+
+struct lsi_umts_single {
+   struct lsi_umts lsi;
u8 link_type;
u8 pdp_addr_len; /* NW-supplied PDP address len */
u8 pdp_addr[16]; /* NW-supplied PDP address (bigendian)) */
@@ -158,10 +163,31 @@ struct lsi_umts {
u8 reserved[8];
 } __packed;
 
+struct lsi_umts_dual {
+   struct lsi_umts lsi;
+   u8 pdp_addr4_len; /* NW-supplied PDP IPv4 address len */
+   u8 pdp_addr4[4];  /* NW-supplied PDP IPv4 address (bigendian)) */
+   u8 pdp_addr6_len; /* NW-supplied PDP IPv6 address len */
+   u8 pdp_addr6[16]; /* NW-supplied PDP IPv6 address (bigendian)) */
+   u8 unused4[23];
+   u8 dns1_addr4_len; /* NW-supplied 1st DNS v4 address len (bigendian) */
+   u8 dns1_addr4[4];  /* NW-supplied 1st DNS v4 address */
+   u8 dns1_addr6_len; /* NW-supplied 1st DNS v6 address len */
+   u8 dns1_addr6[16]; /* NW-supplied 1st DNS v6 address (bigendian)*/
+   u8 dns2_addr4_len; /* NW-supplied 2nd DNS v4 address len (bigendian) */
+   u8 dns2_addr4[4];  /* NW-supplied 2nd DNS v4 address */
+   u8 dns2_addr6_len; /* NW-supplied 2nd DNS v6 address len */
+   u8 dns2_addr6[16]; /* NW-supplied 2nd DNS v6 address (bigendian)*/
+   u8 unused5[68];
+} __packed;
+
 #define SIERRA_NET_LSI_COMMON_LEN  4
-#define SIERRA_NET_LSI_UMTS_LEN(sizeof(struct lsi_umts))
+#define SIERRA_NET_LSI_UMTS_LEN(sizeof(struct lsi_umts_single))
 #define SIERRA_NET_LSI_UMTS_STATUS_LEN \
(SIERRA_NET_LSI_UMTS_LEN - SIERRA_NET_LSI_COMMON_LEN)
+#define SIERRA_NET_LSI_UMTS_DS_LEN (sizeof(struct lsi_umts_dual))
+#define SIERRA_NET_LSI_UMTS_DS_STATUS_LEN \
+   (SIERRA_NET_LSI_UMTS_DS_LEN - SIERRA_NET_LSI_COMMON_LEN)
 
 /* Forward definitions */
 static void sierra_sync_timer(unsigned long syncdata);
@@ -191,10 +217,11 @@ 

[PATCH v3 2/2] sierra_net: Skip validating irrelevant fields for IDLE LSIs

2017-02-07 Thread Stefan Brüns
When the context is deactivated, the link_type is set to 0xff, which
triggers a warning message, and results in a wrong link status, as
the LSI is ignored.

Signed-off-by: Stefan Brüns 
---
 drivers/net/usb/sierra_net.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index 6300a4454ae5..0b5a84c9022c 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -385,6 +385,13 @@ static int sierra_net_parse_lsi(struct usbnet *dev, char 
*data, int datalen)
return -1;
}
 
+   /* Validate the session state */
+   if (lsi->session_state == SIERRA_NET_SESSION_IDLE) {
+   netdev_err(dev->net, "Session idle, 0x%02x\n",
+  lsi->session_state);
+   return 0;
+   }
+
/* Validate the protocol  - only support UMTS for now */
if (lsi->protocol == SIERRA_NET_PROTOCOL_UMTS) {
struct lsi_umts_single *single = (struct lsi_umts_single *)lsi;
@@ -418,13 +425,6 @@ static int sierra_net_parse_lsi(struct usbnet *dev, char 
*data, int datalen)
return 0;
}
 
-   /* Validate the session state */
-   if (lsi->session_state == SIERRA_NET_SESSION_IDLE) {
-   netdev_err(dev->net, "Session idle, 0x%02x\n",
-   lsi->session_state);
-   return 0;
-   }
-
/* Set link_sense true */
return 1;
 }
-- 
2.11.0
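
The effect of the reordering can be sketched in standalone C. The struct and
function below are simplified, hypothetical stand-ins rather than the driver's
own definitions (only the constant values are taken from the patches), and the
-1 return for an unsupported link type is an assumption about how a rejected
LSI is reported. Only the order of the two checks mirrors the change.

```c
#include <stdint.h>

/* Stand-ins for the driver's constants (values as shown in the patches). */
#define SESSION_IDLE    0x00
#define LINK_TYPE_IPV4  0x00
#define LINK_TYPE_IPV6  0x02

/* Hypothetical, reduced view of an LSI: only the two fields of interest. */
struct lsi_view {
	uint8_t session_state;
	uint8_t link_type;
};

/*
 * Returns 1 for link up, 0 for link down, -1 when the LSI is rejected.
 * The session state is checked first, so an idle LSI never reaches the
 * link_type validation.
 */
static int parse_lsi(const struct lsi_view *lsi)
{
	if (lsi->session_state == SESSION_IDLE)
		return 0;

	if (lsi->link_type != LINK_TYPE_IPV4 &&
	    lsi->link_type != LINK_TYPE_IPV6)
		return -1;

	return 1;
}
```

With this ordering, an idle LSI carrying link_type 0xff is reported as link
down instead of being rejected.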



Re: Extending socket timestamping API for NTP

2017-02-07 Thread Denny Page
[Resend without rich text]

> On Feb 07, 2017, at 09:45, Keller, Jacob E  wrote:
> 
> The main problem here is that most hardware that *can't* timestamp all 
> packets is pretty limited to timestamping only PTP frames.


Most, but not all. The TI DP83630 doesn’t support timestamping for all packets, 
but it does support either PTP or NTP:

===
2.3.2.3 NTP Packet Timestamp
The DP83630 may be programmed to timestamp NTP packets instead of PTP packets. 
This operation is enabled by setting the NTP_TS_EN control in the PTP_TXCFG0 
register. When configured for NTP timestamps, the DP83630 will timestamp 
packets with the NTP UDP port number rather than the PTP port number (note that 
the device cannot be configured to timestamp both PTP and NTP packets). 
One-Step operation is not supported for NTP timestamps, so transmit timestamps 
cannot be inserted directly into outgoing NTP packets. Timestamp insertion is 
available for receive timestamps but must use a single, fixed location. 
===

Right now, there is no API to signal to the driver that NTP timestamping is 
desired.

Even if the hardware does not directly support filtering, it can be implemented 
in the driver.
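
A minimal sketch of such driver-side filtering, assuming the driver can look
at the UDP header of each frame before deciding whether to keep a timestamp.
The helper name and calling convention are hypothetical, not an existing
kernel API; only the NTP port number (123) is a known constant.

```c
#include <stddef.h>
#include <stdint.h>

#define NTP_UDP_PORT 123 /* IANA-assigned NTP port */

/*
 * Hypothetical helper: decide whether a UDP datagram should be
 * timestamped as NTP. `udp` points at the 8-byte UDP header.
 */
static int udp_is_ntp(const uint8_t *udp, size_t len)
{
	uint16_t sport, dport;

	if (len < 8) /* shorter than a UDP header */
		return 0;

	/* Ports are stored in network byte order (big-endian). */
	sport = (uint16_t)((udp[0] << 8) | udp[1]);
	dport = (uint16_t)((udp[2] << 8) | udp[3]);

	return sport == NTP_UDP_PORT || dport == NTP_UDP_PORT;
}
```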

Denny

linux-next: manual merge of the kspp tree with the net-next tree

2017-02-07 Thread Stephen Rothwell
Hi Kees,

Today's linux-next merge of the kspp tree got a conflict in:

  arch/Kconfig

between commit:

  1a8b6d76dc5b ("net:add one common config ARCH_WANT_RELAX_ORDER to support 
relax ordering")

from the net-next tree and commits:

  ad21fc4faa2a ("arch: Move CONFIG_DEBUG_RODATA and CONFIG_SET_MODULE_RONX to 
be common")
  0f5bf6d0afe4 ("arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONX")

from the kspp tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/Kconfig
index bd04eace455c,7425fde9c723..
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@@ -781,7 -843,38 +843,41 @@@ config VMAP_STAC
  the stack to map directly to the KASAN shadow map using a formula
  that is incorrect if the stack is in vmalloc space.
  
 +config ARCH_WANT_RELAX_ORDER
 +  bool
 +
+ config ARCH_OPTIONAL_KERNEL_RWX
+   def_bool n
+ 
+ config ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
+   def_bool n
+ 
+ config ARCH_HAS_STRICT_KERNEL_RWX
+   def_bool n
+ 
+ config STRICT_KERNEL_RWX
+   bool "Make kernel text and rodata read-only" if ARCH_OPTIONAL_KERNEL_RWX
+   depends on ARCH_HAS_STRICT_KERNEL_RWX
+   default !ARCH_OPTIONAL_KERNEL_RWX || ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
+   help
+ If this is set, kernel text and rodata memory will be made read-only,
+ and non-text memory will be made non-executable. This provides
+ protection against certain security exploits (e.g. executing the heap
+ or modifying text)
+ 
+ These features are considered standard security practice these days.
+ You should say Y here in almost all cases.
+ 
+ config ARCH_HAS_STRICT_MODULE_RWX
+   def_bool n
+ 
+ config STRICT_MODULE_RWX
+   bool "Set loadable kernel module data as NX and text as RO" if 
ARCH_OPTIONAL_KERNEL_RWX
+   depends on ARCH_HAS_STRICT_MODULE_RWX && MODULES
+   default !ARCH_OPTIONAL_KERNEL_RWX || ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
+   help
+ If this is set, module text and rodata memory will be made read-only,
+ and non-text memory will be made non-executable. This provides
+ protection against certain security exploits (e.g. writing to text)
+ 
  source "kernel/gcov/Kconfig"


Re: [RFC v3 02/11] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2017-02-07 Thread Vishwanathapura, Niranjana

On Wed, Feb 08, 2017 at 12:43:40AM +, Parav Pandit wrote:

@@ -2096,6 +2114,15 @@ struct ib_device {
   struct
ib_rwq_ind_table_init_attr *init_attr,
   struct ib_udata
*udata);
int(*destroy_rwq_ind_table)(struct 
ib_rwq_ind_table
*wq_ind_table);
+   /* rdma netdev operations */
+   struct net_device *(*alloc_rdma_netdev)(
+   struct ib_device *device,
+   u8 port_num,
+   enum rdma_netdev_t type,
+   const char *name,
+   unsigned char name_assign_type,
+   void (*setup)(struct net_device *));
+   void (*free_rdma_netdev)(struct net_device *netdev);
struct ib_dma_mapping_ops   *dma_ops;

struct module   *owner;


As it's clear from the cover letter and from the request to place this in 
drivers/infiniband/ulp,
instead of increasing the ib_dev structure further,
can you change the code to make use of ib_register_client() and friend 
functions to register vnic as a ULP
(similar to other ULPs such as uverbs, srp, ipoib)?
This will also allow you to get notified for removing the vnic device when the 
underlying rdma device gets removed.
Based on the property that gets exposed by the ibdev, the vnic driver filters 
whether it needs to load its vnic to a specific device or not.
This way modules are isolated between core and ULP a little better.
Would it work for you?


The HFI_VNIC driver is using ib_register_client() and friend functions. The 
patch below, from this series, does that.

[RFC v3 08/11] IB/hfi-vnic: VNIC Ethernet Management Agent (VEMA) function

Niranjana





Re: [PATCH net-next 7/7] openvswitch: Pack struct sw_flow_key.

2017-02-07 Thread Jarno Rajahalme

> On Feb 6, 2017, at 11:15 PM, Joe Stringer  wrote:
> 
> On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
>> struct sw_flow_key has two 16-bit holes. Move the most matched
>> conntrack match fields there.  In some typical cases this reduces the
>> size of the key that needs to be hashed into half and into one cache
>> line.
>> 
>> Signed-off-by: Jarno Rajahalme 
> 
> Looks like this misses the zeroing in ovs_nla_get_flow_metadata();
> might want to double-check for any other memset/copies of the key->ct
> field.

Good catch. Looked, there are no other places to change.

Will rebase to current net-next and repost.

  Jarno



Re: [RFC v3 00/11] HFI Virtual Network Interface Controller (VNIC)

2017-02-07 Thread Bart Van Assche
On Tue, 2017-02-07 at 16:54 -0800, Vishwanathapura, Niranjana wrote:
> On Tue, Feb 07, 2017 at 09:58:50PM +, Bart Van Assche wrote:
> > On Tue, 2017-02-07 at 21:44 +, Hefty, Sean wrote:
> > > This is Ethernet - not IP - encapsulation over a non-InfiniBand 
> > > device/protocol.
> > 
> > That's more than clear from the cover letter. In my opinion the cover letter
> > should explain why it is considered useful to have such a driver upstream
> > and what the use cases are of encapsulating Ethernet frames inside RDMA
> > packets.
> 
> We believe on our HW, HFI VNIC design gives better hardware resource usage 
> which is also scalable and hence room for better performance.
> Also as evident in the cover letter, it gives us better manageability by 
> defining virtual Ethernet switches overlaid on the fabric and
> use standard Ethernet support provided by Linux.

That kind of language is appropriate for a marketing brochure but not for a
technical forum. Even reading your statement twice did not make me any wiser.
You mentioned "better hardware resource usage". Compared to what? Is that
perhaps compared to IPoIB? Since Ethernet frames have an extra header and are
larger than IPoIB frames, how can larger frames result in better hardware
resource usage? And what is a virtual Ethernet switch? Is this perhaps packet
forwarding by software? If so, why are virtual Ethernet switches needed since
the Linux networking stack already supports packet forwarding?

Thanks,

Bart.


[PATCHv2 net-next 0/2] mv88e6xxx Watchdog support

2017-02-07 Thread Andrew Lunn
The Marvell switches have an in built watchdog over some of the
internal state machine. The watchdog can be configured to raise an
interrupt on error. The problem the watchdog found is then logged to
the kernel log.

The older switches can automagically perform a software reset when the
watchdog triggers. This just resets the internal state machine, but
leaves the switch configuration unchanged.

The 6390 family of switches cannot both raise an interrupt and
automagically perform a software reset. So the interrupt handler has
to perform the switch reset, and then re-enable the watchdog
interrupts.

This has been tested using hacked together debugfs code which allows
the "force" bit to be set, to cause a watchdog interrupt.

Andrew Lunn (2):
  net: dsa: mv88e6xxx: Add watchdog interrupt handler
  net: dsa: mv88e6xxx: Add mv88e6390 watchdog interrupt support

 drivers/net/dsa/mv88e6xxx/chip.c  |  23 ++
 drivers/net/dsa/mv88e6xxx/global2.c   | 137 +-
 drivers/net/dsa/mv88e6xxx/global2.h   |   6 ++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  33 
 4 files changed, 198 insertions(+), 1 deletion(-)

-- 
2.11.0




[PATCH] net: dsa: mv88e6xxx: Move forward declaration to where it is needed

2017-02-07 Thread Andrew Lunn
Move it out from the middle of the #defines to just before it is
needed.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index c284e7f1f294..6713776ad26b 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -660,8 +660,6 @@ enum mv88e6xxx_cap {
 MV88E6XXX_FLAGS_MULTI_CHIP |   \
 MV88E6XXX_FLAGS_PVT)
 
-struct mv88e6xxx_ops;
-
 #define MV88E6XXX_FLAGS_FAMILY_6390\
(MV88E6XXX_FLAG_EEE |   \
 MV88E6XXX_FLAG_GLOBAL2 |   \
@@ -672,6 +670,8 @@ struct mv88e6xxx_ops;
 MV88E6XXX_FLAGS_MULTI_CHIP |   \
 MV88E6XXX_FLAGS_PVT)
 
+struct mv88e6xxx_ops;
+
 struct mv88e6xxx_info {
enum mv88e6xxx_family family;
u16 prod_num;
-- 
2.11.0



Re: [PATCH 3/3] rhashtable: Add nested tables

2017-02-07 Thread Herbert Xu
On Tue, Feb 07, 2017 at 07:02:16PM +0100, Florian Westphal wrote:
>
> I can't really say anything here because *I* don't expect
> it to succeed.

Think about incoming TCP connections, you can't rate-limit that
without defeating yourself.

> Even with this proposed patch things will eventually fail
> on OOM conditions.

Exactly.  That's the only case where it should fail.  Previously
it would fail if we cannot allocate a large number of consecutive
pages, which can happen even if you have lots of memory left.  With
my patch it will only fail if it cannot allocate at most two non-
consecutive pages, i.e., a real OOM.
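
The difference can be sketched with some arithmetic. This is a two-level
simplification under an assumed 4096-byte page size, not rhashtable's actual
layout, and the function names are made up for illustration: a flat bucket
array needs one contiguous allocation that grows with the table, while a
page-per-leaf layout never asks for more than one page at a time.

```c
#include <stddef.h>

#define PAGE_BYTES      4096u /* assumed page size */
#define SLOTS_PER_PAGE  (PAGE_BYTES / sizeof(void *))

/* Contiguous bytes a flat bucket array of `nbuckets` pointers needs. */
static size_t flat_table_bytes(size_t nbuckets)
{
	return nbuckets * sizeof(void *);
}

/* Split a bucket index into (page, slot) for a page-per-leaf layout. */
static void nested_index(size_t bucket, size_t *page, size_t *slot)
{
	*page = bucket / SLOTS_PER_PAGE;
	*slot = bucket % SLOTS_PER_PAGE;
}
```

For a million buckets the flat array is a multi-megabyte contiguous request,
which is the kind of allocation that can fail on a fragmented system even with
plenty of free memory.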

> Also, such period should be very short until rht has reached
> peak size for the workload.

How would you know? The rate of insertions could be extremely
high.
 
> Also, given that we could easily oversubscribe a table by a factor
> of 10 or more while still keeping sane chain lengths I don't
> see why thats a problem (also, a 'rht_insert_force' or similar
> interface that doesn't do chain length checks makes it
> easy to spot places that need/want this behaviour).

But you would still have to impose a limit, whether it's 1 or
10.  IOW you will still fail insertions at some point even though
you have sufficient memory to perform the insertion.
 
> (insecure_elasticity and/or insecure_max_entries come to mind, seems
>  some of that might not even be needed anymore but I don't have time
>  right now to investigate).

Yes insecure_elasticity is now obsolete and anyone using it should
switch over to rhlist.  Once nobody uses it then it can be removed.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC v3 02/11] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2017-02-07 Thread Vishwanathapura, Niranjana

On Tue, Feb 07, 2017 at 03:19:25PM -0700, Jason Gunthorpe wrote:

On Tue, Feb 07, 2017 at 02:06:30PM -0800, Vishwanathapura, Niranjana wrote:


>>IB_DEVICE_RAW_SCATTER_FCS   = (1ULL << 34),
>>+   IB_DEVICE_RDMA_NETDEV_HFI_VNIC  = (1ULL << 35),
>
>Why is this called HFI_VNIC anyhow? Shouldn't this be OPA_VNIC? There
>is nothing really HFI specific, right?

Agreed, OPA_VNIC is more appropriate here. Will change it.


And probably lots of other places too.. :)



Well, our driver is called HFI1 and HFI_VNIC is in accordance with our naming 
convention. I will only change the above device attribute name to OPA_VNIC in 
the ib interface just to be consistent with other such definitions here.





>And this should be rn->dev_priv ?

Yah, both will result in same behavior. But yah, what you are suggesting
will remove any confusion. Will change in next PATCH series.


Only because the struct has no members, as soon as someone adds
something it would go booom.



Agreed.


Jason


Re: [RFC v3 00/11] HFI Virtual Network Interface Controller (VNIC)

2017-02-07 Thread Vishwanathapura, Niranjana

On Tue, Feb 07, 2017 at 09:58:50PM +, Bart Van Assche wrote:

On Tue, 2017-02-07 at 21:44 +, Hefty, Sean wrote:

This is Ethernet - not IP - encapsulation over a non-InfiniBand device/protocol.


That's more than clear from the cover letter. In my opinion the cover letter
should explain why it is considered useful to have such a driver upstream
and what the use cases are of encapsulating Ethernet frames inside RDMA
packets.



We believe on our HW, HFI VNIC design gives better hardware resource usage 
which is also scalable and hence room for better performance.
Also as evident in the cover letter, it gives us better manageability by 
defining virtual Ethernet switches overlaid on the fabric and
use standard Ethernet support provided by Linux.

Niranjana




RE: [RFC v3 02/11] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2017-02-07 Thread Parav Pandit
Hi 

> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Vishwanathapura, Niranjana
> Sent: Tuesday, February 7, 2017 2:23 PM
> To: dledf...@redhat.com
> Cc: linux-r...@vger.kernel.org; netdev@vger.kernel.org;
> dennis.dalessan...@intel.com; ira.we...@intel.com; Niranjana
> Vishwanathapura 
> Subject: [RFC v3 02/11] IB/hfi-vnic: Virtual Network Interface Controller
> (VNIC) interface
> 
> Add rdma netdev interface to ib device structure allowing rdma netdev
> devices to be allocated by ib clients.
> Define HFI VNIC interface between hardware independent VNIC
> functionality and the hardware dependent VNIC functionality.
> 
> Reviewed-by: Dennis Dalessandro 
> Reviewed-by: Ira Weiny 
> Signed-off-by: Niranjana Vishwanathapura
> 
> ---
>  struct ib_device {
>   struct device*dma_device;
> 
> @@ -2096,6 +2114,15 @@ struct ib_device {
>  struct
> ib_rwq_ind_table_init_attr *init_attr,
>  struct ib_udata
> *udata);
>   int(*destroy_rwq_ind_table)(struct 
> ib_rwq_ind_table
> *wq_ind_table);
> + /* rdma netdev operations */
> + struct net_device *(*alloc_rdma_netdev)(
> + struct ib_device *device,
> + u8 port_num,
> + enum rdma_netdev_t type,
> + const char *name,
> + unsigned char name_assign_type,
> + void (*setup)(struct net_device *));
> + void (*free_rdma_netdev)(struct net_device *netdev);
>   struct ib_dma_mapping_ops   *dma_ops;
> 
>   struct module   *owner;

As it's clear from the cover letter and from the request to place this in 
drivers/infiniband/ulp,
instead of increasing the ib_dev structure further,
can you change the code to make use of ib_register_client() and friend 
functions to register vnic as a ULP
(similar to other ULPs such as uverbs, srp, ipoib)?
This will also allow you to get notified for removing the vnic device when the 
underlying rdma device gets removed.
Based on the property that gets exposed by the ibdev, the vnic driver filters 
whether it needs to load its vnic to a specific device or not.
This way modules are isolated between core and ULP a little better.
Would it work for you?



[PATCHv2 net-next 2/2] net: dsa: mv88e6xxx: Add mv88e6390 watchdog interrupt support

2017-02-07 Thread Andrew Lunn
Implement the ops needed to support the watchdog for the MV88E6390
family.

Signed-off-by: Andrew Lunn 
---
v2
  Completely new.

drivers/net/dsa/mv88e6xxx/chip.c  |  7 +
 drivers/net/dsa/mv88e6xxx/global2.c   | 48 +++
 drivers/net/dsa/mv88e6xxx/global2.h   |  2 ++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 12 +
 4 files changed, 69 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 357a65d8f02f..7be4419db01f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3419,6 +3419,7 @@ static const struct mv88e6xxx_ops mv88e6190_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3446,6 +3447,7 @@ static const struct mv88e6xxx_ops mv88e6190x_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3473,6 +3475,7 @@ static const struct mv88e6xxx_ops mv88e6191_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3530,6 +3533,7 @@ static const struct mv88e6xxx_ops mv88e6290_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3754,6 +3758,7 @@ static const struct mv88e6xxx_ops mv88e6390_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3783,6 +3788,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3810,6 +3816,7 @@ static const struct mv88e6xxx_ops mv88e6391_ops = {
.stats_get_stats = mv88e6390_stats_get_stats,
.g1_set_cpu_port = mv88e6390_g1_set_cpu_port,
.g1_set_egress_port = mv88e6390_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6390_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6390_g1_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
diff --git a/drivers/net/dsa/mv88e6xxx/global2.c 
b/drivers/net/dsa/mv88e6xxx/global2.c
index 200c08196fa0..e065c00df9e5 100644
--- a/drivers/net/dsa/mv88e6xxx/global2.c
+++ b/drivers/net/dsa/mv88e6xxx/global2.c
@@ -686,6 +686,54 @@ const struct mv88e6xxx_irq_ops mv88e6097_watchdog_ops = {
.irq_free = mv88e6097_watchdog_free,
 };
 
+static int mv88e6390_watchdog_setup(struct mv88e6xxx_chip *chip)
+{
+   return mv88e6xxx_g2_update(chip, GLOBAL2_WDOG_CONTROL,
+  GLOBAL2_WDOG_INT_ENABLE |
+  GLOBAL2_WDOG_CUT_THROUGH |
+  GLOBAL2_WDOG_QUEUE_CONTROLLER |
+  GLOBAL2_WDOG_EGRESS |
+  GLOBAL2_WDOG_FORCE_IRQ);
+}
+
+static int mv88e6390_watchdog_action(struct mv88e6xxx_chip *chip, int irq)
+{
+   int err;
+   u16 reg;
+
+   mv88e6xxx_g2_write(chip, GLOBAL2_WDOG_CONTROL, GLOBAL2_WDOG_EVENT);
+   err = mv88e6xxx_g2_read(chip, GLOBAL2_WDOG_CONTROL, &reg);
+
+   dev_info(chip->dev, "Watchdog event: 0x%04x",
+reg & GLOBAL2_WDOG_DATA_MASK);
+
+   mv88e6xxx_g2_write(chip, GLOBAL2_WDOG_CONTROL, GLOBAL2_WDOG_HISTORY);
+   err = mv88e6xxx_g2_read(chip, GLOBAL2_WDOG_CONTROL, &reg);
+
+   dev_info(chip->dev, "Watchdog history: 0x%04x",
+reg & GLOBAL2_WDOG_DATA_MASK);
+
+   /* Trigger a software reset to try to recover the switch */
+   if (chip->info->ops->reset)
+   chip->info->ops->reset(chip);
+
+   mv88e6390_watchdog_setup(chip);
+
+   return IRQ_HANDLED;
+}
+
+static void mv88e6390_watchdog_free(struct mv88e6xxx_chip *chip)
+{
+   mv88e6xxx_g2_update(chip, 

[PATCHv2 net-next 1/2] net: dsa: mv88e6xxx: Add watchdog interrupt handler

2017-02-07 Thread Andrew Lunn
The switch contains a watchdog looking for issues with the internal
gubbins of the switch. Hook the interrupt the watchdog triggers and
log the value of the control register indicating why the watchdog
fired. The watchdog can only be cleared with a switch reset, which
will destroy the current configuration. Rather than doing this, just
disable the interrupt.

The mv88e6390 family has different watchdog registers. So use an ops
structure, so support for the mv88e6390 family can be added later.

Signed-off-by: Andrew Lunn 
---
v2:
  Use ops and exclude the 6390 family
  Add missing locks in the IRQ handler
---
 drivers/net/dsa/mv88e6xxx/chip.c  | 16 +++
 drivers/net/dsa/mv88e6xxx/global2.c   | 89 ++-
 drivers/net/dsa/mv88e6xxx/global2.h   |  4 ++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 21 +
 4 files changed, 129 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 7b4e40b286e4..357a65d8f02f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3111,6 +3111,7 @@ static const struct mv88e6xxx_ops mv88e6085_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.ppu_enable = mv88e6185_g1_ppu_enable,
.ppu_disable = mv88e6185_g1_ppu_disable,
@@ -3179,6 +3180,7 @@ static const struct mv88e6xxx_ops mv88e6123_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3205,6 +3207,7 @@ static const struct mv88e6xxx_ops mv88e6131_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.ppu_enable = mv88e6185_g1_ppu_enable,
.ppu_disable = mv88e6185_g1_ppu_disable,
@@ -3232,6 +3235,7 @@ static const struct mv88e6xxx_ops mv88e6161_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3250,6 +3254,7 @@ static const struct mv88e6xxx_ops mv88e6165_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3276,6 +3281,7 @@ static const struct mv88e6xxx_ops mv88e6171_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3304,6 +3310,7 @@ static const struct mv88e6xxx_ops mv88e6172_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3330,6 +3337,7 @@ static const struct mv88e6xxx_ops mv88e6175_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3358,6 +3366,7 @@ static const struct mv88e6xxx_ops mv88e6176_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.reset = mv88e6352_g1_reset,
 };
@@ -3380,6 +3389,7 @@ static const struct mv88e6xxx_ops mv88e6185_ops = {
.stats_get_stats = mv88e6095_stats_get_stats,
.g1_set_cpu_port = mv88e6095_g1_set_cpu_port,
.g1_set_egress_port = mv88e6095_g1_set_egress_port,
+   .g2_watchdog_ops = &mv88e6097_watchdog_ops,
.mgmt_rsvd2cpu = mv88e6095_g2_mgmt_rsvd2cpu,
.ppu_enable = 

[PATCH net-next] bpf, lpm: fix overflows in trie_alloc checks

2017-02-07 Thread Daniel Borkmann
Cap the maximum (total) value size and bail out if larger than KMALLOC_MAX_SIZE
as otherwise it doesn't make any sense to proceed further, since we're
guaranteed to fail to allocate elements anyway in lpm_trie_node_alloc();
likelihood of failure is still high for large values, though, similarly
as with htab case in non-prealloc.

Next, make sure that cost vars are really u64 instead of size_t, so that we
don't overflow on 32 bit and charge only tiny map.pages against memlock while
allowing huge max_entries; cap also the max cost like we do with other map
types.

Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation")
Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 kernel/bpf/lpm_trie.c | 36 +++-
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index 144e976..e0f6a0b 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -394,10 +394,21 @@ static int trie_delete_elem(struct bpf_map *map, void 
*key)
return -ENOSYS;
 }
 
+#define LPM_DATA_SIZE_MAX  256
+#define LPM_DATA_SIZE_MIN  1
+
+#define LPM_VAL_SIZE_MAX   (KMALLOC_MAX_SIZE - LPM_DATA_SIZE_MAX - \
+sizeof(struct lpm_trie_node))
+#define LPM_VAL_SIZE_MIN   1
+
+#define LPM_KEY_SIZE(X)(sizeof(struct bpf_lpm_trie_key) + (X))
+#define LPM_KEY_SIZE_MAX   LPM_KEY_SIZE(LPM_DATA_SIZE_MAX)
+#define LPM_KEY_SIZE_MIN   LPM_KEY_SIZE(LPM_DATA_SIZE_MIN)
+
 static struct bpf_map *trie_alloc(union bpf_attr *attr)
 {
-   size_t cost, cost_per_node;
struct lpm_trie *trie;
+   u64 cost = sizeof(*trie), cost_per_node;
int ret;
 
if (!capable(CAP_SYS_ADMIN))
@@ -406,9 +417,10 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
/* check sanity of attributes */
if (attr->max_entries == 0 ||
attr->map_flags != BPF_F_NO_PREALLOC ||
-   attr->key_size < sizeof(struct bpf_lpm_trie_key) + 1   ||
-   attr->key_size > sizeof(struct bpf_lpm_trie_key) + 256 ||
-   attr->value_size == 0)
+   attr->key_size < LPM_KEY_SIZE_MIN ||
+   attr->key_size > LPM_KEY_SIZE_MAX ||
+   attr->value_size < LPM_VAL_SIZE_MIN ||
+   attr->value_size > LPM_VAL_SIZE_MAX)
return ERR_PTR(-EINVAL);
 
trie = kzalloc(sizeof(*trie), GFP_USER | __GFP_NOWARN);
@@ -426,18 +438,24 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 
cost_per_node = sizeof(struct lpm_trie_node) +
attr->value_size + trie->data_size;
-   cost = sizeof(*trie) + attr->max_entries * cost_per_node;
+   cost += (u64) attr->max_entries * cost_per_node;
+   if (cost >= U32_MAX - PAGE_SIZE) {
+   ret = -E2BIG;
+   goto out_err;
+   }
+
trie->map.pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT;
 
ret = bpf_map_precharge_memlock(trie->map.pages);
-   if (ret) {
-   kfree(trie);
-   return ERR_PTR(ret);
-   }
+   if (ret)
+   goto out_err;
 
raw_spin_lock_init(&trie->lock);
 
return &trie->map;
+out_err:
+   kfree(trie);
+   return ERR_PTR(ret);
 }
 
 static void trie_free(struct bpf_map *map)
-- 
1.9.3



Re: [PATCH net-next v3 04/11] bpf: Use bpf_load_program() from the library

2017-02-07 Thread Alexei Starovoitov

On 2/7/17 1:44 PM, Mickaël Salaün wrote:

-   union bpf_attr attr;
+   union bpf_attr attr = {};

-   bzero(&attr, sizeof(attr));


I think somebody mentioned that there are compilers out there
that don't do it correctly, hence it was done with explicit bzero.
Arnaldo, Wang, do you remember the details?
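
For context on the question above: as far as I know, C does not
guarantee that an initializer zeroes padding bytes, or the bytes of a
union beyond the member it initializes, and the empty "{}" form was a
GNU extension before C23. An explicit memset() (or bzero()) clears every
byte. The sketch below uses a hypothetical union, demo_attr, as a
stand-in for bpf_attr; it is not the real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a union such as bpf_attr: the members have
 * different sizes, so initializing the first one need not touch every
 * byte of the object. */
union demo_attr {
	unsigned int map_type;
	struct {
		unsigned int prog_type;
		unsigned long long insns;
		char pad[32];
	} load;
};

/* Zero every byte of the object, padding included. */
void demo_attr_clear(union demo_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
}

/* Return 1 if all sizeof(*attr) bytes are zero, 0 otherwise. */
int demo_attr_is_zero(const union demo_attr *attr)
{
	const unsigned char *p = (const unsigned char *)attr;
	size_t i;

	for (i = 0; i < sizeof(*attr); i++)
		if (p[i])
			return 0;
	return 1;
}
```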


[PATCH net-next] bridge: vlan tunnel id info range fill size calc cleanups

2017-02-07 Thread Roopa Prabhu
From: Roopa Prabhu 

This fixes a bug and cleans up tunnelid range size
calculation code by using consistent variable names
and checks in size calculation and fill functions.

tested for a few cases of vlan-vni range mappings:
(output from patched iproute2):
$bridge vlan showtunnel
port     vid        tunid
vxlan0   100-105    1000-1005
         200        2000
         210        2100
         211-213    2100-2102
         214        2104
         216-217    2108-2109
         219        2119

Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
Reported-by: Colin Ian King 
Signed-off-by: Roopa Prabhu 
---
 net/bridge/br_netlink_tunnel.c |   34 --
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/net/bridge/br_netlink_tunnel.c b/net/bridge/br_netlink_tunnel.c
index 4c1303b..3a0eb54 100644
--- a/net/bridge/br_netlink_tunnel.c
+++ b/net/bridge/br_netlink_tunnel.c
@@ -30,18 +30,18 @@ static size_t __get_vlan_tinfo_size(void)
  nla_total_size(sizeof(u16)); /* IFLA_BRIDGE_VLAN_TUNNEL_FLAGS 
*/
 }
 
-static bool vlan_tunnel_id_isrange(struct net_bridge_vlan *v,
-  struct net_bridge_vlan *v_end)
+static bool vlan_tunid_inrange(struct net_bridge_vlan *v_curr,
+  struct net_bridge_vlan *v_last)
 {
-   __be32 tunid_curr = tunnel_id_to_key32(v->tinfo.tunnel_id);
-   __be32 tunid_end = tunnel_id_to_key32(v_end->tinfo.tunnel_id);
+   __be32 tunid_curr = tunnel_id_to_key32(v_curr->tinfo.tunnel_id);
+   __be32 tunid_last = tunnel_id_to_key32(v_last->tinfo.tunnel_id);
 
-   return (be32_to_cpu(tunid_curr) - be32_to_cpu(tunid_end)) == 1;
+   return (be32_to_cpu(tunid_curr) - be32_to_cpu(tunid_last)) == 1;
 }
 
 static int __get_num_vlan_tunnel_infos(struct net_bridge_vlan_group *vg)
 {
-   struct net_bridge_vlan *v, *v_start = NULL, *v_end = NULL;
+   struct net_bridge_vlan *v, *vtbegin = NULL, *vtend = NULL;
int num_tinfos = 0;
 
/* Count number of vlan infos */
@@ -50,27 +50,25 @@ static int __get_num_vlan_tunnel_infos(struct 
net_bridge_vlan_group *vg)
if (!br_vlan_should_use(v) || !v->tinfo.tunnel_id)
continue;
 
-   if (!v_start) {
+   if (!vtbegin) {
goto initvars;
-   } else if ((v->vid - v_end->vid) == 1 &&
-  vlan_tunnel_id_isrange(v_end, v) == 1) {
-   v_end = v;
+   } else if ((v->vid - vtend->vid) == 1 &&
+  vlan_tunid_inrange(v, vtend)) {
+   vtend = v;
continue;
} else {
-   if ((v_end->vid - v->vid) > 0 &&
-   vlan_tunnel_id_isrange(v_end, v) > 0)
+   if ((vtend->vid - vtbegin->vid) > 0)
num_tinfos += 2;
else
num_tinfos += 1;
}
 initvars:
-   v_start = v;
-   v_end = v;
+   vtbegin = v;
+   vtend = v;
}
 
-   if (v_start) {
-   if ((v_end->vid - v->vid) > 0 &&
-   vlan_tunnel_id_isrange(v_end, v) > 0)
+   if (vtbegin && vtend) {
+   if ((vtend->vid - vtbegin->vid) > 0)
num_tinfos += 2;
else
num_tinfos += 1;
@@ -171,7 +169,7 @@ int br_fill_vlan_tunnel_info(struct sk_buff *skb,
if (!vtbegin) {
goto initvars;
} else if ((v->vid - vtend->vid) == 1 &&
-   vlan_tunnel_id_isrange(v, vtend)) {
+   vlan_tunid_inrange(v, vtend)) {
vtend = v;
continue;
} else {
-- 
1.7.10.4
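
The counting rule in the patch above can be sketched outside the kernel.
This is an illustration under assumptions, not the bridge code: struct
demo_vlan and the demo_* helpers are hypothetical, and the sample table
is the vlan-to-tunnel-id mapping from the commit message. A run of
consecutive vids with consecutive tunnel ids costs two entries (range
start and range end); a lone mapping costs one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_vlan {
	uint16_t vid;
	uint32_t tunid;	/* 0 means "no tunnel mapping"; entry is skipped */
};

/* Sample mappings from the 'bridge vlan showtunnel' output above. */
const struct demo_vlan demo_map[] = {
	{100, 1000}, {101, 1001}, {102, 1002}, {103, 1003}, {104, 1004},
	{105, 1005}, {200, 2000}, {210, 2100}, {211, 2100}, {212, 2101},
	{213, 2102}, {214, 2104}, {216, 2108}, {217, 2109}, {219, 2119},
};

/* A range [begin, end] needs two entries, a single vlan needs one. */
static int demo_entries(const struct demo_vlan *begin,
			const struct demo_vlan *end)
{
	return (end->vid - begin->vid) > 0 ? 2 : 1;
}

/* Count the entries needed for n mappings sorted by vid. */
int demo_count_tinfos(const struct demo_vlan *v, size_t n)
{
	const struct demo_vlan *begin = NULL, *end = NULL;
	int num = 0;
	size_t i;

	assert(v != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!v[i].tunid)
			continue;
		/* Extend the current range only if both vid and tunnel id
		 * advance by exactly one. */
		if (begin && v[i].vid - end->vid == 1 &&
		    v[i].tunid - end->tunid == 1) {
			end = &v[i];
			continue;
		}
		if (begin)
			num += demo_entries(begin, end);
		begin = end = &v[i];
	}
	if (begin)
		num += demo_entries(begin, end);
	return num;
}
```

For the sample table this gives 10 entries: three ranges at two entries
each and four single mappings at one each.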



Re: [iproute PATCH 1/2] testsuite: skip link show test on big endian systems

2017-02-07 Thread Phil Sutter
On Tue, Feb 07, 2017 at 03:12:49PM -0800, Stephen Hemminger wrote:
> On Wed,  8 Feb 2017 00:04:21 +0100
> Phil Sutter  wrote:
> 
> > Netlink protocol is in host byte order, so the provided binary netlink
> > message buffer being in little endian format will cause the test to fail
> > on big endian systems.
> > 
> > Signed-off-by: Phil Sutter 
> 
> Maybe better to figure out how to generate the files in host order?

Yes, I had thought about that as well. Not sure whether it's worth the
effort though to write a program which constructs the messages and dumps
them into a file for 'ip monitor' to read. I think there's a certain
chance this will eventually test the message builder instead of iproute.
:)

Cheers, Phil
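
To illustrate the byte-order point discussed above: a length field
stored as little-endian bytes reads back correctly through a host-order
load only on a little-endian machine. The helpers below are a sketch
with hypothetical names; they are not part of iproute2 or its testsuite.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read 4 bytes the way the running CPU lays out a uint32_t. */
uint32_t demo_read_host32(const unsigned char *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Read 4 bytes as little endian, regardless of the host. */
uint32_t demo_read_le32(const unsigned char *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 on a little-endian host, 0 otherwise. */
int demo_host_is_little_endian(void)
{
	const unsigned char one[4] = {1, 0, 0, 0};

	return demo_read_host32(one) == 1;
}
```

A dump recorded on x86 with a 20-byte message length starts with the
bytes 14 00 00 00; a big-endian host loading that field in host order
sees 0x14000000 instead of 20.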


Re: gro_cells: move to net/core/gro_cells.c

2017-02-07 Thread Eric Dumazet
On Tue, 2017-02-07 at 15:37 -0800, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> We have many gro cells users, so let's move the code to avoid
> duplication.
> 
> This creates a CONFIG_GRO_CELLS option.
> 
> Signed-off-by: Eric Dumazet 

This is targeting net-next tree, just in case the lack of 
[PATCH net-next] is confusing :/




gro_cells: move to net/core/gro_cells.c

2017-02-07 Thread Eric Dumazet
From: Eric Dumazet 

We have many gro cells users, so let's move the code to avoid
duplication.

This creates a CONFIG_GRO_CELLS option.

Signed-off-by: Eric Dumazet 
---
 drivers/net/Kconfig |3 +
 include/net/gro_cells.h |   86 +--
 net/Kconfig |4 +
 net/core/Makefile   |1 
 net/core/gro_cells.c|   92 ++
 net/ipv4/Kconfig|1 
 net/ipv6/Kconfig|1 
 net/xfrm/Kconfig|1 
 8 files changed, 107 insertions(+), 82 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 
95c32f2d7601b9180d43b77e1c143d4988f5..a993cbeb9e0c84326a63369226b14ed00870 
100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -170,6 +170,7 @@ config VXLAN
tristate "Virtual eXtensible Local Area Network (VXLAN)"
depends on INET
select NET_UDP_TUNNEL
+   select GRO_CELLS
---help---
  This allows one to create vxlan virtual interfaces that provide
  Layer 2 Networks over Layer 3 Networks. VXLAN is often used
@@ -184,6 +185,7 @@ config GENEVE
tristate "Generic Network Virtualization Encapsulation"
depends on INET && NET_UDP_TUNNEL
select NET_IP_TUNNEL
+   select GRO_CELLS
---help---
  This allows one to create geneve virtual interfaces that provide
  Layer 2 Networks over Layer 3 Networks. GENEVE is often used
@@ -216,6 +218,7 @@ config MACSEC
select CRYPTO
select CRYPTO_AES
select CRYPTO_GCM
+   select GRO_CELLS
---help---
   MACsec is an encryption standard for Ethernet.
 
diff --git a/include/net/gro_cells.h b/include/net/gro_cells.h
index 
2a1abbf8da74368cd01adc40cef6c0644e05..fcaf8f47913054543e97d606518f78eabf06 
100644
--- a/include/net/gro_cells.h
+++ b/include/net/gro_cells.h
@@ -5,92 +5,14 @@
 #include 
 #include 
 
-struct gro_cell {
-   struct sk_buff_head napi_skbs;
-   struct napi_struct  napi;
-};
+struct gro_cell;
 
 struct gro_cells {
struct gro_cell __percpu*cells;
 };
 
-static inline int gro_cells_receive(struct gro_cells *gcells, struct sk_buff 
*skb)
-{
-   struct gro_cell *cell;
-   struct net_device *dev = skb->dev;
-
-   if (!gcells->cells || skb_cloned(skb) || !(dev->features & NETIF_F_GRO))
-   return netif_rx(skb);
-
-   cell = this_cpu_ptr(gcells->cells);
-
-   if (skb_queue_len(&cell->napi_skbs) > netdev_max_backlog) {
-   atomic_long_inc(&dev->rx_dropped);
-   kfree_skb(skb);
-   return NET_RX_DROP;
-   }
-
-   __skb_queue_tail(&cell->napi_skbs, skb);
-   if (skb_queue_len(&cell->napi_skbs) == 1)
-   napi_schedule(&cell->napi);
-   return NET_RX_SUCCESS;
-}
-
-/* called under BH context */
-static inline int gro_cell_poll(struct napi_struct *napi, int budget)
-{
-   struct gro_cell *cell = container_of(napi, struct gro_cell, napi);
-   struct sk_buff *skb;
-   int work_done = 0;
-
-   while (work_done < budget) {
-   skb = __skb_dequeue(&cell->napi_skbs);
-   if (!skb)
-   break;
-   napi_gro_receive(napi, skb);
-   work_done++;
-   }
-
-   if (work_done < budget)
-   napi_complete_done(napi, work_done);
-   return work_done;
-}
-
-static inline int gro_cells_init(struct gro_cells *gcells, struct net_device 
*dev)
-{
-   int i;
-
-   gcells->cells = alloc_percpu(struct gro_cell);
-   if (!gcells->cells)
-   return -ENOMEM;
-
-   for_each_possible_cpu(i) {
-   struct gro_cell *cell = per_cpu_ptr(gcells->cells, i);
-
-   __skb_queue_head_init(&cell->napi_skbs);
-
-   set_bit(NAPI_STATE_NO_BUSY_POLL, &cell->napi.state);
-
-   netif_napi_add(dev, &cell->napi, gro_cell_poll, 64);
-   napi_enable(&cell->napi);
-   }
-   return 0;
-}
-
-static inline void gro_cells_destroy(struct gro_cells *gcells)
-{
-   int i;
-
-   if (!gcells->cells)
-   return;
-   for_each_possible_cpu(i) {
-   struct gro_cell *cell = per_cpu_ptr(gcells->cells, i);
-
-   netif_napi_del(&cell->napi);
-   __skb_queue_purge(&cell->napi_skbs);
-   }
-   free_percpu(gcells->cells);
-   gcells->cells = NULL;
-}
+int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb);
+int gro_cells_init(struct gro_cells *gcells, struct net_device *dev);
+void gro_cells_destroy(struct gro_cells *gcells);
 
 #endif
diff --git a/net/Kconfig b/net/Kconfig
index 
2f2842d2d3edde3f13574923953d4c33aff0..f19c0c3b9589757856502bc91398e6ee17d8 
100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -413,6 +413,10 @@ config DST_CACHE
bool
default n
 
+config GRO_CELLS
+   bool
+   default n
+
 config NET_DEVLINK
tristate "Network physical/parent device Netlink interface"
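
The receive and poll logic that the gro_cells patch moves out of the
header can be modelled in a few lines. This is a simplified sketch with
hypothetical names (struct demo_cell, demo_cell_receive, demo_cell_poll):
it only tracks counters, whereas the kernel code queues real skbs and
schedules NAPI.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_RX_SUCCESS = 0, DEMO_RX_DROP = 1 };

struct demo_cell {
	size_t qlen;			/* packets waiting to be polled */
	size_t max_backlog;		/* stand-in for netdev_max_backlog */
	unsigned long scheduled;	/* times the poller was kicked */
	unsigned long dropped;		/* packets refused */
};

/* Enqueue one packet; drop it if the backlog is already exceeded. */
int demo_cell_receive(struct demo_cell *cell)
{
	assert(cell != NULL);

	if (cell->qlen > cell->max_backlog) {
		cell->dropped++;
		return DEMO_RX_DROP;
	}
	cell->qlen++;
	/* Kick the poller only when the queue goes from empty to
	 * non-empty; later packets ride on the same poll run. */
	if (cell->qlen == 1)
		cell->scheduled++;
	return DEMO_RX_SUCCESS;
}

/* Drain up to 'budget' packets and return how many were handled. */
size_t demo_cell_poll(struct demo_cell *cell, size_t budget)
{
	size_t done = 0;

	assert(cell != NULL);

	while (done < budget && cell->qlen) {
		cell->qlen--;
		done++;
	}
	return done;
}
```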
   

Re: [PATCH v2] xen-netfront: Improve error handling during initialization

2017-02-07 Thread Boris Ostrovsky
On 02/07/2017 09:55 AM, Ross Lagerwall wrote:
> This fixes a crash when running out of grant refs when creating many
> queues across many netdevs.
>
> * If creating queues fails (i.e. there are no grant refs available),
> call xenbus_dev_fatal() to ensure that the xenbus device is set to the
> closed state.
> * If no queues are created, don't call xennet_disconnect_backend as
> netdev->real_num_tx_queues will not have been set correctly.
> * If setup_netfront() fails, ensure that all the queues created are
> cleaned up, not just those that have been set up.
> * If any queues were set up and an error occurs, call
> xennet_destroy_queues() to clean up the napi context.
> * If any fatal error occurs, unregister and destroy the netdev to avoid
> leaving around a half setup network device.
>
> Signed-off-by: Ross Lagerwall 
> ---
>
> Changed in V2:
> * Retested on top of v4.10-rc7 + "xen-netfront: Delete rx_refill_timer
>   in xennet_disconnect_backend()".
> * Don't move setup_timer as it is not necessary.
>
>  drivers/net/xen-netfront.c | 33 +++--
>  1 file changed, 15 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 722fe9f..5399a86 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -1823,27 +1823,23 @@ static int talk_to_netback(struct xenbus_device *dev,
>   xennet_destroy_queues(info);
>  
>   err = xennet_create_queues(info, &num_queues);
> - if (err < 0)
> - goto destroy_ring;
> + if (err < 0) {
> + xenbus_dev_fatal(dev, err, "creating queues");
> + if (num_queues > 0) {
> + goto destroy_ring;

The only way for us to have (err<0) && (num_queues>0) is when we get a
-ENOMEM right at the top, isn't it? So there is nothing to disconnect or
destroy, it seems to me. And if that's true you can directly 'goto out'.

-boris

> + } else {
> + kfree(info->queues);
> + info->queues = NULL;
> + goto out;
> + }
> + }
>  
>   /* Create shared ring, alloc event channel -- for each queue */
>   for (i = 0; i < num_queues; ++i) {
>   queue = &info->queues[i];
>   err = setup_netfront(dev, queue, feature_split_evtchn);
> - if (err) {
> - /* setup_netfront() will tidy up the current
> -  * queue on error, but we need to clean up
> -  * those already allocated.
> -  */
> - if (i > 0) {
> - rtnl_lock();
> - netif_set_real_num_tx_queues(info->netdev, i);
> - rtnl_unlock();
> - goto destroy_ring;
> - } else {
> - goto out;
> - }
> - }
> + if (err)
> + goto destroy_ring;
>   }
>  
>  again:
> @@ -1933,9 +1929,10 @@ static int talk_to_netback(struct xenbus_device *dev,
>   xenbus_transaction_end(xbt, 1);
>   destroy_ring:
>   xennet_disconnect_backend(info);
> - kfree(info->queues);
> - info->queues = NULL;
> + xennet_destroy_queues(info);
>   out:
> + unregister_netdev(info->netdev);
> + xennet_free_netdev(info->netdev);
>   return err;
>  }
>  
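
The error-handling shape under review above, one cleanup label that
tears down everything created so far, can be sketched in user space as
follows. All names here (struct demo_dev, demo_setup_ring, demo_connect)
are hypothetical and the "rings" are plain counters; this is not the
xen-netfront code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_dev {
	int *queues;	/* per-queue state */
	int num_queues;	/* queues allocated */
	int rings_up;	/* rings successfully set up */
};

/* Pretend to set up ring i; fail when i == fail_at. */
static int demo_setup_ring(struct demo_dev *d, int i, int fail_at)
{
	if (i == fail_at)
		return -EIO;
	d->queues[i] = 1;
	d->rings_up++;
	return 0;
}

/* Create all queues, then all rings; unwind completely on any error. */
int demo_connect(struct demo_dev *d, int num_queues, int fail_at)
{
	int err, i;

	assert(d != NULL && num_queues > 0);

	d->queues = calloc((size_t)num_queues, sizeof(*d->queues));
	if (!d->queues) {
		err = -ENOMEM;
		goto out;	/* nothing was created, nothing to undo */
	}
	d->num_queues = num_queues;

	for (i = 0; i < num_queues; i++) {
		err = demo_setup_ring(d, i, fail_at);
		if (err)
			goto destroy_ring;
	}
	return 0;

destroy_ring:
	/* Tear down every queue that was created, not only the ones
	 * whose ring setup succeeded. */
	d->rings_up = 0;
	free(d->queues);
	d->queues = NULL;
	d->num_queues = 0;
out:
	return err;
}
```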




Re: [iproute PATCH 1/2] testsuite: skip link show test on big endian systems

2017-02-07 Thread Stephen Hemminger
On Wed,  8 Feb 2017 00:04:21 +0100
Phil Sutter  wrote:

> Netlink protocol is in host byte order, so the provided binary netlink
> message buffer being in little endian format will cause the test to fail
> on big endian systems.
> 
> Signed-off-by: Phil Sutter 

Maybe better to figure out how to generate the files in host order?


[PATCH net-next v2 02/12] net: cgroups: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

net/core/netprio_cgroup.c:303:16: error: expected declaration specifiers or 
'...' before string constant
MODULE_LICENSE("GPL v2");
   ^~~~

Add linux/module.h to fix this.

Signed-off-by: Russell King 
---
 net/core/netprio_cgroup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 2ec86fc552df..756637dc7a57 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include <linux/module.h>
 #include 
 #include 
 #include 
-- 
2.9.3



[PATCH net-next v2 00/12] net: dsa: remove unnecessary phy.h include

2017-02-07 Thread Florian Fainelli
Hi all,

Including phy.h and phy_fixed.h into net/dsa.h causes phy*.h to be an
unnecessary dependency for quite a large amount of the kernel.  There's
very little which actually requires definitions from phy.h in net/dsa.h
- the include itself only wants the declaration of a couple of
structures and IFNAMSIZ.

Add linux/if.h for IFNAMSIZ, declarations for the structures, phy.h to
mv88e6xxx.h as it needs it for phy_interface_t, and remove both phy.h
and phy_fixed.h from net/dsa.h.

This patch reduces the number of files rebuilt from around 800 to around
40 - even with ccache, the time difference is noticeable.

In order to make this change, several drivers need to be updated to
include necessary headers that they were picking up through this
include.  This has resulted in a much larger patch series.

I'm assuming the 0-day builder has had 24 hours with this series, and
hasn't reported any further issues with it - the last issue was two
weeks ago (before I became ill) which I fixed over the last weekend.

I'm hoping this doesn't conflict with what's already in net-next...

David, this should probably go via your tree considering the diffstat.

Changes in v2:

- took Russell's patch series
- removed Qualcomm EMAC patch
- rebased against net-next/master

Russell King (12):
  net: sunrpc: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: cgroups: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: macb: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: lan78xx: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: bgmac: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: fman: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: mvneta: fix build errors when linux/phy*.h is removed from
net/dsa.h
  iscsi: fix build errors when linux/phy*.h is removed from net/dsa.h
  MIPS: Octeon: Remove unnecessary MODULE_*()
  net: liquidio: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: ath5k: fix build errors when linux/phy*.h is removed from
net/dsa.h
  net: dsa: remove unnecessary phy*.h includes

 arch/mips/cavium-octeon/octeon-platform.c | 4 
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 1 +
 drivers/net/ethernet/broadcom/bgmac.c | 2 ++
 drivers/net/ethernet/cadence/macb.h   | 2 ++
 drivers/net/ethernet/cavium/liquidio/lio_main.c   | 1 +
 drivers/net/ethernet/cavium/liquidio/lio_vf_main.c| 1 +
 drivers/net/ethernet/cavium/liquidio/octeon_console.c | 1 +
 drivers/net/ethernet/freescale/fman/fman_memac.c  | 1 +
 drivers/net/ethernet/marvell/mvneta.c | 1 +
 drivers/net/usb/lan78xx.c | 1 +
 drivers/net/wireless/ath/ath5k/ahb.c  | 2 +-
 drivers/target/iscsi/iscsi_target_login.c | 1 +
 include/net/dsa.h | 5 +++--
 net/core/netprio_cgroup.c | 1 +
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c| 1 +
 15 files changed, 18 insertions(+), 7 deletions(-)

-- 
2.9.3
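
The cover letter above relies on a standard C property: a header that
only stores or passes pointers to a structure can forward-declare it
instead of including the header that defines it. A minimal sketch, with
hypothetical demo_* names rather than the real net/dsa.h contents:

```c
#include <assert.h>
#include <stddef.h>

struct demo_phy;	/* forward declaration: the layout is not needed here */

struct demo_port {
	struct demo_phy *phy;	/* a pointer to an incomplete type is fine */
	char name[16];		/* stand-in for an IFNAMSIZ-sized buffer */
};

/* Only the pointer value is inspected, so struct demo_phy can stay
 * incomplete in this translation unit. */
int demo_port_has_phy(const struct demo_port *port)
{
	assert(port != NULL);
	return port->phy != NULL;
}
```

Code that dereferences the pointer still needs the full definition,
which is why the individual drivers in this series gain their own
includes.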



[PATCH net-next v2 04/12] net: lan78xx: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/net/usb/lan78xx.c:394:33: sparse: expected ; at end of declaration
drivers/net/usb/lan78xx.c:394:33: sparse: Expected } at end of 
struct-union-enum-specifier
drivers/net/usb/lan78xx.c:394:33: sparse: got interface
drivers/net/usb/lan78xx.c:403:1: sparse: Expected ; at the end of type 
declaration
drivers/net/usb/lan78xx.c:403:1: sparse: got }

Add linux/phy.h to lan78xx.c

Signed-off-by: Russell King 
---
 drivers/net/usb/lan78xx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 08f8703e4d54..9889a70ff4f6 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include <linux/phy.h>
 #include "lan78xx.h"
 
 #define DRIVER_AUTHOR  "WOOJUNG HUH "
-- 
2.9.3



[PATCH net-next v2 03/12] net: macb: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/net/ethernet/cadence/macb.h:862:33: sparse: expected ; at end of 
declaration
drivers/net/ethernet/cadence/macb.h:862:33: sparse: Expected } at end of 
struct-union-enum-specifier
drivers/net/ethernet/cadence/macb.h:862:33: sparse: got phy_interface
drivers/net/ethernet/cadence/macb.h:877:1: sparse: Expected ; at the end of 
type declaration
drivers/net/ethernet/cadence/macb.h:877:1: sparse: got }
In file included from drivers/net/ethernet/cadence/macb_pci.c:29:0:
drivers/net/ethernet/cadence/macb.h:862:2: error: unknown type name 
'phy_interface_t'
 phy_interface_t  phy_interface;
 ^~~

Add linux/phy.h to macb.h

Signed-off-by: Russell King 
Acked-by: Nicolas Ferre 
---
 drivers/net/ethernet/cadence/macb.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.h 
b/drivers/net/ethernet/cadence/macb.h
index a2cf91223003..234a49eaccfd 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -10,6 +10,8 @@
 #ifndef _MACB_H
 #define _MACB_H
 
+#include <linux/phy.h>
+
 #define MACB_GREGS_NBR 16
 #define MACB_GREGS_VERSION 2
 #define MACB_MAX_QUEUES 8
-- 
2.9.3



[PATCH net-next v2 01/12] net: sunrpc: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

Removing linux/phy.h from net/dsa.h reveals a build error in the sunrpc
code:

net/sunrpc/xprtrdma/svc_rdma_backchannel.c: In function 'xprt_rdma_bc_put':
net/sunrpc/xprtrdma/svc_rdma_backchannel.c:277:2: error: implicit declaration 
of function 'module_put' [-Werror=implicit-function-declaration]
net/sunrpc/xprtrdma/svc_rdma_backchannel.c: In function 'xprt_setup_rdma_bc':
net/sunrpc/xprtrdma/svc_rdma_backchannel.c:348:7: error: implicit declaration 
of function 'try_module_get' [-Werror=implicit-function-declaration]

Fix this by adding linux/module.h to svc_rdma_backchannel.c

Signed-off-by: Russell King 
Acked-by: Anna Schumaker 
---
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c 
b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
index 288e35c2d8f4..cb1e48e54eb1 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
@@ -4,6 +4,7 @@
  * Support for backward direction RPCs on RPC/RDMA (server-side).
  */
 
+#include <linux/module.h>
 #include 
 #include "xprt_rdma.h"
 
-- 
2.9.3



[PATCH] net: mellanox: switchx2: use new api ethtool_{get|set}_link_ksettings

2017-02-07 Thread Philippe Reynes
The ethtool API {get|set}_settings is deprecated.
Move this driver to the new API, {get|set}_link_ksettings.

As I don't have the hardware, I'd be grateful if
someone could test this patch.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/mellanox/mlxsw/switchx2.c |   47 +++-
 1 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c 
b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
index 169193e..ec1e886 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/switchx2.c
@@ -733,7 +733,7 @@ static u32 mlxsw_sx_from_ptys_advert_link(u32 
ptys_eth_proto)
 }
 
 static void mlxsw_sx_from_ptys_speed_duplex(bool carrier_ok, u32 
ptys_eth_proto,
-   struct ethtool_cmd *cmd)
+   struct ethtool_link_ksettings *cmd)
 {
u32 speed = SPEED_UNKNOWN;
u8 duplex = DUPLEX_UNKNOWN;
@@ -750,8 +750,8 @@ static void mlxsw_sx_from_ptys_speed_duplex(bool 
carrier_ok, u32 ptys_eth_proto,
}
}
 out:
-   ethtool_cmd_speed_set(cmd, speed);
-   cmd->duplex = duplex;
+   cmd->base.speed = speed;
+   cmd->base.duplex = duplex;
 }
 
 static u8 mlxsw_sx_port_connector_port(u32 ptys_eth_proto)
@@ -776,8 +776,9 @@ static u8 mlxsw_sx_port_connector_port(u32 ptys_eth_proto)
return PORT_OTHER;
 }
 
-static int mlxsw_sx_port_get_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+static int
+mlxsw_sx_port_get_link_ksettings(struct net_device *dev,
+struct ethtool_link_ksettings *cmd)
 {
struct mlxsw_sx_port *mlxsw_sx_port = netdev_priv(dev);
struct mlxsw_sx *mlxsw_sx = mlxsw_sx_port->mlxsw_sx;
@@ -785,6 +786,7 @@ static int mlxsw_sx_port_get_settings(struct net_device 
*dev,
u32 eth_proto_cap;
u32 eth_proto_admin;
u32 eth_proto_oper;
+   u32 supported, advertising, lp_advertising;
int err;
 
mlxsw_reg_ptys_eth_pack(ptys_pl, mlxsw_sx_port->local_port, 0);
@@ -796,18 +798,24 @@ static int mlxsw_sx_port_get_settings(struct net_device 
*dev,
mlxsw_reg_ptys_eth_unpack(ptys_pl, &eth_proto_cap,
  &eth_proto_admin, &eth_proto_oper);
 
-   cmd->supported = mlxsw_sx_from_ptys_supported_port(eth_proto_cap) |
+   supported = mlxsw_sx_from_ptys_supported_port(eth_proto_cap) |
 mlxsw_sx_from_ptys_supported_link(eth_proto_cap) |
 SUPPORTED_Pause | SUPPORTED_Asym_Pause;
-   cmd->advertising = mlxsw_sx_from_ptys_advert_link(eth_proto_admin);
+   advertising = mlxsw_sx_from_ptys_advert_link(eth_proto_admin);
mlxsw_sx_from_ptys_speed_duplex(netif_carrier_ok(dev),
eth_proto_oper, cmd);
 
eth_proto_oper = eth_proto_oper ? eth_proto_oper : eth_proto_cap;
-   cmd->port = mlxsw_sx_port_connector_port(eth_proto_oper);
-   cmd->lp_advertising = mlxsw_sx_from_ptys_advert_link(eth_proto_oper);
+   cmd->base.port = mlxsw_sx_port_connector_port(eth_proto_oper);
+   lp_advertising = mlxsw_sx_from_ptys_advert_link(eth_proto_oper);
+
+   ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
+   supported);
+   ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
+   advertising);
+   ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.lp_advertising,
+   lp_advertising);
 
-   cmd->transceiver = XCVR_INTERNAL;
return 0;
 }
 
@@ -847,8 +855,9 @@ static u32 mlxsw_sx_to_ptys_upper_speed(u32 upper_speed)
return ptys_proto;
 }
 
-static int mlxsw_sx_port_set_settings(struct net_device *dev,
- struct ethtool_cmd *cmd)
+static int
+mlxsw_sx_port_set_link_ksettings(struct net_device *dev,
+const struct ethtool_link_ksettings *cmd)
 {
struct mlxsw_sx_port *mlxsw_sx_port = netdev_priv(dev);
struct mlxsw_sx *mlxsw_sx = mlxsw_sx_port->mlxsw_sx;
@@ -857,13 +866,17 @@ static int mlxsw_sx_port_set_settings(struct net_device 
*dev,
u32 eth_proto_new;
u32 eth_proto_cap;
u32 eth_proto_admin;
+   u32 advertising;
bool is_up;
int err;
 
-   speed = ethtool_cmd_speed(cmd);
+   speed = cmd->base.speed;
+
+   ethtool_convert_link_mode_to_legacy_u32(&advertising,
+   cmd->link_modes.advertising);
 
-   eth_proto_new = cmd->autoneg == AUTONEG_ENABLE ?
-   mlxsw_sx_to_ptys_advert_link(cmd->advertising) :
+   eth_proto_new = cmd->base.autoneg == AUTONEG_ENABLE ?
+   mlxsw_sx_to_ptys_advert_link(advertising) :
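
The conversion between a legacy 32-bit mask and a wider link-mode
bitmap, which the patch above depends on, can be modelled in user space.
This is a sketch under assumptions: the demo_* functions, the 64-bit
bitmap width, and the return convention (1 when nothing above bit 31 is
set) are illustrative and are not the kernel helpers themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define DEMO_LINK_MODE_BITS	64	/* assumed bitmap width */
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_LONGS \
	((DEMO_LINK_MODE_BITS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Clear the bitmap, then copy the legacy mask into its low 32 bits. */
void demo_u32_to_link_mode(unsigned long *dst, uint32_t legacy)
{
	assert(dst != NULL);
	memset(dst, 0, DEMO_LONGS * sizeof(*dst));
	dst[0] = legacy;
}

/*
 * Copy the low 32 bits back into *legacy. Returns 1 if that was
 * lossless, 0 if a bit above bit 31 was set and had to be dropped.
 */
int demo_link_mode_to_u32(uint32_t *legacy, const unsigned long *src)
{
	size_t i;
	int lossless = 1;

	assert(legacy != NULL && src != NULL);

	for (i = 32; i < DEMO_LINK_MODE_BITS; i++)
		if ((src[i / DEMO_BITS_PER_LONG] >> (i % DEMO_BITS_PER_LONG)) & 1)
			lossless = 0;

	*legacy = (uint32_t)(src[0] & 0xffffffffUL);
	return lossless;
}
```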

[PATCH net-next v2 05/12] net: bgmac: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/net/ethernet/broadcom/bgmac.c:1015:17: error: dereferencing pointer to 
incomplete type 'struct mii_bus'
drivers/net/ethernet/broadcom/bgmac.c:1185:2: error: implicit declaration of 
function 'phy_start' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1198:2: error: implicit declaration of 
function 'phy_stop' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1239:9: error: implicit declaration of 
function 'phy_mii_ioctl' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1389:28: error: 
'phy_ethtool_get_link_ksettings' undeclared here (not in a function)
drivers/net/ethernet/broadcom/bgmac.c:1390:28: error: 
'phy_ethtool_set_link_ksettings' undeclared here (not in a function)
drivers/net/ethernet/broadcom/bgmac.c:1403:13: error: dereferencing pointer to 
incomplete type 'struct phy_device'
drivers/net/ethernet/broadcom/bgmac.c:1417:3: error: implicit declaration of 
function 'phy_print_status' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1424:26: error: storage size of 
'fphy_status' isn't known
drivers/net/ethernet/broadcom/bgmac.c:1424:9: error: variable 'fphy_status' has 
initializer but incomplete type
drivers/net/ethernet/broadcom/bgmac.c:1425:11: warning: excess elements in 
struct initializer
drivers/net/ethernet/broadcom/bgmac.c:1425:3: error: unknown field 'link' 
specified in initializer
drivers/net/ethernet/broadcom/bgmac.c:1426:12: note: in expansion of macro 
'SPEED_1000'
drivers/net/ethernet/broadcom/bgmac.c:1426:3: error: unknown field 'speed' 
specified in initializer
drivers/net/ethernet/broadcom/bgmac.c:1427:13: note: in expansion of macro 
'DUPLEX_FULL'
drivers/net/ethernet/broadcom/bgmac.c:1427:3: error: unknown field 'duplex' 
specified in initializer
drivers/net/ethernet/broadcom/bgmac.c:1432:12: error: implicit declaration of 
function 'fixed_phy_register' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1432:31: error: 'PHY_POLL' undeclared 
(first use in this function)
drivers/net/ethernet/broadcom/bgmac.c:1438:8: error: implicit declaration of 
function 'phy_connect_direct' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1439:6: error: 'PHY_INTERFACE_MODE_MII' 
undeclared (first use in this function)
drivers/net/ethernet/broadcom/bgmac.c:1521:2: error: implicit declaration of 
function 'phy_disconnect' [-Werror=implicit-function-declaration]
drivers/net/ethernet/broadcom/bgmac.c:1541:15: error: expected declaration 
specifiers or '...' before string constant

Add linux/phy.h to bgmac.c

Signed-off-by: Russell King 
Acked-by: Rafał Miłecki 
---
 drivers/net/ethernet/broadcom/bgmac.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c 
b/drivers/net/ethernet/broadcom/bgmac.c
index fe88126b1e0c..20fe2520da42 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -12,6 +12,8 @@
 #include 
 #include 
 #include 
+#include <linux/phy.h>
+#include <linux/phy_fixed.h>
 #include "bgmac.h"
 
 static bool bgmac_wait_value(struct bgmac *bgmac, u16 reg, u32 mask,
-- 
2.9.3



[PATCH net-next v2 09/12] MIPS: Octeon: Remove unnecessary MODULE_*()

2017-02-07 Thread Florian Fainelli
From: Russell King 

octeon-platform.c cannot be built as a module for two reasons:

(a) the Makefile doesn't allow it:
obj-y := cpu.o setup.o octeon-platform.o octeon-irq.o csrc-octeon.o

(b) the multiple *_initcall() statements, each of which is translated
to a module_init() call when attempting a module build, become
aliases to init_module().  Having more than one alias will cause a
build error.

Hence, rather than adding a linux/module.h include, remove the redundant
MODULE_*() from this file.

Acked-by: David Daney 
Signed-off-by: Russell King 
---
 arch/mips/cavium-octeon/octeon-platform.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-platform.c 
b/arch/mips/cavium-octeon/octeon-platform.c
index 37a932d9148c..8297ce714c5e 100644
--- a/arch/mips/cavium-octeon/octeon-platform.c
+++ b/arch/mips/cavium-octeon/octeon-platform.c
@@ -1060,7 +1060,3 @@ static int __init octeon_publish_devices(void)
return of_platform_bus_probe(NULL, octeon_ids, NULL);
 }
 arch_initcall(octeon_publish_devices);
-
-MODULE_AUTHOR("David Daney ");
-MODULE_LICENSE("GPL");
-MODULE_DESCRIPTION("Platform driver for Octeon SOC");
-- 
2.9.3



[PATCH net-next v2 11/12] net: ath5k: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

Fix these errors reported by the 0-day builder by replacing the
linux/export.h include with linux/module.h.

In file included from include/linux/platform_device.h:14:0,
 from drivers/net/wireless/ath/ath5k/ahb.c:20:
include/linux/device.h:1463:1: warning: data definition has no type or storage 
class
 module_init(__driver##_init); \
 ^
include/linux/platform_device.h:228:2: note: in expansion of macro 
'module_driver'
  module_driver(__platform_driver, platform_driver_register, \
  ^
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 
'module_platform_driver'
 module_platform_driver(ath_ahb_driver);
 ^~
include/linux/device.h:1463:1: error: type defaults to 'int' in declaration of 
'module_init' [-Werror=implicit-int]
 module_init(__driver##_init); \
 ^
include/linux/platform_device.h:228:2: note: in expansion of macro 
'module_driver'
  module_driver(__platform_driver, platform_driver_register, \
  ^
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 
'module_platform_driver'
 module_platform_driver(ath_ahb_driver);
 ^~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: warning: parameter names (without 
types) in function declaration
In file included from include/linux/platform_device.h:14:0,
 from drivers/net/wireless/ath/ath5k/ahb.c:20:
include/linux/device.h:1468:1: warning: data definition has no type or storage 
class
 module_exit(__driver##_exit);
 ^
include/linux/platform_device.h:228:2: note: in expansion of macro 
'module_driver'
  module_driver(__platform_driver, platform_driver_register, \
  ^
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 
'module_platform_driver'
 module_platform_driver(ath_ahb_driver);
 ^~
include/linux/device.h:1468:1: error: type defaults to 'int' in declaration of 
'module_exit' [-Werror=implicit-int]
 module_exit(__driver##_exit);
 ^
include/linux/platform_device.h:228:2: note: in expansion of macro 
'module_driver'
  module_driver(__platform_driver, platform_driver_register, \
  ^
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 
'module_platform_driver'
 module_platform_driver(ath_ahb_driver);
 ^~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: warning: parameter names (without 
types) in function declaration
In file included from include/linux/platform_device.h:14:0,
 from drivers/net/wireless/ath/ath5k/ahb.c:20:
drivers/net/wireless/ath/ath5k/ahb.c:233:24: warning: 'ath_ahb_driver_exit' 
defined but not used [-Wunused-function]
 module_platform_driver(ath_ahb_driver);
^
include/linux/device.h:1464:20: note: in definition of macro 'module_driver'
 static void __exit __driver##_exit(void) \
^~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 
'module_platform_driver'
 module_platform_driver(ath_ahb_driver);
 ^~
drivers/net/wireless/ath/ath5k/ahb.c:233:24: warning: 'ath_ahb_driver_init' 
defined but not used [-Wunused-function]
 module_platform_driver(ath_ahb_driver);
^
include/linux/device.h:1459:19: note: in definition of macro 'module_driver'
 static int __init __driver##_init(void) \
   ^~~~
drivers/net/wireless/ath/ath5k/ahb.c:233:1: note: in expansion of macro 
'module_platform_driver'
 module_platform_driver(ath_ahb_driver);
 ^~

Signed-off-by: Russell King 
Acked-by: Kalle Valo 
---
 drivers/net/wireless/ath/ath5k/ahb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath5k/ahb.c 
b/drivers/net/wireless/ath/ath5k/ahb.c
index 2ca88b593e4c..c0794f5988b3 100644
--- a/drivers/net/wireless/ath/ath5k/ahb.c
+++ b/drivers/net/wireless/ath/ath5k/ahb.c
@@ -16,10 +16,10 @@
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
 
+#include 
 #include 
 #include 
 #include 
-#include 
 #include 
 #include "ath5k.h"
 #include "debug.h"
-- 
2.9.3



[PATCH net-next v2 08/12] iscsi: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/target/iscsi/iscsi_target_login.c:1135:7: error: implicit declaration 
of function 'try_module_get' [-Werror=implicit-function-declaration]

Add linux/module.h to iscsi_target_login.c.

Signed-off-by: Russell King 
Reviewed-by: Bart Van Assche 
---
 drivers/target/iscsi/iscsi_target_login.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/target/iscsi/iscsi_target_login.c 
b/drivers/target/iscsi/iscsi_target_login.c
index 450f51deb2a2..eab274d17b5c 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -17,6 +17,7 @@
  
**/
 
 #include 
+#include <linux/module.h>
 #include 
 #include 
 #include 
-- 
2.9.3



[PATCH net-next v2 10/12] net: liquidio: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: error: expected 
declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: error: type defaults to 
'int' in declaration of 'MODULE_AUTHOR'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:30: error: function 
declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: error: expected 
declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: error: type defaults to 
'int' in declaration of 'MODULE_DESCRIPTION'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:31: error: function 
declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: error: expected 
declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: error: type defaults to 
'int' in declaration of 'MODULE_LICENSE'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:32: error: function 
declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: error: expected 
declaration specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: error: type defaults to 
'int' in declaration of 'MODULE_VERSION'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:33: error: function 
declaration isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:36: error: expected ')' 
before 'int'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:37: error: expected ')' 
before string constant
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:325: warning: data 
definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:325: error: type defaults to 
'int' in declaration of 'MODULE_DEVICE_TABLE'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:325: warning: parameter 
names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3250: warning: data 
definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3250: error: type defaults 
to 'int' in declaration of 'module_init'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3250: warning: parameter 
names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3251: warning: data 
definition has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3251: error: type defaults 
to 'int' in declaration of 'module_exit'
drivers/net/ethernet/cavium/liquidio/lio_vf_main.c:3251: warning: parameter 
names (without types) in function declaration
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: error: expected declaration 
specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: error: type defaults to 
'int' in declaration of 'MODULE_AUTHOR'
drivers/net/ethernet/cavium/liquidio/lio_main.c:36: error: function declaration 
isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: error: expected declaration 
specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: error: type defaults to 
'int' in declaration of 'MODULE_DESCRIPTION'
drivers/net/ethernet/cavium/liquidio/lio_main.c:37: error: function declaration 
isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: error: expected declaration 
specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: error: type defaults to 
'int' in declaration of 'MODULE_LICENSE'
drivers/net/ethernet/cavium/liquidio/lio_main.c:38: error: function declaration 
isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: error: expected declaration 
specifiers or '...' before string constant
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: warning: data definition 
has no type or storage class
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: error: type defaults to 
'int' in declaration of 'MODULE_VERSION'
drivers/net/ethernet/cavium/liquidio/lio_main.c:39: error: function declaration 
isn't a prototype
drivers/net/ethernet/cavium/liquidio/lio_main.c:40: 

[PATCH net-next v2 07/12] net: mvneta: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/net/ethernet/marvell/mvneta.c:2694:26: error: storage size of 'status' 
isn't known
drivers/net/ethernet/marvell/mvneta.c:2695:26: error: storage size of 'changed' 
isn't known
drivers/net/ethernet/marvell/mvneta.c:2695:9: error: variable 'changed' has 
initializer but incomplete type
drivers/net/ethernet/marvell/mvneta.c:2709:2: error: implicit declaration of 
function 'fixed_phy_update_state' [-Werror=implicit-function-declaration]

Add linux/phy_fixed.h to mvneta.c

Signed-off-by: Russell King 
Acked-by: Thomas Petazzoni 
---
 drivers/net/ethernet/marvell/mvneta.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c 
b/drivers/net/ethernet/marvell/mvneta.c
index 0f4d1697be46..fdf71720e707 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include <linux/phy_fixed.h>
 #include 
 #include 
 #include 
-- 
2.9.3



[PATCH net-next v2 06/12] net: fman: fix build errors when linux/phy*.h is removed from net/dsa.h

2017-02-07 Thread Florian Fainelli
From: Russell King 

drivers/net/ethernet/freescale/fman/fman_memac.c:519:21: error: dereferencing 
pointer to incomplete type 'struct fixed_phy_status'

Add linux/phy_fixed.h to fman_memac.c

Signed-off-by: Russell King 
---
 drivers/net/ethernet/freescale/fman/fman_memac.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/freescale/fman/fman_memac.c 
b/drivers/net/ethernet/freescale/fman/fman_memac.c
index 71a5ded9d1de..cd6a53eaf161 100644
--- a/drivers/net/ethernet/freescale/fman/fman_memac.c
+++ b/drivers/net/ethernet/freescale/fman/fman_memac.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include <linux/phy_fixed.h>
 #include 
 
 /* PCS registers */
-- 
2.9.3



[PATCH net-next v2 12/12] net: dsa: remove unnecessary phy*.h includes

2017-02-07 Thread Florian Fainelli
From: Russell King 

Including phy.h and phy_fixed.h into net/dsa.h causes phy*.h to be an
unnecessary dependency for quite a large amount of the kernel.  There's
very little which actually requires definitions from phy.h in net/dsa.h
- the include itself only wants the declaration of a couple of
structures and IFNAMSIZ.

Add linux/if.h for IFNAMSIZ, declarations for the structures, phy.h to
mv88e6xxx.h as it needs it for phy_interface_t, and remove both phy.h
and phy_fixed.h from net/dsa.h.

This patch reduces the number of files rebuilt from around 800 to around
40 - even with ccache, the time difference is noticeable.

Tested-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
Signed-off-by: Russell King 
---
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 1 +
 include/net/dsa.h | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 8a21800374f3..91c4dd25c2d3 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include <linux/phy.h>
 
 #ifndef UINT64_MAX
 #define UINT64_MAX (u64)(~((u64)0))
diff --git a/include/net/dsa.h b/include/net/dsa.h
index b49b2004891e..4e13e695f025 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -11,17 +11,18 @@
 #ifndef __LINUX_NET_DSA_H
 #define __LINUX_NET_DSA_H
 
+#include <linux/if.h>
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
-#include <linux/phy.h>
-#include <linux/phy_fixed.h>
 #include 
 
 struct tc_action;
+struct phy_device;
+struct fixed_phy_status;
 
 enum dsa_tag_protocol {
DSA_TAG_PROTO_NONE = 0,
-- 
2.9.3
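
The forward declarations added to net/dsa.h are enough because C only needs
a complete type when a structure's size or members are used. A minimal
userspace sketch of that rule follows; the names struct phy_dev,
struct switch_port, port_has_phy() and port_link_up() are hypothetical and
are not taken from the kernel.

```c
#include <stddef.h>

/* Forward declaration only: the layout of struct phy_dev is unknown here,
 * much as net/dsa.h no longer sees the layout of struct phy_device. */
struct phy_dev;

/* Hypothetical consumer: storing and passing a pointer needs no full type. */
struct switch_port {
	struct phy_dev *phy;	/* pointer to an incomplete type is allowed */
	int index;
};

static int port_has_phy(const struct switch_port *p)
{
	return p->phy != NULL;	/* comparing the pointer is fine */
}

/* Code that looks inside the structure needs the complete definition,
 * so it has to pull in the defining header itself. */
struct phy_dev {
	int link;
};

static int port_link_up(const struct switch_port *p)
{
	return p->phy ? p->phy->link : 0;
}
```

Everything above the full definition compiles with only the forward
declaration, which is why a header that merely passes pointers around can
drop the heavier include.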



[iproute PATCH 2/2] testsuite: Search kernel config in modules dir also

2017-02-07 Thread Phil Sutter
At least in Fedora there is no /proc/config.gz but instead
/lib/modules/`uname -r`/config, so use that as a fallback.

Signed-off-by: Phil Sutter 
---
 testsuite/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/testsuite/Makefile b/testsuite/Makefile
index 17568881b8b66..41a3afc92ae98 100644
--- a/testsuite/Makefile
+++ b/testsuite/Makefile
@@ -15,6 +15,12 @@ IPVERS := $(filter-out iproute2/Makefile,$(wildcard 
iproute2/*))
 
 ifneq (,$(wildcard /proc/config.gz))
KENV := $(shell cat /proc/config.gz | gunzip | grep ^CONFIG)
+else
+KVER := $(shell uname -r)
+KCPATH := /lib/modules/${KVER}/config
+ifneq (,$(wildcard ${KCPATH}))
+   KENV := $(shell cat ${KCPATH} | grep ^CONFIG)
+endif
 endif
 
 .PHONY: compile listtests alltests configure $(TESTS)
-- 
2.11.0



[iproute PATCH 1/2] testsuite: skip link show test on big endian systems

2017-02-07 Thread Phil Sutter
Netlink protocol is in host byte order, so the provided binary netlink
message buffer being in little endian format will cause the test to fail
on big endian systems.

Signed-off-by: Phil Sutter 
---
 .gitignore| 1 +
 testsuite/Makefile| 1 +
 testsuite/tests/ip/link/show_dev_wo_vf_rate.t | 5 +
 testsuite/tools/Makefile  | 2 ++
 testsuite/tools/is_big_endian.c   | 7 +++
 5 files changed, 16 insertions(+)
 create mode 100644 testsuite/tools/Makefile
 create mode 100644 testsuite/tools/is_big_endian.c

diff --git a/.gitignore b/.gitignore
index 74a5496ddf7aa..a1f295dac3dd2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -36,6 +36,7 @@ series
 # tests
 testsuite/results
 testsuite/iproute2/iproute2-this
+testsuite/tools/is_big_endian
 
 # doc files generated at runtime
 doc/*.aux
diff --git a/testsuite/Makefile b/testsuite/Makefile
index 2027650051d48..17568881b8b66 100644
--- a/testsuite/Makefile
+++ b/testsuite/Makefile
@@ -24,6 +24,7 @@ configure:
 
 compile: configure
echo "Entering iproute2" && cd iproute2 && $(MAKE) && cd ..;
+   echo "Entering tools" && cd tools && $(MAKE) && cd ..;
 
 listtests:
@for t in $(TESTS); do \
diff --git a/testsuite/tests/ip/link/show_dev_wo_vf_rate.t 
b/testsuite/tests/ip/link/show_dev_wo_vf_rate.t
index a600ba65c5bec..ad90af5400271 100755
--- a/testsuite/tests/ip/link/show_dev_wo_vf_rate.t
+++ b/testsuite/tests/ip/link/show_dev_wo_vf_rate.t
@@ -2,5 +2,10 @@
 
 source lib/generic.sh
 
+if ./tools/is_big_endian; then
+   ts_log "won't work on big endian system"
+   ts_skip
+fi
+
 NL_FILE="tests/ip/link/dev_wo_vf_rate.nl"
 ts_ip "$0" "Show VF devices w/o VF rate info" -d monitor file $NL_FILE
diff --git a/testsuite/tools/Makefile b/testsuite/tools/Makefile
new file mode 100644
index 0..ecbea16c2c1cc
--- /dev/null
+++ b/testsuite/tools/Makefile
@@ -0,0 +1,2 @@
+is_big_endian: is_big_endian.c
+   $(CC) -o $@ $<
diff --git a/testsuite/tools/is_big_endian.c b/testsuite/tools/is_big_endian.c
new file mode 100644
index 0..303e91b4603e8
--- /dev/null
+++ b/testsuite/tools/is_big_endian.c
@@ -0,0 +1,7 @@
+//#include 
+#include 
+
+int main(void)
+{
+   return 1 != ntohs(1);
+}
-- 
2.11.0
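
The endianness probe in this patch relies on ntohs() being a no-op on big
endian hosts. The sketch below shows the same check as a function, next to
a byte-level cross-check. It is only an illustration: the include lines in
the archived patch are truncated, so <arpa/inet.h> is an assumption (POSIX
declares ntohs() there), and the function names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Same idea as the patch: network order equals host order on big endian. */
static int is_big_endian_ntohs(void)
{
	return ntohs(1) == 1;
}

/* Cross-check by looking at the first byte of a 16-bit value in memory. */
static int is_big_endian_bytes(void)
{
	uint16_t v = 0x0102;
	unsigned char first;

	memcpy(&first, &v, 1);
	return first == 0x01;	/* most significant byte stored first */
}
```

Note that the tool in the patch returns 1 != ntohs(1) from main(), so its
exit status is 0 (shell "true") on big endian, which is what the
"if ./tools/is_big_endian" test in the .t file expects.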



[iproute PATCH 0/2] Two minor testsuite fixes

2017-02-07 Thread Phil Sutter
While playing around with testsuite, I noticed two minor nits which this
series attempts to fix.

Phil Sutter (2):
  testsuite: skip link show test on big endian systems
  testsuite: Search kernel config in modules dir also

 .gitignore| 1 +
 testsuite/Makefile| 7 +++
 testsuite/tests/ip/link/show_dev_wo_vf_rate.t | 5 +
 testsuite/tools/Makefile  | 2 ++
 testsuite/tools/is_big_endian.c   | 7 +++
 5 files changed, 22 insertions(+)
 create mode 100644 testsuite/tools/Makefile
 create mode 100644 testsuite/tools/is_big_endian.c

-- 
2.11.0



Re: [net-next PATCH 0/4] Documenting eBPF - extended Berkeley Packet Filter

2017-02-07 Thread Daniel Borkmann

On 02/07/2017 10:19 PM, Jesper Dangaard Brouer wrote:

On Tue, 07 Feb 2017 17:43:38 +0100
Daniel Borkmann  wrote:


Hi Jesper,

On 02/07/2017 03:30 PM, Jesper Dangaard Brouer wrote:

Question: What kernel tree should this go into???

If going through Jonathan Corbet, will it appear sooner here???
   https://www.kernel.org/doc/html/latest/
If it will not appear sooner that way, then it's likely best to keep
it in sync with the tree that takes eBPF code changes.


For initial parts, I don't have a preference (Jonathan has though,
so seems fine via docs tree then). If at some /later/ point in time
features come in along with doc updates (similar to test case updates),
probably best to route them via net-next.


This marks the beginning of user-facing developer documentation for
using eBPF (extended Berkeley Packet Filter) as part of the kernel
Documentation/ tree.

This documentation is also available here[1], as an intermediate quick
way of prototyping and releasing the documentation.  The authoritative
and official version of the documentation is what gets included in the
kernel tree.  The docs at [2] will get updated based on what gets
accepted after the standard peer-review kernel process.


Thanks for your effort of writing a doc. Some high-level comments on
the set from my PoV first.

I think it's definitely the right direction to move everything BPF
related to Documentation/BPF/. Right now, there are a lot of different
places with different kind of documentation, f.e.:


Agree that we need some in-kernel place to centralize bpf related
documentation, as it is too scattered at the moment.


* Documentation/networking/filter.txt
Covers some cBPF/eBPF internals, tooling, etc; mostly technical,
historically the central spot for BPF documentation. "filter" in
filter.txt is long obsolete name, but looks like various sites,
talks, blogs, etc still link to it. (At best, we should keep the
file saying that the doc moved to Documentation/BPF/.)


Agree.


* bpf(2) man page
Has a good start, but right now is heavily behind the current user
facing kernel code.


Yes, the man-page has proven to get out-of-sync.  This is one of the
reasons I prefer this in-kernel-tree documentation, as documentation
can follow the patchset submission, instead of being something
developers need to submit _after_ patches are accepted.



* include/uapi/linux/bpf.h
Mostly relevant for helper function API description.

* netdev conference slides/proceedings
Also contain mostly technical details on eBPF.

* https://github.com/iovisor/bpf-docs
Non-exhaustive collection of various talks from different confs.

* https://qmonnet.github.io/whirl-offload/2016/09/01/dive-into-bpf/
Even bigger and more complete list of documentation material.

* Various lwn articles ;), blog posts (f.e. from Brendan), etc.

Now, the challenge is to bring the relevant parts together and logically
separated into Documentation/BPF/ and bpf(2) man page. I think everything
user API relevant would help most if it updates bpf(2) man page. That can
be explanation of different map types, interaction with maps, quirks, etc.


Sorry, but I disagree.  The man-page bpf(2) should only describe the
bpf syscall.  Details on map types should be documented in this
documentation.  Why? Because this allows us to enforce that documentation
of a new map type is included together with the code submission (else
it will never get documented).


But an essential part of the syscall is to create new maps, interact with
them, etc so it's definitely relevant for the man page. The man page
has a couple of FIXMEs in its source that Michael Kerrisk added in the
course of reviewing and editing patches that were submitted; to give
one example related to this:

  [...]
  Currently, the following values are supported for
  .IR map_type :

  .in +4n
  .nf
  enum bpf_map_type {
BPF_MAP_TYPE_UNSPEC,  /* Reserve 0 as invalid map type */
BPF_MAP_TYPE_HASH,
BPF_MAP_TYPE_ARRAY,
BPF_MAP_TYPE_PROG_ARRAY,
  };
  .fi
  .in

  .I map_type
selects one of the available map implementations in the kernel.
  .\" FIXME We need an explanation of why one might choose each of
  .\" these map implementations
  [...]

Thus, I think it makes sense to address these points there instead of
duplicating as a programmer guide in the kernel tree. Maps are not
added as frequently as helper calls, and neither require writing long
novels to document, so we could require submitters to send a patch there.
When it comes to deeper kernel internals that would not be appropriate
in man pages project, but need to be documented nevertheless (perhaps
a guide on what is necessary to implement new map types), they could
then go to the kernel documentation.

What I'm trying to say is that it makes sense to logically defragment
these bits that are relevant for man-page (since user/prog developer
specific) and for kernel doc tree (rather kernel developer specific),
so 

Re: Extending socket timestamping API for NTP

2017-02-07 Thread Willem de Bruijn
>> 2) new SO_TIMESTAMPING option to receive from the error queue only
>>user data as was passed to sendmsg() instead of Ethernet frames
>>
>>Parsing Ethernet and IP headers (especially IPv6 options) is not
>>fun and SOF_TIMESTAMPING_OPT_ID is not always practical, e.g. in
>>applications which process messages from the error queue
>>asynchronously and don't bind/connect their sockets.
>
> This would be useful for application writing.

What kind of user data are you suggesting? Just a user-defined ID
passed as a cmsg? Allowing such metadata to override
skb_shinfo(skb)->tskey sounds fine.

>> 3) target address in msg_name of messages from the error queue
>>
>>With 2) and unconnected sockets, there needs to be a way to get the
>>address to which the packet was sent. Is it ok to always fill
>>msg_name, or does it need to be a new option?
>
>
> I'm not sure.

This would be an argument to just loop the original packet.

>> 4) allow sockets to use both SW and HW TX timestamping at the same time
>>
>>When using a socket which is not bound to a specific interface, it
>>would be nice to get transmit SW timestamps when HW timestamps are
>>missing. I suspect it's difficult to predict if a HW timestamp will
>>be available. Maybe it would be acceptable to get from the error
>>queue two messages per transmission if the interface supports both
>>SW and HW timestamping?
>
>
> This seems useful,

Agreed, as long as it is optional so that it does not change the
behavior for existing applications.

> but not sure how best to implement it.

It might be sufficient to just remove the second line in sw_tx_timestamp

static inline void sw_tx_timestamp(struct sk_buff *skb)
{
if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
!(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
skb_tstamp_tx(skb, NULL);
}
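
For reference, this is roughly what an application asking for both kinds
of transmit timestamp looks like from user space. It is a sketch only: the
function name enable_tx_timestamps() is hypothetical, and setting both TX
flags does not by itself guarantee two messages on the error queue, since
that depends on the sw_tx_timestamp() check quoted above.

```c
#include <linux/net_tstamp.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Request software and hardware transmit timestamps on one socket. */
static int enable_tx_timestamps(int fd)
{
	int flags = SOF_TIMESTAMPING_TX_SOFTWARE |	/* generate SW TX stamps */
		    SOF_TIMESTAMPING_TX_HARDWARE |	/* generate HW TX stamps */
		    SOF_TIMESTAMPING_SOFTWARE |		/* report SW stamps */
		    SOF_TIMESTAMPING_RAW_HARDWARE |	/* report HW stamps */
		    SOF_TIMESTAMPING_OPT_ID;		/* tag each send with an ID */

	/* Returns 0 on success, -1 with errno set on failure. */
	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}
```

A caller would create a socket with socket(AF_INET, SOCK_DGRAM, 0), pass
the descriptor to this function, and then read timestamps back with
recvmsg() and MSG_ERRQUEUE.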


Re: netvsc merge conflicts...

2017-02-07 Thread David Miller
From: Stephen Hemminger 
Date: Tue, 7 Feb 2017 14:17:40 -0800

> On Tue, 07 Feb 2017 16:41:41 -0500 (EST)
> David Miller  wrote:
> 
>> Stephen, I just did a merge of net into net-next and had to
>> resolve a merge conflict in the netvsc driver.
>> 
>> The problem was that in 'net' the hyperv bug fix that added
>> the calls to "init_cache_read_index()" in netvsc_channel_cb()
>> collided with your RX path cleanups.
>> 
>> Please double check my work and send me any needed fixups.
>> 
>> Thanks!
> 
> Sorry the init_cache_read_index came through greg's driver tree.
> Your merge cleanup matches what I was doing manually.

Great, thanks for checking it out for me.


Re: [PATCH net-next 6/7] openvswitch: Add force commit.

2017-02-07 Thread Joe Stringer
On 2 February 2017 at 17:10, Jarno Rajahalme  wrote:
> Stateful network admission policy may allow connections to one
> direction and reject connections initiated in the other direction.
> After policy change it is possible that for a new connection an
> overlapping conntrack entry already exists, where the connection
> original direction is opposed to the new connection's initial packet.
>
> Most importantly, conntrack state relating to the current packet gets
> the "reply" designation based on whether the original direction tuple
> or the reply direction tuple matched.  If this "directionality" is
> wrong w.r.t. the stateful network admission policy it may happen
> that packets in neither direction are correctly admitted.
>
> This patch adds a new "force commit" option to the OVS conntrack
> action that checks the original direction of an existing conntrack
> entry.  If that direction is opposed to the current packet, the
> existing conntrack entry is deleted and a new one is subsequently
> created in the correct direction.
>
> Signed-off-by: Jarno Rajahalme 
> ---
>  include/uapi/linux/openvswitch.h | 10 ++
>  net/openvswitch/conntrack.c  | 27 +--
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/openvswitch.h 
> b/include/uapi/linux/openvswitch.h
> index 90af8b8..d5ba9a9 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -674,6 +674,10 @@ struct ovs_action_hash {
>   * @OVS_CT_ATTR_HELPER: variable length string defining conntrack ALG.
>   * @OVS_CT_ATTR_NAT: Nested OVS_NAT_ATTR_* for performing L3 network address
>   * translation (NAT) on the packet.
> + * @OVS_CT_ATTR_FORCE_COMMIT: Like %OVS_CT_ATTR_COMMIT, but instead of doing
> + * nothing if the connection is already committed will check that the current
> + * packet is in conntrack entry's original direction.  If directionality does
> + * not match, will delete the existing conntrack entry and commit a new one.
>   */
>  enum ovs_ct_attr {
> OVS_CT_ATTR_UNSPEC,
> @@ -684,6 +688,12 @@ enum ovs_ct_attr {
> OVS_CT_ATTR_HELPER, /* netlink helper to assist detection of
>related connections. */
> OVS_CT_ATTR_NAT,/* Nested OVS_NAT_ATTR_* */
> +   OVS_CT_ATTR_FORCE_COMMIT,  /* No argument, commits connection.  If the
> +   * conntrack entry original direction tuple
> +   * does not match the current packet header
> +   * values, will delete the current 
> conntrack
> +   * entry and create a new one.
> +   */

We only need one copy of the explanation, keep it above the enum, then
the inline comment can be /* No argument */.

> __OVS_CT_ATTR_MAX
>  };
>
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 1afe153..1f27f44 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -65,6 +65,7 @@ struct ovs_conntrack_info {
> struct nf_conn *ct;
> u8 commit : 1;
> u8 nat : 3; /* enum ovs_ct_nat */
> +   u8 force : 1;
> u16 family;
> struct md_mark mark;
> struct md_labels labels;
> @@ -631,10 +632,13 @@ static bool skb_nfct_cached(struct net *net,
>  */
> if (!ct && key->ct.state & OVS_CS_F_TRACKED &&
> !(key->ct.state & OVS_CS_F_INVALID) &&
> -   key->ct.zone == info->zone.id)
> +   key->ct.zone == info->zone.id) {
> ct = ovs_ct_find_existing(net, &info->zone, info->family, skb,
>   !!(key->ct.state
>  & OVS_CS_F_NAT_MASK));
> +   if (ct)
> +   nf_ct_get(skb, &ctinfo);
> +   }

If ctinfo is only used with the new call below, we can unconditionally
fetch this just before it's used...

> if (!ct)
> return false;
> if (!net_eq(net, read_pnet(&ct->ct_net)))
> @@ -648,6 +652,19 @@ static bool skb_nfct_cached(struct net *net,
> if (help && rcu_access_pointer(help->helper) != info->helper)
> return false;
> }
> +   /* Force conntrack entry direction to the current packet? */

Here.

> +   if (info->force && CTINFO2DIR(ctinfo) != IP_CT_DIR_ORIGINAL) {
> +   /* Delete the conntrack entry if confirmed, else just release
> +* the reference.
> +*/
> +   if (nf_ct_is_confirmed(ct))
> +   nf_ct_delete(ct, 0, 0);
> +   else
> +   nf_ct_put(ct);

We've already ensured that ct is non-NULL, we can use
nf_conntrack_put() instead.

> +   skb->nfct = NULL;
> +   skb->nfctinfo = 0;
> +   

Re: [PATCH iproute2] ip route: Make name of protocol 0 consistent

2017-02-07 Thread Stephen Hemminger
On Tue, 7 Feb 2017 14:51:45 -0700
David Ahern  wrote:

> On 2/7/17 2:40 PM, Stephen Hemminger wrote:
> >> Reading the file changes the string in rtnl_rtprot_tab for
> >> RTPROT_UNSPEC. Both string values -- "none" and "unspec" come from
> >> iproute2, so my point is that string is inconsistent within iproute2.  
> > 
> > Why not change the value in the table rtnl_rtprot_tab to be unspec this 
> > would
> > make the command consistent with the value in the header file.
> >   
> 
> I flipped a coin; it landed on config file.
> 
> "none" is the value that has shown up for 13+ years unless a custom
> protocol value is used triggering the 'unspec'. Seems to me a custom
> protocol value is a rare event suggesting conformity to "none" over
> "unspsec". I really don't care what the string is, but it should be
> consistent. If you want 'unspec' I'll change rtnl_rtprot_tab

Agree it was a coin toss, there were two values in iproute2, but the
kernel header file enum value should supersede.


[PATCH v2 net-next 7/9] sunvnet: remove extra rcu_read_unlocks

2017-02-07 Thread Shannon Nelson
The RCU read lock is grabbed first thing in sunvnet_start_xmit_common()
so it always needs to be released.  This removes the conditional release
in the dropped packet error path and removes a couple of superfluous
calls in the middle of the code.

Reported-by: Bijan Mottahedeh 
Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/sunvnet_common.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet_common.c 
b/drivers/net/ethernet/sun/sunvnet_common.c
index d2aed2c..9384db0 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -1279,10 +1279,8 @@ int sunvnet_start_xmit_common(struct sk_buff *skb, 
struct net_device *dev,
 
rcu_read_lock();
port = vnet_tx_port(skb, dev);
-   if (unlikely(!port)) {
-   rcu_read_unlock();
+   if (unlikely(!port))
goto out_dropped;
-   }
 
if (skb_is_gso(skb) && skb->len > port->tsolen) {
err = vnet_handle_offloads(port, skb, vnet_tx_port);
@@ -1307,7 +1305,6 @@ int sunvnet_start_xmit_common(struct sk_buff *skb, struct 
net_device *dev,
fl4.saddr = ip_hdr(skb)->saddr;
 
 rt = ip_route_output_key(dev_net(dev), &fl4);
-   rcu_read_unlock();
if (!IS_ERR(rt)) {
 skb_dst_set(skb, &rt->dst);
icmp_send(skb, ICMP_DEST_UNREACH,
@@ -1467,8 +1464,7 @@ int sunvnet_start_xmit_common(struct sk_buff *skb, struct 
net_device *dev,
jiffies + VNET_CLEAN_TIMEOUT);
else if (port)
 del_timer(&port->clean_timer);
-   if (port)
-   rcu_read_unlock();
+   rcu_read_unlock();
if (skb)
dev_kfree_skb(skb);
vnet_free_skbs(freeskbs);
-- 
1.7.1
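
The rule this patch enforces, one unlock for every lock regardless of the
path taken, can be illustrated outside the kernel. The sketch below is a
simplified userspace analogy, not the driver code: fake_read_lock(),
fake_read_unlock() and xmit_sketch() are hypothetical, and a plain counter
stands in for the RCU read-side nesting depth.

```c
#include <stddef.h>

static int read_depth;	/* stand-in for the read-side nesting count */

static void fake_read_lock(void)   { read_depth++; }
static void fake_read_unlock(void) { read_depth--; }

/* Hypothetical transmit path: returns 0 on success, -1 when dropped. */
static int xmit_sketch(const int *port, int too_big)
{
	int ret = 0;

	fake_read_lock();
	if (port == NULL)
		goto out_dropped;	/* no unlock here: the exit path does it */
	if (too_big)
		goto out_dropped;
	goto out;

out_dropped:
	ret = -1;
out:
	fake_read_unlock();	/* exactly one unlock on every path */
	return ret;
}
```

Because the lock is taken before the first check, the drop path can
release it unconditionally; an extra unlock in the middle of the function
would leave the counter at -1 once the exit path ran.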



Re: [RFC v3 02/11] IB/hfi-vnic: Virtual Network Interface Controller (VNIC) interface

2017-02-07 Thread Jason Gunthorpe
On Tue, Feb 07, 2017 at 02:06:30PM -0800, Vishwanathapura, Niranjana wrote:

> >>IB_DEVICE_RAW_SCATTER_FCS   = (1ULL << 34),
> >>+   IB_DEVICE_RDMA_NETDEV_HFI_VNIC  = (1ULL << 35),
> >
> >What is this called HFI_VNIC anyhow? Shouldn't this be OPA_VNIC? There
> >is nothing really HFI specific, right?
> 
> Agreed, OPA_VNIC is more appropriate here. Will change it.

And probably lots of other places too.. :)


> >And this should be rn->dev_priv ?
> 
> Yah, both will result in same behavior. But yah, what you are suggesting
> will remove any confusion. Will change in next PATCH series.

Only because the struct has no members, as soon as someone adds
something it would go booom.

Jason


[PATCH v2 net-next 2/9] sunvnet: remove unused variable in maybe_tx_wakeup

2017-02-07 Thread Shannon Nelson
From: Sowmini Varadhan 

The vio_dring_state *dr variable is unused in maybe_tx_wakeup().
As the comments indicate, we call maybe_tx_wakeup() whenever we
get a STOPPED LDC message on the port. If the queue is stopped,
we want to wake it up so that we will send another START message
at the next TX and trigger the consumer to drain the dring.

Signed-off-by: Sowmini Varadhan 
Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/sunvnet_common.c |6 +-
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet_common.c 
b/drivers/net/ethernet/sun/sunvnet_common.c
index e03cf13..add22d4 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -743,12 +743,8 @@ static void maybe_tx_wakeup(struct vnet_port *port)
txq = netdev_get_tx_queue(VNET_PORT_TO_NET_DEVICE(port),
  port->q_index);
__netif_tx_lock(txq, smp_processor_id());
-   if (likely(netif_tx_queue_stopped(txq))) {
-   struct vio_dring_state *dr;
-
-   dr = &port->vio.drings[VIO_DRIVER_TX_RING];
+   if (likely(netif_tx_queue_stopped(txq)))
netif_tx_wake_queue(txq);
-   }
__netif_tx_unlock(txq);
 }
 
-- 
1.7.1



[PATCH v2 net-next 6/9] sunvnet: straighten up message event handling logic

2017-02-07 Thread Shannon Nelson
The use of gotos for handling the incoming events made this code
harder to read and support than it should be.  This patch straightens
out and clears up the logic.

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/sunvnet_common.c |   94 ++---
 1 files changed, 45 insertions(+), 49 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index 6cb625a..d2aed2c 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -764,41 +764,37 @@ static int vnet_event_napi(struct vnet_port *port, int budget)
struct vio_driver_state *vio = &port->vio;
int tx_wakeup, err;
int npkts = 0;
-   int event = (port->rx_event & LDC_EVENT_RESET);
-
-ldc_ctrl:
-   if (unlikely(event == LDC_EVENT_RESET ||
-event == LDC_EVENT_UP)) {
-   vio_link_state_change(vio, event);
-
-   if (event == LDC_EVENT_RESET) {
-   vnet_port_reset(port);
-   vio_port_up(vio);
-
-   /* If the device is running but its tx queue was
-* stopped (due to flow control), restart it.
-* This is necessary since vnet_port_reset()
-* clears the tx drings and thus we may never get
-* back a VIO_TYPE_DATA ACK packet - which is
-* the normal mechanism to restart the tx queue.
-*/
-   if (netif_running(dev))
-   maybe_tx_wakeup(port);
-   }
+
+   /* we don't expect any other bits */
+   BUG_ON(port->rx_event & ~(LDC_EVENT_DATA_READY |
+ LDC_EVENT_RESET |
+ LDC_EVENT_UP));
+
+   /* RESET takes precedence over any other event */
+   if (port->rx_event & LDC_EVENT_RESET) {
+   vio_link_state_change(vio, LDC_EVENT_RESET);
+   vnet_port_reset(port);
+   vio_port_up(vio);
+
+   /* If the device is running but its tx queue was
+* stopped (due to flow control), restart it.
+* This is necessary since vnet_port_reset()
+* clears the tx drings and thus we may never get
+* back a VIO_TYPE_DATA ACK packet - which is
+* the normal mechanism to restart the tx queue.
+*/
+   if (netif_running(dev))
+   maybe_tx_wakeup(port);
+
port->rx_event = 0;
return 0;
}
-   /* We may have multiple LDC events in rx_event. Unroll send_events() */
-   event = (port->rx_event & LDC_EVENT_UP);
-   port->rx_event &= ~(LDC_EVENT_RESET | LDC_EVENT_UP);
-   if (event == LDC_EVENT_UP)
-   goto ldc_ctrl;
-   event = port->rx_event;
-   if (!(event & LDC_EVENT_DATA_READY))
-   return 0;
 
-   /* we dont expect any other bits than RESET, UP, DATA_READY */
-   BUG_ON(event != LDC_EVENT_DATA_READY);
+   if (port->rx_event & LDC_EVENT_UP) {
+   vio_link_state_change(vio, LDC_EVENT_UP);
+   port->rx_event = 0;
+   return 0;
+   }
 
err = 0;
tx_wakeup = 0;
@@ -821,25 +817,25 @@ static int vnet_event_napi(struct vnet_port *port, int budget)
pkt->start_idx = vio_dring_next(dr,
port->napi_stop_idx);
pkt->end_idx = -1;
-   goto napi_resume;
-   }
-   err = ldc_read(vio->lp, &msgbuf, sizeof(msgbuf));
-   if (unlikely(err < 0)) {
-   if (err == -ECONNRESET)
-   vio_conn_reset(vio);
-   break;
+   } else {
+   err = ldc_read(vio->lp, &msgbuf, sizeof(msgbuf));
+   if (unlikely(err < 0)) {
+   if (err == -ECONNRESET)
+   vio_conn_reset(vio);
+   break;
+   }
+   if (err == 0)
+   break;
+   viodbg(DATA, "TAG [%02x:%02x:%04x:%08x]\n",
+  msgbuf.tag.type,
+  msgbuf.tag.stype,
+  msgbuf.tag.stype_env,
+  msgbuf.tag.sid);
+   err = vio_validate_sid(vio, &msgbuf.tag);
+   if (err < 0)
+   break;
}
-   if (err == 0)
-   break;
-   viodbg(DATA, "TAG [%02x:%02x:%04x:%08x]\n",
-  msgbuf.tag.type,
-  msgbuf.tag.stype,
-
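
The event precedence this patch sets up in vnet_event_napi() can be sketched
as a stand-alone function: unexpected bits are rejected first, RESET wins over
UP, and the data path is reached only when neither is set. The bit values, the
enum and the name classify_rx_event() are assumptions made for illustration;
the driver uses BUG_ON() and acts on the event in place instead of returning a
code. Whether the patched function still returns early when DATA_READY is
clear is not visible in the truncated diff, so the RX_IDLE case is this
sketch's own choice.

```c
#include <stdint.h>

/* Assumed bit values for illustration; the real ones come from the LDC headers. */
#define EV_RESET      0x01u
#define EV_UP         0x02u
#define EV_DATA_READY 0x04u

enum rx_action {
	RX_INVALID,	/* unexpected bits set (the patch uses BUG_ON() here) */
	RX_DO_RESET,	/* reset the port, bring it back up, clear rx_event */
	RX_DO_LINK_UP,	/* report link up, clear rx_event */
	RX_DO_DATA,	/* continue into the packet-processing loop */
	RX_IDLE		/* nothing pending (sketch-only case) */
};

/* Decide what to do with a pending event mask, highest priority first. */
static enum rx_action classify_rx_event(uint32_t rx_event)
{
	if (rx_event & ~(EV_DATA_READY | EV_RESET | EV_UP))
		return RX_INVALID;
	if (rx_event & EV_RESET)
		return RX_DO_RESET;
	if (rx_event & EV_UP)
		return RX_DO_LINK_UP;
	if (rx_event & EV_DATA_READY)
		return RX_DO_DATA;
	return RX_IDLE;
}
```

With this ordering a mask of RESET | UP | DATA_READY is handled as a reset
only, which matches the "RESET takes precedence" comment in the patch.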

[PATCH v2 net-next 1/9] sunvnet: make sunvnet common code dynamically loadable

2017-02-07 Thread Shannon Nelson
When the sunvnet_common code was split out for use by both sunvnet
and the newer ldmvsw, it was made into a static kernel library, which
limits the usefulness of sunvnet and ldmvsw as loadables, since most
of the real work is being done in the shared code.  Also, this is
simply dead code in kernels that aren't running the LDoms.

This patch makes the sunvnet_common into a dynamically loadable
module and makes sunvnet and ldmvsw dependent on sunvnet_common.

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/Kconfig  |8 ++--
 drivers/net/ethernet/sun/sunvnet_common.c |   29 +
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sun/Kconfig b/drivers/net/ethernet/sun/Kconfig
index a4b40e3..a7d91da 100644
--- a/drivers/net/ethernet/sun/Kconfig
+++ b/drivers/net/ethernet/sun/Kconfig
@@ -70,19 +70,23 @@ config CASSINI
  
.
 
 config SUNVNET_COMMON
-   bool
+   tristate "Common routines to support Sun Virtual Networking"
depends on SUN_LDOMS
-   default y if SUN_LDOMS
+   default m if SUN_LDOMS
 
 config SUNVNET
tristate "Sun Virtual Network support"
+   default m
depends on SUN_LDOMS
+   depends on SUNVNET_COMMON
---help---
  Support for virtual network devices under Sun Logical Domains.
 
 config LDMVSW
tristate "Sun4v LDoms Virtual Switch support"
+   default m
depends on SUN_LDOMS
+   depends on SUNVNET_COMMON
---help---
  Support for virtual switch devices under Sun4v Logical Domains.
  This driver adds a network interface for every vsw-port node
diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index 191c8ad..e03cf13 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -37,6 +37,35 @@
  */
 #define VNET_MAX_RETRIES 10
 
+#define DRV_MODULE_NAME "sunvnet_common"
+#define DRV_MODULE_VERSION "1.1"
+#define DRV_MODULE_RELDATE "February 3, 2017"
+
+static char version[] =
+   DRV_MODULE_NAME " " DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")";
+MODULE_AUTHOR("David S. Miller (da...@davemloft.net)");
+MODULE_DESCRIPTION("Sun LDOM virtual network support library");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_MODULE_VERSION);
+
+static int __init sunvnet_common_init(void)
+{
+   pr_info("%s\n", version);
+   return 0;
+}
+module_init(sunvnet_common_init);
+
+static void __exit sunvnet_common_exit(void)
+{
+   /* Empty function, just here to fill the exit function pointer
+* slot.  In some combinations of older gcc and newer kernel,
+* leaving this undefined results in the kernel marking it as a
+* permanent module; it will show up in lsmod output as [permanent]
+* and not be unloadable.
+*/
+}
+module_exit(sunvnet_common_exit);
+
 static int __vnet_tx_trigger(struct vnet_port *port, u32 start);
 static void vnet_port_reset(struct vnet_port *port);
 
-- 
1.7.1



[PATCH v2 net-next 9/9] ldmvsw: disable tso and gso for bridge operations

2017-02-07 Thread Shannon Nelson
The ldmvsw driver is specifically for supporting the ldom virtual
networking by running in the primary ldom and using the LDC to connect
the remaining ldoms to the outside world via a bridge.  With TSO and GSO
supported while connected to the bridge, things tend to misbehave as seen
in our case by delayed packets, enough to begin triggering retransmits
and affecting overall throughput.  By turning off advertised support for
TSO and GSO we restore stable traffic flow through the bridge.

Orabug: 23293104

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/ldmvsw.c |5 ++---
 drivers/net/ethernet/sun/sunvnet_common.c |3 ++-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sun/ldmvsw.c b/drivers/net/ethernet/sun/ldmvsw.c
index 3ef5c08..8e1ecfb 100644
--- a/drivers/net/ethernet/sun/ldmvsw.c
+++ b/drivers/net/ethernet/sun/ldmvsw.c
@@ -297,8 +297,7 @@ static void vsw_poll_controller(struct net_device *dev)
dev->ethtool_ops = &vsw_ethtool_ops;
dev->watchdog_timeo = VSW_TX_TIMEOUT;
 
-   dev->hw_features = NETIF_F_TSO | NETIF_F_GSO | NETIF_F_GSO_SOFTWARE |
-  NETIF_F_HW_CSUM | NETIF_F_SG;
+   dev->hw_features = NETIF_F_HW_CSUM | NETIF_F_SG;
dev->features = dev->hw_features;
 
/* MTU range: 68 - 65535 */
@@ -383,7 +382,7 @@ static int vsw_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
port->vp = vp;
port->dev = dev;
port->switch_port = 1;
-   port->tso = true;
+   port->tso = false; /* no tso in vsw, misbehaves in bridge */
port->tsolen = 0;
 
/* Mark the port as belonging to ldmvsw which directs the
diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index 9384db0..1a9bc56 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -210,6 +210,7 @@ static int handle_attr_info(struct vio_driver_state *vio,
} else {
pkt->cflags &= ~VNET_LSO_IPV4_CAPAB;
pkt->ipv4_lso_maxlen = 0;
+   port->tsolen = 0;
}
 
/* for version >= 1.6, ACK packet mode we support */
@@ -1661,7 +1662,7 @@ static void vnet_port_reset(struct vnet_port *port)
del_timer(&port->clean_timer);
sunvnet_port_free_tx_bufs_common(port);
port->rmtu = 0;
-   port->tso = true;
+   port->tso = (port->vsw == 0);  /* no tso in vsw, misbehaves in bridge */
port->tsolen = 0;
 }
 
-- 
1.7.1



[PATCH v2 net-next 5/9] sunvnet: add memory barrier before check for tx enable

2017-02-07 Thread Shannon Nelson
In order to allow the underlying LDC and outstanding memory operations
to potentially catch up with the driver's Tx requests, add a memory
barrier before checking again for available tx descriptors.

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/sunvnet_common.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index 82273e6..6cb625a 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -1453,6 +1453,7 @@ int sunvnet_start_xmit_common(struct sk_buff *skb, struct net_device *dev,
dr->prod = (dr->prod + 1) & (VNET_TX_RING_SIZE - 1);
if (unlikely(vnet_tx_dring_avail(dr) < 1)) {
netif_tx_stop_queue(txq);
+   smp_rmb();
if (vnet_tx_dring_avail(dr) > VNET_TX_WAKEUP_THRESH(dr))
netif_tx_wake_queue(txq);
}
-- 
1.7.1
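
The stop, barrier, recheck, wake sequence from the hunk above can be modelled
in user space. This is only a sketch under stated assumptions: struct tx_ring,
ring_avail(), ring_produce(), RING_SIZE and WAKEUP_THRESH are hypothetical
names, the availability formula is simplified, and a C11 full fence stands in
for the kernel barrier (it is at least as strong as the smp_rmb() in the
patch, not a statement about which barrier the driver needs).

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE     8u		/* hypothetical; must be a power of two */
#define WAKEUP_THRESH (RING_SIZE / 4u)	/* hypothetical wake-up threshold */

struct tx_ring {
	atomic_uint prod;	/* next slot the producer fills */
	atomic_uint cons;	/* next slot the consumer releases */
	atomic_bool stopped;	/* models the stopped tx queue */
};

/* Free slots; one slot stays unused so a full ring differs from an empty one. */
static unsigned ring_avail(struct tx_ring *r)
{
	unsigned used = (atomic_load_explicit(&r->prod, memory_order_relaxed) -
			 atomic_load_explicit(&r->cons, memory_order_relaxed)) &
			(RING_SIZE - 1u);

	return RING_SIZE - 1u - used;
}

/* Producer side: claim one slot, then stop the queue if the ring is full. */
static void ring_produce(struct tx_ring *r)
{
	unsigned p = atomic_load_explicit(&r->prod, memory_order_relaxed);

	atomic_store_explicit(&r->prod, (p + 1u) & (RING_SIZE - 1u),
			      memory_order_relaxed);

	if (ring_avail(r) < 1u) {
		atomic_store(&r->stopped, true);
		/* Stand-in for the barrier added by the patch: make sure the
		 * recheck below does not use a stale view of the consumer
		 * index.
		 */
		atomic_thread_fence(memory_order_seq_cst);
		if (ring_avail(r) > WAKEUP_THRESH)
			atomic_store(&r->stopped, false);
	}
}
```

The recheck exists because the consumer may have released descriptors between
the first availability test and the stop; without it the queue could stay
stopped with nobody left to wake it.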



[PATCH v2 net-next 4/9] sunvnet: add driver stats for ethtool support

2017-02-07 Thread Shannon Nelson
Since we're collecting some stats in the driver code, let's support use
of the ethtool driver stats facility in both sunvnet and ldmvsw.

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/ldmvsw.c |   63 +
 drivers/net/ethernet/sun/sunvnet.c|   63 +
 drivers/net/ethernet/sun/sunvnet_common.c |2 +
 3 files changed, 128 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/sun/ldmvsw.c b/drivers/net/ethernet/sun/ldmvsw.c
index 335b876..3999fb7 100644
--- a/drivers/net/ethernet/sun/ldmvsw.c
+++ b/drivers/net/ethernet/sun/ldmvsw.c
@@ -80,11 +80,74 @@ static void vsw_set_msglevel(struct net_device *dev, u32 value)
port->vp->msg_enable = value;
 }
 
+static const struct {
+   const char string[ETH_GSTRING_LEN];
+} ethtool_stats_keys[] = {
+   { "rx_packets" },
+   { "tx_packets" },
+   { "rx_bytes" },
+   { "tx_bytes" },
+   { "rx_errors" },
+   { "tx_errors" },
+   { "rx_dropped" },
+   { "tx_dropped" },
+   { "multicast" },
+   { "rx_length_errors" },
+   { "rx_frame_errors" },
+   { "rx_missed_errors" },
+   { "tx_carrier_errors" },
+};
+
+static int vsw_get_sset_count(struct net_device *dev, int sset)
+{
+   switch (sset) {
+   case ETH_SS_STATS:
+   return ARRAY_SIZE(ethtool_stats_keys);
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static void vsw_get_strings(struct net_device *dev, u32 stringset, u8 *buf)
+{
+   switch (stringset) {
+   case ETH_SS_STATS:
+   memcpy(buf, &ethtool_stats_keys, sizeof(ethtool_stats_keys));
+   break;
+   default:
+   WARN_ON(1);
+   break;
+   }
+}
+
+static void vsw_get_ethtool_stats(struct net_device *dev,
+ struct ethtool_stats *estats, u64 *data)
+{
+   int i = 0;
+
+   data[i++] = dev->stats.rx_packets;
+   data[i++] = dev->stats.tx_packets;
+   data[i++] = dev->stats.rx_bytes;
+   data[i++] = dev->stats.tx_bytes;
+   data[i++] = dev->stats.rx_errors;
+   data[i++] = dev->stats.tx_errors;
+   data[i++] = dev->stats.rx_dropped;
+   data[i++] = dev->stats.tx_dropped;
+   data[i++] = dev->stats.multicast;
+   data[i++] = dev->stats.rx_length_errors;
+   data[i++] = dev->stats.rx_frame_errors;
+   data[i++] = dev->stats.rx_missed_errors;
+   data[i++] = dev->stats.tx_carrier_errors;
+}
+
 static const struct ethtool_ops vsw_ethtool_ops = {
.get_drvinfo= vsw_get_drvinfo,
.get_msglevel   = vsw_get_msglevel,
.set_msglevel   = vsw_set_msglevel,
.get_link   = ethtool_op_get_link,
+   .get_sset_count = vsw_get_sset_count,
+   .get_strings= vsw_get_strings,
+   .get_ethtool_stats  = vsw_get_ethtool_stats,
 };
 
 static LIST_HEAD(vnet_list);
diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c
index 4cc2571..e225b27 100644
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -77,11 +77,74 @@ static void vnet_set_msglevel(struct net_device *dev, u32 value)
vp->msg_enable = value;
 }
 
+static const struct {
+   const char string[ETH_GSTRING_LEN];
+} ethtool_stats_keys[] = {
+   { "rx_packets" },
+   { "tx_packets" },
+   { "rx_bytes" },
+   { "tx_bytes" },
+   { "rx_errors" },
+   { "tx_errors" },
+   { "rx_dropped" },
+   { "tx_dropped" },
+   { "multicast" },
+   { "rx_length_errors" },
+   { "rx_frame_errors" },
+   { "rx_missed_errors" },
+   { "tx_carrier_errors" },
+};
+
+static int vnet_get_sset_count(struct net_device *dev, int sset)
+{
+   switch (sset) {
+   case ETH_SS_STATS:
+   return ARRAY_SIZE(ethtool_stats_keys);
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
+static void vnet_get_strings(struct net_device *dev, u32 stringset, u8 *buf)
+{
+   switch (stringset) {
+   case ETH_SS_STATS:
+   memcpy(buf, &ethtool_stats_keys, sizeof(ethtool_stats_keys));
+   break;
+   default:
+   WARN_ON(1);
+   break;
+   }
+}
+
+static void vnet_get_ethtool_stats(struct net_device *dev,
+  struct ethtool_stats *estats, u64 *data)
+{
+   int i = 0;
+
+   data[i++] = dev->stats.rx_packets;
+   data[i++] = dev->stats.tx_packets;
+   data[i++] = dev->stats.rx_bytes;
+   data[i++] = dev->stats.tx_bytes;
+   data[i++] = dev->stats.rx_errors;
+   data[i++] = dev->stats.tx_errors;
+   data[i++] = dev->stats.rx_dropped;
+   data[i++] = dev->stats.tx_dropped;
+   data[i++] = dev->stats.multicast;
+   data[i++] = dev->stats.rx_length_errors;
+   data[i++] = dev->stats.rx_frame_errors;
+   data[i++] = 
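
The patch keeps the stat names and the stat values in two parallel lists that
have to stay in the same order. A table-driven variant that ties each name to
its field is sketched below for comparison. It is not what the patch does:
struct demo_stats, DEMO_STAT() and the demo_* functions are hypothetical, and
the code is plain user-space C rather than an ethtool_ops implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GSTRING_LEN 32	/* same length as the kernel's ETH_GSTRING_LEN */

/* Hypothetical stand-in for the counters kept in dev->stats. */
struct demo_stats {
	unsigned long rx_packets;
	unsigned long tx_packets;
	unsigned long rx_bytes;
	unsigned long tx_bytes;
};

struct stat_desc {
	char name[GSTRING_LEN];	/* string reported to the stats reader */
	size_t offset;		/* where the counter lives in demo_stats */
};

/* One entry yields both the key and the field offset, so they cannot drift. */
#define DEMO_STAT(field) { #field, offsetof(struct demo_stats, field) }

static const struct stat_desc demo_stat_table[] = {
	DEMO_STAT(rx_packets),
	DEMO_STAT(tx_packets),
	DEMO_STAT(rx_bytes),
	DEMO_STAT(tx_bytes),
};

#define DEMO_STAT_COUNT (sizeof(demo_stat_table) / sizeof(demo_stat_table[0]))

static int demo_get_sset_count(void)
{
	return (int)DEMO_STAT_COUNT;
}

/* Copy the fixed-width names into buf, GSTRING_LEN bytes per entry. */
static void demo_get_strings(uint8_t *buf)
{
	for (size_t i = 0; i < DEMO_STAT_COUNT; i++)
		memcpy(buf + i * GSTRING_LEN, demo_stat_table[i].name,
		       GSTRING_LEN);
}

/* Read each counter through its recorded offset, in table order. */
static void demo_get_stats(const struct demo_stats *s, uint64_t *data)
{
	for (size_t i = 0; i < DEMO_STAT_COUNT; i++) {
		unsigned long v;

		memcpy(&v, (const char *)s + demo_stat_table[i].offset,
		       sizeof(v));
		data[i] = v;
	}
}
```

Adding a counter then means adding one DEMO_STAT() line; the count, the
strings and the values all follow from the table.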

[PATCH v2 net-next 0/9] sunvnet driver updates

2017-02-07 Thread Shannon Nelson
The sunvnet ldom virtual network driver was due for some updates and
a bugfix or two.  These patches address a few items left over from
last year's make-over.

v2:
 - changed memory barrier fix to use smp_wmb
 - put NETIF_F_SG back into the advertised ldmvsw hw_features

Shannon Nelson (8):
  sunvnet: make sunvnet common code dynamically loadable
  sunvnet: update version and version printing
  sunvnet: add driver stats for ethtool support
  sunvnet: add memory barrier before check for tx enable
  sunvnet: straighten up message event handling logic
  sunvnet: remove extra rcu_read_unlocks
  ldmvsw: update and simplify version string
  ldmvsw: disable tso and gso for bridge operations

Sowmini Varadhan (1):
  sunvnet: remove unused variable in maybe_tx_wakeup

 drivers/net/ethernet/sun/Kconfig  |8 +-
 drivers/net/ethernet/sun/ldmvsw.c |   82 ++---
 drivers/net/ethernet/sun/sunvnet.c|   77 ++--
 drivers/net/ethernet/sun/sunvnet_common.c |  143 
 4 files changed, 224 insertions(+), 86 deletions(-)



Re: netvsc merge conflicts...

2017-02-07 Thread Stephen Hemminger
On Tue, 07 Feb 2017 16:41:41 -0500 (EST)
David Miller  wrote:

> Stephen, I just did a merge of net into net-next and had to
> resolve a merge conflict in the netvsc driver.
> 
> The problem was that in 'net' the hyperv bug fix that added
> the calls to "init_cache_read_index()" in netvsc_channel_cb()
> collided with your RX path cleanups.
> 
> Please double check my work and send me any needed fixups.
> 
> Thanks!

Sorry the init_cache_read_index came through greg's driver tree.
Your merge cleanup matches what I was doing manually.


[PATCH v2 net-next 8/9] ldmvsw: update and simplify version string

2017-02-07 Thread Shannon Nelson
Bump the version and simplify the version-printing code.

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/ldmvsw.c |   14 --
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/sun/ldmvsw.c b/drivers/net/ethernet/sun/ldmvsw.c
index 3999fb7..3ef5c08 100644
--- a/drivers/net/ethernet/sun/ldmvsw.c
+++ b/drivers/net/ethernet/sun/ldmvsw.c
@@ -41,11 +41,11 @@
 static u8 vsw_port_hwaddr[ETH_ALEN] = {0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
 
 #define DRV_MODULE_NAME "ldmvsw"
-#define DRV_MODULE_VERSION "1.0"
-#define DRV_MODULE_RELDATE "Jan 15, 2016"
+#define DRV_MODULE_VERSION "1.1"
+#define DRV_MODULE_RELDATE "February 3, 2017"
 
 static char version[] =
-   DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
+   DRV_MODULE_NAME " " DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")";
 MODULE_AUTHOR("Oracle");
 MODULE_DESCRIPTION("Sun4v LDOM Virtual Switch Driver");
 MODULE_LICENSE("GPL");
@@ -322,11 +322,6 @@ static void vsw_poll_controller(struct net_device *dev)
.handshake_complete = sunvnet_handshake_complete_common,
 };
 
-static void print_version(void)
-{
-   printk_once(KERN_INFO "%s", version);
-}
-
 static const char *remote_macaddr_prop = "remote-mac-address";
 static const char *id_prop = "id";
 
@@ -342,8 +337,6 @@ static int vsw_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
const u64 *port_id;
u64 handle;
 
-   print_version();
-
hp = mdesc_grab();
 
rmac = mdesc_get_property(hp, vdev->mp, remote_macaddr_prop, &len);
@@ -520,6 +513,7 @@ static void vsw_cleanup(void)
 
 static int __init vsw_init(void)
 {
+   pr_info("%s\n", version);
return vio_register_driver(&vsw_port_driver);
 }
 
-- 
1.7.1



[PATCH v2 net-next 3/9] sunvnet: update version and version printing

2017-02-07 Thread Shannon Nelson
There have been several changes since the first version of this code, so
we bump the version number.  While we're at it, we can simplify the
version printing a bit and drop a couple lines of code.

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/sunvnet.c |   14 --
 1 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c
index 5356a70..4cc2571 100644
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -38,11 +38,11 @@
 #define VNET_TX_TIMEOUT (5 * HZ)
 
 #define DRV_MODULE_NAME "sunvnet"
-#define DRV_MODULE_VERSION "1.0"
-#define DRV_MODULE_RELDATE "June 25, 2007"
+#define DRV_MODULE_VERSION "2.0"
+#define DRV_MODULE_RELDATE "February 3, 2017"
 
 static char version[] =
-   DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
+   DRV_MODULE_NAME " " DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")";
 MODULE_AUTHOR("David S. Miller (da...@davemloft.net)");
 MODULE_DESCRIPTION("Sun LDOM virtual network driver");
 MODULE_LICENSE("GPL");
@@ -303,11 +303,6 @@ static void vnet_cleanup(void)
.handshake_complete = sunvnet_handshake_complete_common,
 };
 
-static void print_version(void)
-{
-   printk_once(KERN_INFO "%s", version);
-}
-
 const char *remote_macaddr_prop = "remote-mac-address";
 
 static int vnet_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
@@ -319,8 +314,6 @@ static int vnet_port_probe(struct vio_dev *vdev, const struct vio_device_id *id)
const u64 *rmac;
int len, i, err, switch_port;
 
-   print_version();
-
hp = mdesc_grab();
 
vp = vnet_find_parent(hp, vdev->mp, vdev);
@@ -446,6 +439,7 @@ static int vnet_port_remove(struct vio_dev *vdev)
 
 static int __init vnet_init(void)
 {
+   pr_info("%s\n", version);
return vio_register_driver(&vnet_port_driver);
 }
 
-- 
1.7.1


