[PATCH] net/sunrpc/xprt_sock: fix regression in connection error reporting.

2017-07-18 Thread NeilBrown

Commit 3d4762639dd3 ("tcp: remove poll() flakes when receiving
RST") in v4.12 changed the order in which ->sk_state_change()
and ->sk_error_report() are called when a socket is shut
down - sk_state_change() is now called first.

This causes xs_tcp_state_change() -> xs_sock_mark_closed() ->
xprt_disconnect_done() to wake all pending tasks with -EAGAIN.
When the ->sk_error_report() callback arrives, it is too late to
pass the error on, and it is lost.

An easy way to demonstrate the problem is to try to start
rpc.nfsd while rpcbind isn't running.
nfsd will attempt a tcp connection to rpcbind.  An ECONNREFUSED
error is returned, but sunrpc code loses the error and keeps
retrying.  If it saw the ECONNREFUSED, it would abort.

To fix this, handle the sk->sk_err in the TCP_CLOSE branch of
xs_tcp_state_change().

Fixes: 3d4762639dd3 ("tcp: remove poll() flakes when receiving RST")
Cc: sta...@vger.kernel.org (v4.12)
Signed-off-by: NeilBrown 
---
 net/sunrpc/xprtsock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index d5b54c020dec..4f154d388748 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1624,6 +1624,8 @@ static void xs_tcp_state_change(struct sock *sk)
if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
&transport->sock_state))
xprt_clear_connecting(xprt);
+   if (sk->sk_err)
+   xprt_wake_pending_tasks(xprt, -sk->sk_err);
xs_sock_mark_closed(xprt);
}
  out:
-- 
2.12.2
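
The ordering problem described in the commit message can be reproduced outside the kernel. The following is a minimal user-space sketch, not kernel code: struct fake_sock, struct fake_xprt and every function name are hypothetical stand-ins, and a single pending task is modelled as "woken at most once", on the assumption that a task already woken with -EAGAIN no longer sees a later error.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for struct sock and struct rpc_xprt. */
struct fake_sock { int sk_err; };
struct fake_xprt { int task_woken; int task_status; };

/* Wake the single pending task. A task that is already awake is not
 * touched again, so a late error cannot replace an earlier status. */
static void wake_pending_tasks(struct fake_xprt *xprt, int status)
{
	if (xprt->task_woken)
		return;
	xprt->task_woken = 1;
	xprt->task_status = status;
}

/* Models the TCP_CLOSE branch of the state-change callback. */
static void state_change_closed(struct fake_sock *sk, struct fake_xprt *xprt,
				int with_fix)
{
	/* The fix: pass on a pending socket error before anything else. */
	if (with_fix && sk->sk_err)
		wake_pending_tasks(xprt, -sk->sk_err);
	/* Marking the transport closed wakes whoever is left with -EAGAIN. */
	wake_pending_tasks(xprt, -EAGAIN);
}

/* Models the error-report callback, which runs second since v4.12. */
static void error_report(struct fake_sock *sk, struct fake_xprt *xprt)
{
	if (sk->sk_err)
		wake_pending_tasks(xprt, -sk->sk_err);
}

/* Run both callbacks in the new order; return the status the task saw. */
static int simulate_refused_connect(int with_fix)
{
	struct fake_sock sk = { ECONNREFUSED };
	struct fake_xprt xprt = { 0, 0 };

	state_change_closed(&sk, &xprt, with_fix);
	error_report(&sk, &xprt);
	assert(xprt.task_woken);
	return xprt.task_status;
}
```

Without the fix the simulated task sees -EAGAIN and would retry; with it the task sees -ECONNREFUSED and can give up.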





Re: [PATCH] net: ethernet: mediatek: remove useless code in mtk_poll_tx()

2017-07-18 Thread Sean Wang
On Tue, 2017-07-18 at 15:48 -0500, Gustavo A. R. Silva wrote:
> Remove useless local variable _condition_ and the code related.
> 
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index b3d0c2e..7e95cf5 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -1027,7 +1027,6 @@ static int mtk_poll_tx(struct mtk_eth *eth, int budget)
>   unsigned int done[MTK_MAX_DEVS];
>   unsigned int bytes[MTK_MAX_DEVS];
>   u32 cpu, dma;
> - static int condition;
>   int total = 0, i;
>  
>   memset(done, 0, sizeof(done));
> @@ -1051,10 +1050,8 @@ static int mtk_poll_tx(struct mtk_eth *eth, int budget)
>   mac = 1;
>  
>   skb = tx_buf->skb;
> - if (!skb) {
> - condition = 1;
> + if (!skb)
>   break;
> - }
>  
>   if (skb != (struct sk_buff *)MTK_DMA_DUMMY_DESC) {
>   bytes[mac] += skb->len;

Acked-by: Sean Wang 







Re: [PATCH net-next 0/5] refine virtio-net XDP

2017-07-18 Thread Jason Wang



On 2017-07-19 04:13, Michael S. Tsirkin wrote:

On Mon, Jul 17, 2017 at 08:43:56PM +0800, Jason Wang wrote:

Hi:

This series brings two optimizations for virtio-net XDP:

- avoid reset during XDP set
- turn off offloads on demand

I'm glad to see this take shape - this can be
extended to optimize virtnet_get_headroom so we don't
waste room if adjust_head is enabled.


Right, we can do it on top.


I see a couple of issues, responded to individual patches.




Thanks for the review.



Re: [PATCH net-next 5/5] virtio-net: switch off offloads on demand if possible on XDP set

2017-07-18 Thread Jason Wang



On 2017-07-19 04:07, Michael S. Tsirkin wrote:

On Mon, Jul 17, 2017 at 08:44:01PM +0800, Jason Wang wrote:

Current XDP implementation want guest offloads feature to be disabled

s/want/wants/


on qemu cli.

on the device.


This is inconvenient and means guest can't benefit from
offloads if XDP is not used. This patch tries to address this
limitation by disable

disabling


the offloads on demand through control guest
offloads. Guest offloads will be disabled and enabled on demand on XDP
set.

Signed-off-by: Jason Wang 

In fact, since we no longer reset when XDP is set,
here device might have offloads enabled, buffers are
used but not consumed, then XDP is set.

This can result in
- packet scattered across multiple buffers
   (handled correctly but need to update the comment)


Ok.


- packet may have VIRTIO_NET_HDR_F_NEEDS_CSUM, in that case
   the spec says "The checksum on the packet is incomplete".
   (probably needs to be handled by calculating the checksum).


That's an option. It may be tricky, but I was wondering whether we
can just keep CHECKSUM_PARTIAL here.





Ideas for follow-up patches:

- skip looking at packet data completely
   won't work if you play with checksums dynamically
   but can be done if disabled on device
- allow ethtool to tweak offloads from userspace as well


Right.

Thanks




---
  drivers/net/virtio_net.c | 70 
  1 file changed, 65 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e732bd6..d970c2d 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -57,6 +57,11 @@ DECLARE_EWMA(pkt_len, 0, 64)
  
  #define VIRTNET_DRIVER_VERSION "1.0.0"
  
+const unsigned long guest_offloads[] = { VIRTIO_NET_F_GUEST_TSO4,

+VIRTIO_NET_F_GUEST_TSO6,
+VIRTIO_NET_F_GUEST_ECN,
+VIRTIO_NET_F_GUEST_UFO };
+
  struct virtnet_stats {
struct u64_stats_sync tx_syncp;
struct u64_stats_sync rx_syncp;
@@ -164,10 +169,13 @@ struct virtnet_info {
u8 ctrl_promisc;
u8 ctrl_allmulti;
u16 ctrl_vid;
+   u64 ctrl_offloads;
  
  	/* Ethtool settings */

u8 duplex;
u32 speed;
+
+   unsigned long guest_offloads;
  };
  
  struct padded_vnet_hdr {

@@ -1889,6 +1897,47 @@ static int virtnet_restore_up(struct virtio_device *vdev)
return err;
  }
  
+static int virtnet_set_guest_offloads(struct virtnet_info *vi, u64 offloads)

+{
+   struct scatterlist sg;
+   vi->ctrl_offloads = cpu_to_virtio64(vi->vdev, offloads);
+
+   sg_init_one(&sg, &vi->ctrl_offloads, sizeof(vi->ctrl_offloads));
+
+   if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_GUEST_OFFLOADS,
+ VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET, &sg)) {
+   dev_warn(&vi->dev->dev, "Fail to set guest offload. \n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int virtnet_clear_guest_offloads(struct virtnet_info *vi)
+{
+   u64 offloads = 0;
+
+   if (!vi->guest_offloads)
+   return 0;
+
+   if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
+   offloads = 1ULL << VIRTIO_NET_F_GUEST_CSUM;
+
+   return virtnet_set_guest_offloads(vi, offloads);
+}
+
+static int virtnet_restore_guest_offloads(struct virtnet_info *vi)
+{
+   u64 offloads = vi->guest_offloads;
+
+   if (!vi->guest_offloads)
+   return 0;
+   if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
+   offloads |= 1ULL << VIRTIO_NET_F_GUEST_CSUM;
+
+   return virtnet_set_guest_offloads(vi, offloads);
+}
+
  static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
   struct netlink_ext_ack *extack)
  {
@@ -1898,10 +1947,11 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
u16 xdp_qp = 0, curr_qp;
int i, err;
  
-	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||

-   virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
-   virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
-   virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO)) {
+   if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)
+   && (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
+   virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
+   virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
+   virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO))) {
NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing 
LRO, disable LRO first");
return -EOPNOTSUPP;
}
@@ -1950,6 +2000,12 @@ static int virtnet_xdp_set(struct net_device *dev, 
struct bpf_prog *prog,
for (i = 0; i < 
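
The mask computation in virtnet_clear_guest_offloads() and virtnet_restore_guest_offloads() quoted above can be tried in user space. This is only a sketch under assumptions: the enum names and both helper functions are made up for illustration, the bit numbers are the feature bits assigned by the virtio-net specification, and no control-queue command is sent.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit numbers from the virtio-net specification.
 * The enum names themselves are local to this sketch. */
enum {
	F_GUEST_CSUM = 1,
	F_GUEST_TSO4 = 7,
	F_GUEST_TSO6 = 8,
	F_GUEST_ECN  = 9,
	F_GUEST_UFO  = 10,
};

/* Mask to program while an XDP program is attached: every
 * segmentation offload is dropped, checksum offload is kept if
 * it was negotiated. */
static uint64_t offload_mask_xdp_on(int has_guest_csum)
{
	return has_guest_csum ? (1ULL << F_GUEST_CSUM) : 0;
}

/* Mask to program once XDP is detached: the offloads recorded at
 * probe time, plus checksum offload if it was negotiated. */
static uint64_t offload_mask_xdp_off(uint64_t guest_offloads,
				     int has_guest_csum)
{
	uint64_t mask = guest_offloads;

	if (has_guest_csum)
		mask |= 1ULL << F_GUEST_CSUM;
	return mask;
}
```

For a device that negotiated TSO4, TSO6 and GUEST_CSUM, the recorded offloads are 0x180, the XDP-on mask is 0x2 and the XDP-off mask is 0x182.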

Re: [PATCH net-next 4/5] virtio-net: do not reset during XDP set

2017-07-18 Thread Jason Wang



On 2017-07-19 03:49, Michael S. Tsirkin wrote:

On Mon, Jul 17, 2017 at 08:44:00PM +0800, Jason Wang wrote:

We used to reset during XDP set, the main reason is we need allocate
extra headroom for header adjustment but there's no way to know the
headroom of exist receive buffer. This works buy maybe complex and may
cause the network down for a while which is bad for user
experience. So this patch tries to avoid this by:

- packing headroom into receive buffer ctx
- check the headroom during XDP, and if it was not sufficient, copy
   the packet into a location which has a large enough headroom

The packing is actually done by previous patches. Here is a
corrected version:

We currently reset the device during XDP set, the main reason is
that we allocate more headroom with XDP (for header adjustment).

This works but causes network downtime for users.

Previous patches encoded the headroom in the buffer context,
this makes it possible to detect the case where a buffer
with headroom insufficient for XDP is added to the queue and
XDP is enabled afterwards.

Upon detection, we handle this case by copying the packet
(slow, but it's a temporary condition).


Ok.





Signed-off-by: Jason Wang 
---
  drivers/net/virtio_net.c | 230 ++-
  1 file changed, 105 insertions(+), 125 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e31b5b2..e732bd6 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -407,6 +407,67 @@ static unsigned int virtnet_get_headroom(struct 
virtnet_info *vi)
return vi->xdp_queue_pairs ? VIRTIO_XDP_HEADROOM : 0;
  }
  
+/* We copy and linearize packet in the following cases:

+ *
+ * 1) Packet across multiple buffers, this happens normally when rx
+ *buffer size is underestimated. Rarely, since spec does not
+ *forbid using more than one buffer even if a single buffer is
+ *sufficient for the packet, we should also deal with this case.

Latest SVN of the spec actually forbids this. See:
 net: clarify device rules for mergeable buffers


Good to know this.





+ * 2) The header room is smaller than what XDP required. In this case
+ *we should copy the packet and reserve enough headroom for this.
+ *This would be slow but we at most we can copy times of queue
+ *size, this is acceptable. What's more important, this help to
+ *avoid resetting.

Last part of the comment applies to both cases. So

+/* We copy the packet for XDP in the following cases:
+ *
+ * 1) Packet is scattered across multiple rx buffers.
+ * 2) Headroom space is insufficient.
+ *
+ * This is inefficient but it's a temporary condition that
+ * we hit right after XDP is enabled and until queue is refilled
+ * with large buffers with sufficient headroom - so it should affect
+ * at most queue size packets.

+ * Afterwards, the conditions to enable
+ * XDP should preclude the underlying device from sending packets
+ * across multiple buffers (num_buf > 1), and we make sure buffers
+ * have enough headroom.
+ */



Ok.




+ */
+static struct page *xdp_linearize_page(struct receive_queue *rq,
+  u16 *num_buf,
+  struct page *p,
+  int offset,
+  int page_off,
+  unsigned int *len)
+{
+   struct page *page = alloc_page(GFP_ATOMIC);
+
+   if (!page)
+   return NULL;
+
+   memcpy(page_address(page) + page_off, page_address(p) + offset, *len);
+   page_off += *len;
+
+   while (--*num_buf) {
+   unsigned int buflen;
+   void *buf;
+   int off;
+
+   buf = virtqueue_get_buf(rq->vq, &buflen);
+   if (unlikely(!buf))
+   goto err_buf;
+
+   p = virt_to_head_page(buf);
+   off = buf - page_address(p);
+
+   /* guard against a misconfigured or uncooperative backend that
+* is sending packet larger than the MTU.
+*/
+   if ((page_off + buflen) > PAGE_SIZE) {
+   put_page(p);
+   goto err_buf;
+   }
+
+   memcpy(page_address(page) + page_off,
+  page_address(p) + off, buflen);
+   page_off += buflen;
+   put_page(p);
+   }
+
+   /* Headroom does not contribute to packet length */
+   *len = page_off - VIRTIO_XDP_HEADROOM;
+   return page;
+err_buf:
+   __free_pages(page, 0);
+   return NULL;
+}
+
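
The copy loop in xdp_linearize_page() above boils down to "append each buffer behind the headroom, and bail out if the page would overflow". A simplified user-space sketch of that idea follows. The names struct frag, linearize() and SKETCH_PAGE_SIZE are hypothetical, plain byte arrays stand in for pages and virtqueue buffers, and unlike the patch the bound is checked for every fragment, including the first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* One received buffer; stands in for a virtqueue buffer. */
struct frag {
	const unsigned char *data;
	size_t len;
};

/* Copy all fragments into 'page', leaving 'headroom' bytes free at
 * the front. On success return 0 and store the packet length
 * (headroom excluded) in *len. Return -1 if the data does not fit. */
static int linearize(unsigned char *page, size_t headroom,
		     const struct frag *frags, size_t nfrags, size_t *len)
{
	size_t off = headroom;
	size_t i;

	if (headroom > SKETCH_PAGE_SIZE)
		return -1;

	for (i = 0; i < nfrags; i++) {
		/* Guard against a sender that exceeds one page. */
		if (frags[i].len > SKETCH_PAGE_SIZE - off)
			return -1;
		memcpy(page + off, frags[i].data, frags[i].len);
		off += frags[i].len;
	}

	/* Headroom does not contribute to packet length. */
	*len = off - headroom;
	return 0;
}
```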

Re: [PATCH net-next 3/5] virtio-net: switch to use new ctx API for small buffer

2017-07-18 Thread Jason Wang



On 2017-07-19 03:20, Michael S. Tsirkin wrote:

what's needed is ability to store the headroom there.

virtio-net: switch to use ctx API for small buffers

Use ctx API to store headroom for small buffers.
Following patches will retrieve this info and use it for XDP.

On Mon, Jul 17, 2017 at 08:43:59PM +0800, Jason Wang wrote:

Switch to use ctx API for small buffer, this is need for avoiding
reset on XDP.

Signed-off-by: Jason Wang 
---
  drivers/net/virtio_net.c | 12 +++-
  1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8fae9a8..e31b5b2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -410,7 +410,8 @@ static unsigned int virtnet_get_headroom(struct 
virtnet_info *vi)
  static struct sk_buff *receive_small(struct net_device *dev,
 struct virtnet_info *vi,
 struct receive_queue *rq,
-void *buf, unsigned int len)
+void *buf, void *ctx,
+unsigned int len)
  {
struct sk_buff *skb;
struct bpf_prog *xdp_prog;
@@ -773,7 +774,7 @@ static int receive_buf(struct virtnet_info *vi, struct 
receive_queue *rq,
else if (vi->big_packets)
skb = receive_big(dev, vi, rq, buf, len);
else
-   skb = receive_small(dev, vi, rq, buf, len);
+   skb = receive_small(dev, vi, rq, buf, ctx, len);
  
  	if (unlikely(!skb))

return 0;
@@ -812,6 +813,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, 
struct receive_queue *rq,

Let's document that ctx API is used a bit differently here:

/* Unlike mergeable buffers, all buffers are allocated to the same size,
  * except for the headroom. For this reason we do not need to use
  * mergeable_len_to_ctx here - it is enough to store the headroom as the
  * context ignoring the truesize.
  */


Ok.

Thanks


as an alternative, reuse the same format as mergeable buffers.


struct page_frag *alloc_frag = &rq->alloc_frag;
char *buf;
unsigned int xdp_headroom = virtnet_get_headroom(vi);
+   void *ctx = (void *)(unsigned long)xdp_headroom;
int len = vi->hdr_len + VIRTNET_RX_PAD + GOOD_PACKET_LEN + xdp_headroom;
int err;
  
@@ -825,7 +827,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,

alloc_frag->offset += len;
sg_init_one(rq->sg, buf + VIRTNET_RX_PAD + xdp_headroom,
vi->hdr_len + GOOD_PACKET_LEN);
-   err = virtqueue_add_inbuf(rq->vq, rq->sg, 1, buf, gfp);
+   err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
if (err < 0)
put_page(virt_to_head_page(buf));
  
@@ -1034,7 +1036,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget)

void *buf;
struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
  
-	if (vi->mergeable_rx_bufs) {

+   if (!vi->big_packets || vi->mergeable_rx_bufs) {
void *ctx;
  
  		while (received < budget &&

@@ -2198,7 +2200,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
names = kmalloc(total_vqs * sizeof(*names), GFP_KERNEL);
if (!names)
goto err_names;
-   if (vi->mergeable_rx_bufs) {
+   if (!vi->big_packets || vi->mergeable_rx_bufs) {
ctx = kzalloc(total_vqs * sizeof(*ctx), GFP_KERNEL);
if (!ctx)
goto err_ctx;
--
2.7.4




Re: [PATCH net-next 2/5] virtio-net: pack headroom into ctx for mergeable buffer

2017-07-18 Thread Jason Wang



On 2017-07-19 02:59, Michael S. Tsirkin wrote:

On Mon, Jul 17, 2017 at 08:43:58PM +0800, Jason Wang wrote:

Pack headroom into ctx, then during XDP set, we could know the size of
headroom and copy if needed. This is required for avoiding reset on
XDP.

Not really when XDP is set - it's when buffers are used.


Of course :)



virtio-net: pack headroom into ctx for mergeable buffers

Pack headroom into ctx - this way when we get a buffer we can figure out
the actual headroom that was allocated for the buffer. Will be helpful
to optimize switching between XDP and non-XDP modes which have different
headroom requirements.


Thanks, let me use this as the commit log.




Signed-off-by: Jason Wang 
---
  drivers/net/virtio_net.c | 29 -
  1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1f8c15c..8fae9a8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -270,6 +270,23 @@ static void skb_xmit_done(struct virtqueue *vq)
netif_wake_subqueue(vi->dev, vq2txq(vq));
  }
  
+#define MRG_CTX_HEADER_SHIFT 22

+static void *mergeable_len_to_ctx(unsigned int truesize,
+ unsigned int headroom)
+{
+   return (void *)(unsigned long)((headroom << MRG_CTX_HEADER_SHIFT) | 
truesize);
+}
+
+static unsigned int mergeable_ctx_to_headroom(void *mrg_ctx)
+{
+   return (unsigned long)mrg_ctx >> MRG_CTX_HEADER_SHIFT;
+}
+
+static unsigned int mergeable_ctx_to_truesize(void *mrg_ctx)
+{
+   return (unsigned long)mrg_ctx & ((1 << MRG_CTX_HEADER_SHIFT) - 1);
+}
+
  /* Called from bottom half context */
  static struct sk_buff *page_to_skb(struct virtnet_info *vi,
   struct receive_queue *rq,
@@ -639,13 +656,14 @@ static struct sk_buff *receive_mergeable(struct 
net_device *dev,
}
rcu_read_unlock();
  
-	if (unlikely(len > (unsigned long)ctx)) {

+   truesize = mergeable_ctx_to_truesize(ctx);
+   if (unlikely(len > truesize)) {
pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
 dev->name, len, (unsigned long)ctx);
dev->stats.rx_length_errors++;
goto err_skb;
}
-   truesize = (unsigned long)ctx;
+
head_skb = page_to_skb(vi, rq, page, offset, len, truesize);
curr_skb = head_skb;
  
@@ -665,13 +683,14 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,

}
  
  		page = virt_to_head_page(buf);

-   if (unlikely(len > (unsigned long)ctx)) {
+
+   truesize = mergeable_ctx_to_truesize(ctx);
+   if (unlikely(len > truesize)) {
pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
 dev->name, len, (unsigned long)ctx);
dev->stats.rx_length_errors++;
goto err_skb;
}
-   truesize = (unsigned long)ctx;
  
  		num_skb_frags = skb_shinfo(curr_skb)->nr_frags;

if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
@@ -889,7 +908,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
  
  	buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset;

buf += headroom; /* advance address leaving hole at front of pkt */
-   ctx = (void *)(unsigned long)len;
+   ctx = mergeable_len_to_ctx(len, headroom);
get_page(alloc_frag->page);
alloc_frag->offset += len + headroom;
hole = alloc_frag->size - alloc_frag->offset;
--
2.7.4
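
The packing scheme introduced by mergeable_len_to_ctx() can be exercised on its own. Below is a user-space sketch under a few assumptions: the function and macro names are shortened stand-ins for the ones in the patch, uintptr_t replaces unsigned long, and the two asserts are an addition that makes the limits explicit (truesize must fit in the low 22 bits, and the headroom is kept below 1024 so the value also fits in a 32-bit pointer).

```c
#include <assert.h>
#include <stdint.h>

#define CTX_HEADROOM_SHIFT 22	/* same shift as MRG_CTX_HEADER_SHIFT */

/* Pack truesize into the low 22 bits and headroom into the bits
 * above them, then smuggle the result through a void pointer. */
static void *pack_ctx(unsigned int truesize, unsigned int headroom)
{
	assert(truesize < (1u << CTX_HEADROOM_SHIFT));
	assert(headroom < (1u << (32 - CTX_HEADROOM_SHIFT)));
	return (void *)(((uintptr_t)headroom << CTX_HEADROOM_SHIFT) |
			truesize);
}

/* Recover the headroom from the upper bits. */
static unsigned int ctx_headroom(void *ctx)
{
	return (unsigned int)((uintptr_t)ctx >> CTX_HEADROOM_SHIFT);
}

/* Recover the truesize from the low 22 bits. */
static unsigned int ctx_truesize(void *ctx)
{
	return (unsigned int)((uintptr_t)ctx &
			      ((1u << CTX_HEADROOM_SHIFT) - 1));
}
```

Packing a truesize of 1536 with a headroom of 256 and unpacking it again returns both values unchanged, which is the property the receive path relies on.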




[PATCH ethtool net] ethtool: fix the rx vs tx mixup in set channel message

2017-07-18 Thread Jakub Kicinski
When set channels (ethtool -L) doesn't modify any settings
a message is printed which contains the current parameters:

# ethtool -L em1
no channel parameters changed, aborting
current values: tx 4 rx 1 other 0 combined 0

or

# ethtool -L em1 rx 4
rx unmodified, ignoring
no channel parameters changed, aborting
current values: tx 4 rx 1 other 0 combined 0

In this message, however, rx and tx values are swapped,
which can be confirmed running get:

# ethtool -l em1
...
Current hardware settings:
RX:  4
TX:  1
Other:   0
Combined:0

Reorder the rx and tx names in the string thus keeping the order
in line with ethtool -l.

Signed-off-by: Jakub Kicinski 
---
 ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ethtool.c b/ethtool.c
index ad18704e7c5f..4fd340540303 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -1995,7 +1995,7 @@ static int do_schannels(struct cmd_context *ctx)
 
if (!changed) {
fprintf(stderr, "no channel parameters changed, aborting\n");
-   fprintf(stderr, "current values: tx %u rx %u other %u"
+   fprintf(stderr, "current values: rx %u tx %u other %u"
" combined %u\n", echannels.rx_count,
echannels.tx_count, echannels.other_count,
echannels.combined_count);
-- 
2.11.0
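
The bug is a mismatch between the labels in the format string and the order of the arguments. A small stand-alone sketch of the corrected message follows; struct channels and format_current_values() are hypothetical names that only mirror the four counters used above, and snprintf() into a buffer replaces fprintf() so the result can be checked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the four counters printed by the message. */
struct channels {
	unsigned int rx_count;
	unsigned int tx_count;
	unsigned int other_count;
	unsigned int combined_count;
};

/* Format the message with labels in the same order as the
 * arguments: rx first, then tx. Returns snprintf()'s result. */
static int format_current_values(char *buf, size_t size,
				 const struct channels *ch)
{
	return snprintf(buf, size,
			"current values: rx %u tx %u other %u combined %u\n",
			ch->rx_count, ch->tx_count, ch->other_count,
			ch->combined_count);
}
```

With rx_count 4 and tx_count 1, as in the em1 example, the line now reads "current values: rx 4 tx 1 other 0 combined 0", matching the output of ethtool -l.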



Re: [PATCH] rtlwifi: remove useless code

2017-07-18 Thread Larry Finger

On 07/18/2017 03:41 PM, Gustavo A. R. Silva wrote:

Remove useless local variables last_read_point and last_txw_point and
the code related.

Signed-off-by: Gustavo A. R. Silva 
---
  drivers/net/wireless/realtek/rtlwifi/rtl8192ee/trx.c | 6 --
  1 file changed, 6 deletions(-)


Acked-by: Larry Finger 

Thanks,

Larry


[net-next v3 2/5] ixgbe: Enable LASI interrupts for X552 devices

2017-07-18 Thread Jeff Kirsher
From: Tony Nguyen 

Enable LASI interrupts on X552 devices in order to receive notifications of
link configurations of the external PHY and support the configuration of
the internal iXFI link since iXFI does not support auto-negotiation.  This
is not required for X553 devices; add a check to avoid enabling LASI
interrupts for X553 devices.

Signed-off-by: Tony Nguyen 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 31 +++
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index 72d84a065e34..aa34e0b131bb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -2404,17 +2404,30 @@ static s32 ixgbe_enable_lasi_ext_t_x550em(struct 
ixgbe_hw *hw)
status = ixgbe_get_lasi_ext_t_x550em(hw, );
 
/* Enable link status change alarm */
-   status = hw->phy.ops.read_reg(hw, IXGBE_MDIO_PMA_TX_VEN_LASI_INT_MASK,
- MDIO_MMD_AN, &reg);
-   if (status)
-   return status;
 
-   reg |= IXGBE_MDIO_PMA_TX_VEN_LASI_INT_EN;
+   /* Enable the LASI interrupts on X552 devices to receive notifications
+* of the link configurations of the external PHY and correspondingly
+* support the configuration of the internal iXFI link, since iXFI does
+* not support auto-negotiation. This is not required for X553 devices
+* having KR support, which performs auto-negotiations and which is used
+* as the internal link to the external PHY. Hence adding a check here
+* to avoid enabling LASI interrupts for X553 devices.
+*/
+   if (hw->mac.type != ixgbe_mac_x550em_a) {
+   status = hw->phy.ops.read_reg(hw,
+   IXGBE_MDIO_PMA_TX_VEN_LASI_INT_MASK,
+   MDIO_MMD_AN, &reg);
+   if (status)
+   return status;
+
+   reg |= IXGBE_MDIO_PMA_TX_VEN_LASI_INT_EN;
 
-   status = hw->phy.ops.write_reg(hw, IXGBE_MDIO_PMA_TX_VEN_LASI_INT_MASK,
-  MDIO_MMD_AN, reg);
-   if (status)
-   return status;
+   status = hw->phy.ops.write_reg(hw,
+   IXGBE_MDIO_PMA_TX_VEN_LASI_INT_MASK,
+   MDIO_MMD_AN, reg);
+   if (status)
+   return status;
+   }
 
/* Enable high temperature failure and global fault alarms */
status = hw->phy.ops.read_reg(hw, IXGBE_MDIO_GLOBAL_INT_MASK,
-- 
2.13.2



[net-next v3 1/5] ixgbe: Ensure MAC filter was added before setting MACVLAN

2017-07-18 Thread Jeff Kirsher
From: Tony Nguyen 

This patch adds a check to ensure that adding the MAC filter was
successful before setting the MACVLAN.  If it was unsuccessful, propagate
the error.

Signed-off-by: Tony Nguyen 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 0760bd7eeb01..ca492876bd3d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -681,6 +681,7 @@ static int ixgbe_set_vf_macvlan(struct ixgbe_adapter 
*adapter,
 {
struct list_head *pos;
struct vf_macvlans *entry;
+   s32 retval = 0;
 
if (index <= 1) {
list_for_each(pos, &adapter->vf_mvs.l) {
@@ -721,14 +722,15 @@ static int ixgbe_set_vf_macvlan(struct ixgbe_adapter 
*adapter,
if (!entry || !entry->free)
return -ENOSPC;
 
-   entry->free = false;
-   entry->is_macvlan = true;
-   entry->vf = vf;
-   memcpy(entry->vf_macvlan, mac_addr, ETH_ALEN);
-
-   ixgbe_add_mac_filter(adapter, mac_addr, vf);
+   retval = ixgbe_add_mac_filter(adapter, mac_addr, vf);
+   if (retval >= 0) {
+   entry->free = false;
+   entry->is_macvlan = true;
+   entry->vf = vf;
+   memcpy(entry->vf_macvlan, mac_addr, ETH_ALEN);
+   }
 
-   return 0;
+   return retval;
 }
 
 static inline void ixgbe_vf_reset_event(struct ixgbe_adapter *adapter, u32 vf)
-- 
2.13.2
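
The pattern in this patch is "commit the bookkeeping only after the hardware operation succeeded". A user-space sketch of that pattern is shown below. Everything in it is hypothetical: struct filter_table with a fixed capacity stands in for the adapter's MAC filter table, add_mac_filter() is assumed to return a slot index or -ENOMEM, and struct macvlan_entry only mirrors the fields touched by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for the hardware MAC filter table. */
struct filter_table {
	int used;
	int capacity;
};

/* Hypothetical mirror of the fields the patch updates. */
struct macvlan_entry {
	int is_free;
	int is_macvlan;
	unsigned int vf;
	unsigned char mac[MAC_LEN];
};

/* Returns the slot index on success, -ENOMEM when the table is full. */
static int add_mac_filter(struct filter_table *t)
{
	if (t->used >= t->capacity)
		return -ENOMEM;
	return t->used++;
}

/* Claim the entry only if the filter was really added, and
 * propagate the result of the add instead of a blanket 0. */
static int set_macvlan(struct filter_table *t, struct macvlan_entry *e,
		       unsigned int vf, const unsigned char *mac)
{
	int ret;

	if (!e || !e->is_free)
		return -ENOSPC;

	ret = add_mac_filter(t);
	if (ret >= 0) {
		e->is_free = 0;
		e->is_macvlan = 1;
		e->vf = vf;
		memcpy(e->mac, mac, MAC_LEN);
	}
	return ret;
}
```

When the table is full, the entry stays free and the caller sees the error, instead of an entry that claims a filter the hardware never installed.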



[net-next v3 3/5] ixgbe: Update NW_MNG_IF_SEL support for X553

2017-07-18 Thread Jeff Kirsher
From: Tony Nguyen 

The MAC register NW_MNG_IF_SEL fields have been redefined for
X553. These changes impact the iXFI driver code flow. Since iXFI is
only supported in X552, add MAC checks for iXFI flows.

Signed-off-by: Tony Nguyen 
Signed-off-by: Paul Greenwalt 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h |  4 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c | 14 +++---
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 0f867dcda65f..96606e3eb965 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -386,7 +386,7 @@ u32 ixgbe_read_reg(struct ixgbe_hw *hw, u32 reg)
if (ixgbe_removed(reg_addr))
return IXGBE_FAILED_READ_REG;
if (unlikely(hw->phy.nw_mng_if_sel &
-IXGBE_NW_MNG_IF_SEL_ENABLE_10_100M)) {
+IXGBE_NW_MNG_IF_SEL_SGMII_ENABLE)) {
struct ixgbe_adapter *adapter;
int i;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index 9c2460c5ef1b..ffa0ee5cd0f5 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -3778,8 +3778,8 @@ struct ixgbe_info {
 #define IXGBE_NW_MNG_IF_SEL_PHY_SPEED_1G   BIT(19)
 #define IXGBE_NW_MNG_IF_SEL_PHY_SPEED_2_5G BIT(20)
 #define IXGBE_NW_MNG_IF_SEL_PHY_SPEED_10G  BIT(21)
-#define IXGBE_NW_MNG_IF_SEL_ENABLE_10_100M BIT(23)
-#define IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE   BIT(24)
+#define IXGBE_NW_MNG_IF_SEL_SGMII_ENABLE   BIT(25)
+#define IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE   BIT(24) /* X552 only */
 #define IXGBE_NW_MNG_IF_SEL_MDIO_PHY_ADD_SHIFT 3
 #define IXGBE_NW_MNG_IF_SEL_MDIO_PHY_ADD   \
(0x1F << IXGBE_NW_MNG_IF_SEL_MDIO_PHY_ADD_SHIFT)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index aa34e0b131bb..95adbda36235 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -1555,9 +1555,14 @@ static s32 ixgbe_restart_an_internal_phy_x550em(struct 
ixgbe_hw *hw)
  **/
 static s32 ixgbe_setup_ixfi_x550em(struct ixgbe_hw *hw, ixgbe_link_speed 
*speed)
 {
+   struct ixgbe_mac_info *mac = &hw->mac;
s32 status;
u32 reg_val;
 
+   /* iXFI is only supported with X552 */
+   if (mac->type != ixgbe_mac_X550EM_x)
+   return IXGBE_ERR_LINK_SETUP;
+
/* Disable AN and force speed to 10G Serial. */
status = ixgbe_read_iosf_sb_reg_x550(hw,
IXGBE_KRM_LINK_CTRL_1(hw->bus.lan_id),
@@ -1874,8 +1879,10 @@ static s32 ixgbe_setup_mac_link_t_X550em(struct ixgbe_hw 
*hw,
else
force_speed = IXGBE_LINK_SPEED_1GB_FULL;
 
-   /* If internal link mode is XFI, then setup XFI internal link. */
-   if (!(hw->phy.nw_mng_if_sel & IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE)) {
+   /* If X552 and internal link mode is XFI, then setup XFI internal link.
+*/
+   if (hw->mac.type == ixgbe_mac_X550EM_x &&
+   !(hw->phy.nw_mng_if_sel & IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE)) {
status = ixgbe_setup_ixfi_x550em(hw, &force_speed);
 
if (status)
@@ -2628,7 +2635,8 @@ static s32 ixgbe_setup_internal_phy_t_x550em(struct 
ixgbe_hw *hw)
if (hw->mac.ops.get_media_type(hw) != ixgbe_media_type_copper)
return IXGBE_ERR_CONFIG;
 
-   if (hw->phy.nw_mng_if_sel & IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE) {
+   if (!(hw->mac.type == ixgbe_mac_X550EM_x &&
+ !(hw->phy.nw_mng_if_sel & IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE))) {
speed = IXGBE_LINK_SPEED_10GB_FULL |
IXGBE_LINK_SPEED_1GB_FULL;
return ixgbe_setup_kr_speed_x550em(hw, speed);
-- 
2.13.2
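
The two conditions changed in ixgbe_x550.c are the same predicate used once directly and once negated. A stand-alone sketch of that predicate follows; enum mac_type, INT_PHY_MODE_BIT and uses_ixfi() are illustrative names, with bit 24 taken from the IXGBE_NW_MNG_IF_SEL_INT_PHY_MODE definition in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative MAC types: X550EM_x is X552, x550em_a is X553. */
enum mac_type {
	MAC_X550EM_X,
	MAC_X550EM_A,
};

/* BIT(24), meaningful on X552 only according to the patch. */
#define INT_PHY_MODE_BIT (UINT32_C(1) << 24)

/* The internal link is iXFI only on X552 with INT_PHY_MODE clear.
 * Every other combination takes the KR path. */
static bool uses_ixfi(enum mac_type type, uint32_t nw_mng_if_sel)
{
	return type == MAC_X550EM_X &&
	       !(nw_mng_if_sel & INT_PHY_MODE_BIT);
}
```

On X553 the bit is never consulted, so a redefined NW_MNG_IF_SEL layout cannot steer the driver into the iXFI flow by accident.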



[net-next v3 4/5] ixgbe: Do not support flow control autonegotiation for X553

2017-07-18 Thread Jeff Kirsher
From: Tony Nguyen 

Flow control autonegotiation is not supported for fiber on X553.  Add
device ID checks in ixgbe_device_supports_autoneg_fc() to return the
appropriate value.

Signed-off-by: Tony Nguyen 
Signed-off-by: Emil Tantilov 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index 4e35e7017f3d..40ae7db468ea 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -79,13 +79,22 @@ bool ixgbe_device_supports_autoneg_fc(struct ixgbe_hw *hw)
 
switch (hw->phy.media_type) {
case ixgbe_media_type_fiber:
-   hw->mac.ops.check_link(hw, &speed, &link_up, false);
-   /* if link is down, assume supported */
-   if (link_up)
-   supported = speed == IXGBE_LINK_SPEED_1GB_FULL ?
+   /* flow control autoneg black list */
+   switch (hw->device_id) {
+   case IXGBE_DEV_ID_X550EM_A_SFP:
+   case IXGBE_DEV_ID_X550EM_A_SFP_N:
+   supported = false;
+   break;
+   default:
+   hw->mac.ops.check_link(hw, &speed, &link_up, false);
+   /* if link is down, assume supported */
+   if (link_up)
+   supported = speed == IXGBE_LINK_SPEED_1GB_FULL ?
true : false;
-   else
-   supported = true;
+   else
+   supported = true;
+   }
+
break;
case ixgbe_media_type_backplane:
supported = true;
@@ -111,6 +120,10 @@ bool ixgbe_device_supports_autoneg_fc(struct ixgbe_hw *hw)
break;
}
 
+   if (!supported)
+   hw_dbg(hw, "Device %x does not support flow control autoneg\n",
+  hw->device_id);
+
return supported;
 }
 
-- 
2.13.2
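
The fiber branch after this patch is a deny list followed by the old link-speed rule. A compact sketch of that decision is given below. The enum values are placeholders rather than real PCI device IDs, and fiber_supports_fc_autoneg() is a hypothetical helper that takes the link state as arguments instead of querying the hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder identifiers, not real device ID values. */
enum dev_id {
	DEV_X550EM_A_SFP,
	DEV_X550EM_A_SFP_N,
	DEV_OTHER_FIBER,
};

enum link_speed {
	SPEED_1G,
	SPEED_10G,
};

/* Deny-listed devices never support flow control autoneg. For the
 * rest, a link that is down is assumed to support it, and a link
 * that is up supports it only at 1G. */
static bool fiber_supports_fc_autoneg(enum dev_id id, bool link_up,
				      enum link_speed speed)
{
	switch (id) {
	case DEV_X550EM_A_SFP:
	case DEV_X550EM_A_SFP_N:
		return false;
	default:
		return link_up ? speed == SPEED_1G : true;
	}
}
```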



[net-next v3 5/5] ixgbe: Disable flow control for XFI

2017-07-18 Thread Jeff Kirsher
From: Tony Nguyen 

Flow control autonegotiation is not supported for XFI.  Make sure that
ixgbe_device_supports_autoneg_fc() returns false and
hw->fc.disable_fc_autoneg is set to true to avoid running the fc_autoneg
function for that device.

Signed-off-by: Tony Nguyen 
Signed-off-by: Emil Tantilov 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c |  5 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c   | 57 ++---
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index 40ae7db468ea..2c19070d2a0b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -97,7 +97,10 @@ bool ixgbe_device_supports_autoneg_fc(struct ixgbe_hw *hw)
 
break;
case ixgbe_media_type_backplane:
-   supported = true;
+   if (hw->device_id == IXGBE_DEV_ID_X550EM_X_XFI)
+   supported = false;
+   else
+   supported = true;
break;
case ixgbe_media_type_copper:
/* only some copper devices support flow control autoneg */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
index 95adbda36235..19fbb2f28ea4 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c
@@ -2843,7 +2843,7 @@ static s32 ixgbe_setup_fc_x550em(struct ixgbe_hw *hw)
 {
bool pause, asm_dir;
u32 reg_val;
-   s32 rc;
+   s32 rc = 0;
 
/* Validate the requested mode */
if (hw->fc.strict_ieee && hw->fc.requested_mode == ixgbe_fc_rx_pause) {
@@ -2886,32 +2886,37 @@ static s32 ixgbe_setup_fc_x550em(struct ixgbe_hw *hw)
return IXGBE_ERR_CONFIG;
}
 
-   if (hw->device_id != IXGBE_DEV_ID_X550EM_X_KR &&
-   hw->device_id != IXGBE_DEV_ID_X550EM_A_KR &&
-   hw->device_id != IXGBE_DEV_ID_X550EM_A_KR_L)
-   return 0;
-
-   rc = hw->mac.ops.read_iosf_sb_reg(hw,
- IXGBE_KRM_AN_CNTL_1(hw->bus.lan_id),
- IXGBE_SB_IOSF_TARGET_KR_PHY,
- &reg_val);
-   if (rc)
-   return rc;
-
-   reg_val &= ~(IXGBE_KRM_AN_CNTL_1_SYM_PAUSE |
-IXGBE_KRM_AN_CNTL_1_ASM_PAUSE);
-   if (pause)
-   reg_val |= IXGBE_KRM_AN_CNTL_1_SYM_PAUSE;
-   if (asm_dir)
-   reg_val |= IXGBE_KRM_AN_CNTL_1_ASM_PAUSE;
-   rc = hw->mac.ops.write_iosf_sb_reg(hw,
-  IXGBE_KRM_AN_CNTL_1(hw->bus.lan_id),
-  IXGBE_SB_IOSF_TARGET_KR_PHY,
-  reg_val);
-
-   /* This device does not fully support AN. */
-   hw->fc.disable_fc_autoneg = true;
+   switch (hw->device_id) {
+   case IXGBE_DEV_ID_X550EM_X_KR:
+   case IXGBE_DEV_ID_X550EM_A_KR:
+   case IXGBE_DEV_ID_X550EM_A_KR_L:
+   rc = hw->mac.ops.read_iosf_sb_reg(hw,
+   IXGBE_KRM_AN_CNTL_1(hw->bus.lan_id),
+   IXGBE_SB_IOSF_TARGET_KR_PHY,
+   &reg_val);
+   if (rc)
+   return rc;
 
+   reg_val &= ~(IXGBE_KRM_AN_CNTL_1_SYM_PAUSE |
+IXGBE_KRM_AN_CNTL_1_ASM_PAUSE);
+   if (pause)
+   reg_val |= IXGBE_KRM_AN_CNTL_1_SYM_PAUSE;
+   if (asm_dir)
+   reg_val |= IXGBE_KRM_AN_CNTL_1_ASM_PAUSE;
+   rc = hw->mac.ops.write_iosf_sb_reg(hw,
+   IXGBE_KRM_AN_CNTL_1(hw->bus.lan_id),
+   IXGBE_SB_IOSF_TARGET_KR_PHY,
+   reg_val);
+
+   /* This device does not fully support AN. */
+   hw->fc.disable_fc_autoneg = true;
+   break;
+   case IXGBE_DEV_ID_X550EM_X_XFI:
+   hw->fc.disable_fc_autoneg = true;
+   break;
+   default:
+   break;
+   }
return rc;
 }
 
-- 
2.13.2



[net-next v3 0/5][pull request] 10GbE Intel Wired LAN Driver Updates 2017-07-18

2017-07-18 Thread Jeff Kirsher
This series contains updates to ixgbe only.

Tony provides all of the changes in the series, starting with adding a
check to ensure that adding a MAC filter was successful, before setting the
MACVLAN.  In order to receive notifications of link configurations of the
external PHY and support the configuration of the internal iXFI link on
X552 devices, Tony enables LASI interrupts.  He updates the iXFI driver code
flow by adding MAC checks for iXFI flows, since the NW_MNG_IF_SEL MAC register
fields have been redefined for X553 devices.  Finally, he adds checks for
flow control autonegotiation, since it is not supported for X553 fiber and
XFI devices.

v2: removed unnecessary parens noticed by David Miller in patch 6 of the
series.
v3: dropped patch 6 of the original series, while we work out a more
generic solution for malicious driver detection (MDD) support.

The following are changes since commit 8c5e9fb8ac8f60253cd9589b61403d616dbdaf69:
  Merge branch 'net-attribute_group-const'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 10GbE

Tony Nguyen (5):
  ixgbe: Ensure MAC filter was added before setting MACVLAN
  ixgbe: Enable LASI interrupts for X552 devices
  ixgbe: Update NW_MNG_IF_SEL support for X553
  ixgbe: Do not support flow control autonegotiation for X553
  ixgbe: Disable flow control for XFI

 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c |  30 +--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c   |   2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c  |  16 ++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h   |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_x550.c   | 102 +++-
 5 files changed, 99 insertions(+), 55 deletions(-)

-- 
2.13.2



Re: [PATCH iproute2 net-next] bridge: this patch adds json support for bridge mdb show

2017-07-18 Thread Stephen Hemminger
On Fri,  7 Jul 2017 15:24:16 -0700
Roopa Prabhu  wrote:

> From: Nikhil Gajendrakumar 
> 
> This patch adds json output to bridge mdb show
> 
> Normal Output:
> $ bridge -d -s mdb show
> dev br0 port swp3 grp 239.0.0.1 temp  vid 128 172.26
> dev br0 port swp3 grp 239.0.0.1 temp  vid 64 172.26
> dev br0 port swp2 grp 239.0.0.2 temp  vid 1024 172.26
> dev br0 port swp2 grp 239.0.0.2 temp  vid 256 172.26
> dev br0 port swp2 grp 239.0.0.2 temp  vid 1 172.26
> dev br0 port swp3 grp 239.0.0.1 temp  vid 1 172.26
> router ports on br0: swp40.00 permanent
> router ports on br0: swp50.00 permanent
> 
> Json Output:
> $ bridge -d -s -j mdb show
> {
> "mdb": [{
> "dev": "br0",
> "port": "swp3",
> "grp": "239.0.0.1",
> "state": "temp",
> "vid": 128,
> "timer": " 166.74"
> },{
> "dev": "br0",
> "port": "swp3",
> "grp": "239.0.0.1",
> "state": "temp",
> "vid": 64,
> "timer": " 166.74"
> },{
> "dev": "br0",
> "port": "swp2",
> "grp": "239.0.0.2",
> "state": "temp",
> "vid": 1024,
> "timer": " 166.74"
> },{
> "dev": "br0",
> "port": "swp2",
> "grp": "239.0.0.2",
> "state": "temp",
> "vid": 256,
> "timer": " 166.74"
> },{
> "dev": "br0",
> "port": "swp2",
> "grp": "239.0.0.2",
> "state": "temp",
> "vid": 1,
> "timer": " 166.74"
> },{
> "dev": "br0",
> "port": "swp3",
> "grp": "239.0.0.1",
> "state": "temp",
> "vid": 1,
> "timer": " 166.74"
> }
> ],
> "router": {
> "br0": [{
> "port": "swp4",
> "timer": "   0.00",
> "type": "permanent"
> },{
> "port": "swp5",
> "timer": "   0.00",
> "type": "permanent"
> }
> ]
> }
> }
> 
> Signed-off-by: Nikhil Gajendrakumar 
> Signed-off-by: Roopa Prabhu 

Applied. You should also update the usage message and man page.



Re: [PATCH RFC, iproute2] tc/mqprio: Add support to configure bandwidth rate limit through mqprio

2017-07-18 Thread Stephen Hemminger
On Fri, 14 Jul 2017 18:26:13 -0700
Amritha Nambiar  wrote:

> Support bandwidth rate limit information for a traffic
> class in addition to the number of TCs and associated
> queue configuration data. This is supported in the new
> hardware offload mode in mqprio by setting the value of
> 'hw' option to 2. This new hardware offload mode in mqprio
> makes full use of the mqprio options, the TCs, the
> queue configurations and the bandwidth rates for the TCs.
> 
> # tc qdisc add dev eth0 root mqprio num_tc 2  map 0 0 0 0 1 1 1 1\
>   queues 4@0 4@4 min_rate 0Mbit 0Mbit max_rate 55Mbit 60Mbit hw 2
> 
> # tc qdisc show dev eth0
> 
> qdisc mqprio 804a: root  tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
>  queues:(0:3) (4:7)
>  min rates:0bit 0bit
>  max rates:55Mbit 60Mbit
> 
> Signed-off-by: Amritha Nambiar 

As Jamal said, the syntax is awkward.
Also, for iproute2 commands the output format should match the input format.




Re: [PATCH iproute2] tc: fix typo in manpage

2017-07-18 Thread Stephen Hemminger
On Fri,  7 Jul 2017 15:08:33 +0200
Matteo Croce  wrote:

> Fix a typo in the 'tc' manpage and reword some sentences.
> 
> Signed-off-by: Matteo Croce 

Applied, thanks.


Re: [PATCH iproute2 -master 0/3] BPF updates

2017-07-18 Thread Stephen Hemminger
On Mon, 17 Jul 2017 17:18:49 +0200
Daniel Borkmann  wrote:

> Couple of misc updates related to BPF. First removes a BPF sample
> that is long legacy (pre BPF fs times), then we add support for the
> loader to be able to take care of map in map, and last but not least
> we dump id and whether prog was jited in tc filter show for cls/act
> BPF programs.
> 
> The set is intended for -master branch _after_ the -net-next branch
> got merged into -master branch.
> 
> Thanks!
> 
> Daniel Borkmann (3):
>   bpf: remove obsolete samples
>   bpf: support loading map in map from obj
>   bpf: dump id/jited info for cls/act programs
> 
>  examples/bpf/bpf_agent.c  | 258 --
>  examples/bpf/bpf_map_in_map.c |  56 +
>  examples/bpf/bpf_prog.c   | 501 
> --
>  examples/bpf/bpf_shared.h |  22 --
>  examples/bpf/bpf_sys.h|  23 --
>  include/bpf_elf.h |   2 +
>  include/bpf_util.h|   2 +
>  lib/bpf.c | 205 -
>  tc/f_bpf.c|   3 +
>  tc/m_bpf.c|   3 +
>  10 files changed, 261 insertions(+), 814 deletions(-)
>  delete mode 100644 examples/bpf/bpf_agent.c
>  create mode 100644 examples/bpf/bpf_map_in_map.c
>  delete mode 100644 examples/bpf/bpf_prog.c
>  delete mode 100644 examples/bpf/bpf_shared.h
>  delete mode 100644 examples/bpf/bpf_sys.h
> 

Applied.


[PATCH v2 net-next] net: systemport: Support 64bit statistics

2017-07-18 Thread Jianming.qiao
Signed-off-by: Jianming.qiao 
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 52 +++---
 drivers/net/ethernet/broadcom/bcmsysport.h |  9 --
 2 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c 
b/drivers/net/ethernet/broadcom/bcmsysport.c
index 5274501..56f8951 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -662,6 +662,7 @@ static int bcm_sysport_alloc_rx_bufs(struct 
bcm_sysport_priv *priv)
 static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
unsigned int budget)
 {
+   struct bcm_sysport_stats *stats64 = &priv->stats64;
struct net_device *ndev = priv->netdev;
unsigned int processed = 0, to_process;
struct bcm_sysport_cb *cb;
@@ -765,6 +766,10 @@ static unsigned int bcm_sysport_desc_rx(struct 
bcm_sysport_priv *priv,
skb->protocol = eth_type_trans(skb, ndev);
ndev->stats.rx_packets++;
ndev->stats.rx_bytes += len;
+   u64_stats_update_begin(>syncp);
+   stats64->rx_packets++;
+   stats64->rx_bytes += len;
+   u64_stats_update_end(>syncp);
 
napi_gro_receive(>napi, skb);
 next:
@@ -784,24 +789,31 @@ static void bcm_sysport_tx_reclaim_one(struct 
bcm_sysport_tx_ring *ring,
   unsigned int *pkts_compl)
 {
struct bcm_sysport_priv *priv = ring->priv;
+   struct bcm_sysport_stats *stats64 = &priv->stats64;
struct device *kdev = >pdev->dev;
+   unsigned int len = 0;
 
if (cb->skb) {
-   ring->bytes += cb->skb->len;
-   *bytes_compl += cb->skb->len;
+   len = cb->skb->len;
+   *bytes_compl += len;
dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr),
 dma_unmap_len(cb, dma_len),
 DMA_TO_DEVICE);
-   ring->packets++;
(*pkts_compl)++;
bcm_sysport_free_cb(cb);
/* SKB fragment */
} else if (dma_unmap_addr(cb, dma_addr)) {
-   ring->bytes += dma_unmap_len(cb, dma_len);
+   len = dma_unmap_len(cb, dma_len);
dma_unmap_page(kdev, dma_unmap_addr(cb, dma_addr),
   dma_unmap_len(cb, dma_len), DMA_TO_DEVICE);
dma_unmap_addr_set(cb, dma_addr, 0);
}
+
+   u64_stats_update_begin(>syncp);
+   ring->bytes += len;
+   if (cb->skb)
+   ring->packets++;
+   u64_stats_update_end(>syncp);
 }
 
 /* Reclaim queued SKBs for transmission completion, lockless version */
@@ -1923,6 +1935,37 @@ static int bcm_sysport_stop(struct net_device *dev)
return 0;
 }
 
+static void bcm_sysport_get_stats64(struct net_device *dev,
+   struct rtnl_link_stats64 *stats)
+{
+   struct bcm_sysport_priv *priv = netdev_priv(dev);
+   struct bcm_sysport_stats *stats64 = &priv->stats64;
+   struct bcm_sysport_tx_ring *ring;
+   u64 tx_packets = 0, tx_bytes = 0;
+   unsigned int start;
+   unsigned int q;
+
+   netdev_stats_to_stats64(stats, >stats);
+
+   for (q = 0; q < dev->num_tx_queues; q++) {
+   ring = >tx_rings[q];
+   do {
+   start = u64_stats_fetch_begin_irq(>syncp);
+   tx_bytes += ring->bytes;
+   tx_packets += ring->packets;
+   } while (u64_stats_fetch_retry_irq(>syncp, start));
+   }
+
+   stats->tx_packets = tx_packets;
+   stats->tx_bytes = tx_bytes;
+
+   do {
+   start = u64_stats_fetch_begin_irq(>syncp);
+   stats->rx_packets = stats64->rx_packets;
+   stats->rx_bytes = stats64->rx_bytes;
+   } while (u64_stats_fetch_retry_irq(>syncp, start));
+}
+
 static const struct ethtool_ops bcm_sysport_ethtool_ops = {
.get_drvinfo= bcm_sysport_get_drvinfo,
.get_msglevel   = bcm_sysport_get_msglvl,
@@ -1951,6 +1994,7 @@ static int bcm_sysport_stop(struct net_device *dev)
.ndo_poll_controller= bcm_sysport_poll_controller,
 #endif
.ndo_get_stats  = bcm_sysport_get_nstats,
+   .ndo_get_stats64= bcm_sysport_get_stats64,
 };
 
 #define REV_FMT"v%2x.%02x"
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h 
b/drivers/net/ethernet/broadcom/bcmsysport.h
index 77a51c1..c03a176 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.h
+++ b/drivers/net/ethernet/broadcom/bcmsysport.h
@@ -657,6 +657,9 @@ struct bcm_sysport_stats {
enum bcm_sysport_stat_type type;
/* reg offset from UMAC base for misc counters */
u16 reg_offset;
+   u64 rx_packets;
+   u64 rx_bytes;
+   struct u64_stats_sync   syncp;
 };
 
 

Re: [PATCH iproute2 net-next] iproute: extend route get for mpls routes

2017-07-18 Thread Stephen Hemminger
On Fri,  7 Jul 2017 15:08:11 -0700
Roopa Prabhu  wrote:

> From: Roopa Prabhu 
> 
> This patch extends route get to support mpls specific
> route attributes like RTA_NEWDST.
> 
> Input:
> RTA_DST - input label
> RTA_NEWDST - labels in packet for multipath selection
> 
> By default the getroute handler returns matched
> nexthop label, via and oif
> 
> With fibmatch keyword (RTM_F_FIB_MATCH flag), full matched
> route is returned.
> 
> example:
> $ip -f mpls route show
> 101
> nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
> nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
> 201
> nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
> nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12
> 
> $ip -f mpls route get 103
> RTNETLINK answers: Network is unreachable
> 
> $ip -f mpls route get 101
> 101 as to 102/103 via inet 172.16.2.2 dev virt1-2
> 
> $ip -f mpls route get as to 302/303 101
> 101 as to 302/303 via inet 172.16.12.2 dev virt1-12
> 
> $ip -f mpls route get fibmatch 103
> RTNETLINK answers: Network is unreachable
> 
> $ip -f mpls route get fibmatch 101
> 101
> nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
> nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
> 
> Signed-off-by: Roopa Prabhu 

Applied, thanks.



Re: [PATCH] netns: avoid directory traversal (was: ip netns: Make sure netns name is sane)

2017-07-18 Thread Stephen Hemminger
On Mon, 10 Jul 2017 14:08:31 +0200
Matteo Croce  wrote:

> Hi Phil,
> 
> I noticed that your patch still leaves one scenario uncovered: the one
> where the namespace name is "." or "..".
> Calling 'ip netns del ..' will remove /var/run, which is a symlink to /run
> on most systems, causing some daemons, e.g. dbus, to fail.
> 
> ip netns doesn't validate input, allowing creation and deletion of files
> relative to /var/run/netns.
> This patch denies creation or deletion of namespaces with names containing
> "/" or matching exactly "." or "..".
> ---
>  ip/ipnetns.c | 10 ++
>  1 file changed, 10 insertions(+)
> 

The patch itself is good, but the commit message needs fixing.
Please rewrite it to describe the problem, and add signed-off-by


[git][98ed5bb] rvbd/rbt-kernel : bnx2-fix

2017-07-18 Thread steven.la

@new_changed_in []


Repository: g...@gitlab.lab.nbttech.com:rvbd/rbt-kernel.git
Branch: bnx2-fix
Author: Steven La 
Date: 2017-07-18T16:34:31-07:00
New Revision: 98ed5bbc446dca588ab8a1a6edbfc870dc9d6933


Log:
Apply the following patches from upstream and port the extra skbuff
helper routines used by these patches.

commit b7b6a688d217936459ff5cf1087b2361db952509
Author: Ian Campbell 
Date:   Wed Aug 24 22:28:12 2011 +

bnx2: convert to SKB paged frag API.

Signed-off-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Cc: Michael Chan 
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller 

commit a1f4e8bcbccf50cf1894c263af4d677d4f566533
Author: Eric Dumazet 
Date:   Thu Oct 13 07:50:19 2011 +

bnx2: fix skb truesize underestimation

bnx2 allocates a full page per fragment. We must account PAGE_SIZE
increments on skb->truesize, not the actual frag length.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 

commit 9e903e085262ffbf1fc44a17ac06058aca03524a
Author: Eric Dumazet 
Date:   Tue Oct 18 21:00:24 2011 +

net: add skb frag size accessors

To ease skb->truesize sanitization, it's better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 

Root Issue Details:
For bug 285465, there are some issues with handling paged data.

Fix Summary Details:
Apply the above three patches taken from upstream. Since RiOS is based on
the Red Hat 6.1 release, these patches need additional skbuff helper
routines, so port those as well.

Testing:
Run stress test on the interface that uses this driver.

Fix Complete: # Yes
Risk of Fix: # Low

BugID: 285465
DesignID:
ReviewID:
CC:
Reviewed-By: akepner
Approved-By:


Open this commit in your browser:
https://gitlab.lab.nbttech.com/rvbd/rbt-kernel/commit/98ed5bbc446dca588ab8a1a6edbfc870dc9d6933
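
The truesize fix quoted in the log above comes down to simple accounting,
which the following userspace sketch tries to illustrate. It is not bnx2
code: the function names and the 4096-byte SKETCH_PAGE_SIZE are assumptions
made only for this example.

```c
#include <stddef.h>

/* Assumed page size for the sketch; real systems may differ. */
#define SKETCH_PAGE_SIZE 4096u

/* Underestimating variant: charge only the bytes actually used. */
size_t truesize_by_frag_len(const size_t *frag_len, size_t nfrags)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nfrags; i++)
		total += frag_len[i];
	return total;
}

/* Variant described in the log: one full page per fragment. */
size_t truesize_by_page(size_t nfrags)
{
	return nfrags * SKETCH_PAGE_SIZE;
}
```

For two fragments of 100 and 1500 bytes, the first function charges 1600
bytes while the second charges two full pages, which matches the memory
that is really held when each fragment owns a whole page.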


[PATCH net-next v2 1/1] geneve: add rtnl changelink support

2017-07-18 Thread Girish Moodalbail
This patch adds changelink rtnl operation support for geneve devices
and the code changes involve:

  - add geneve_quiesce() which quiesces the geneve device data path
for both TX and RX. This lets us perform the changelink operation
atomically w.r.t data path. Also add geneve_unquiesce() to
reverse the operation of geneve_quiesce().

  - refactor geneve_newlink into geneve_nl2info to be used by both
geneve_newlink and geneve_changelink

  - geneve_nl2info takes a changelink boolean argument to isolate
changelink checks.

  - Allow changing only a few attributes (ttl, tos, and remote tunnel
endpoint IP address (within the same address family)):
- return -EOPNOTSUPP for attributes that cannot be changed for
  now. Incremental patches can make the non-supported one
  available in the future if needed.

Signed-off-by: Girish Moodalbail 
---
v0 -> v1:
   - added geneve_quiesce() and geneve_unquiesce() functions to
 perform the changelink operation atomically w.r.t data path
---
 drivers/net/geneve.c | 192 +--
 1 file changed, 157 insertions(+), 35 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index de8156c..829f541 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -827,6 +827,9 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct 
net_device *dev,
__be16 df;
int err;
 
+   if (!gs4)
+   return -EIO;
+
rt = geneve_get_v4_rt(skb, dev, , info);
if (IS_ERR(rt))
return PTR_ERR(rt);
@@ -866,6 +869,9 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct 
net_device *dev,
__be16 sport;
int err;
 
+   if (!gs6)
+   return -EIO;
+
dst = geneve_get_v6_dst(skb, dev, , info);
if (IS_ERR(dst))
return PTR_ERR(dst);
@@ -1140,6 +1146,15 @@ static bool is_tnl_info_zero(const struct ip_tunnel_info 
*info)
return true;
 }
 
+static inline bool geneve_dst_addr_equal(struct ip_tunnel_info *a,
+struct ip_tunnel_info *b)
+{
+   if (ip_tunnel_info_af(a) == AF_INET)
+   return a->key.u.ipv4.dst == b->key.u.ipv4.dst;
+   else
+   return ipv6_addr_equal(>key.u.ipv6.dst, >key.u.ipv6.dst);
+}
+
 static int geneve_configure(struct net *net, struct net_device *dev,
const struct ip_tunnel_info *info,
bool metadata, bool ipv6_rx_csum)
@@ -1197,24 +1212,22 @@ static void init_tnl_info(struct ip_tunnel_info *info, 
__u16 dst_port)
info->key.tp_dst = htons(dst_port);
 }
 
-static int geneve_newlink(struct net *net, struct net_device *dev,
- struct nlattr *tb[], struct nlattr *data[],
- struct netlink_ext_ack *extack)
+static int geneve_nl2info(struct net_device *dev, struct nlattr *tb[],
+ struct nlattr *data[], struct ip_tunnel_info *info,
+ bool *metadata, bool *use_udp6_rx_checksums,
+ bool changelink)
 {
-   bool use_udp6_rx_checksums = false;
-   struct ip_tunnel_info info;
-   bool metadata = false;
-
-   init_tnl_info(, GENEVE_UDP_PORT);
-
if (data[IFLA_GENEVE_REMOTE] && data[IFLA_GENEVE_REMOTE6])
return -EINVAL;
 
if (data[IFLA_GENEVE_REMOTE]) {
-   info.key.u.ipv4.dst =
+   if (changelink && (ip_tunnel_info_af(info) == AF_INET6))
+   return -EOPNOTSUPP;
+
+   info->key.u.ipv4.dst =
nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
 
-   if (IN_MULTICAST(ntohl(info.key.u.ipv4.dst))) {
+   if (IN_MULTICAST(ntohl(info->key.u.ipv4.dst))) {
netdev_dbg(dev, "multicast remote is unsupported\n");
return -EINVAL;
}
@@ -1222,21 +1235,24 @@ static int geneve_newlink(struct net *net, struct 
net_device *dev,
 
if (data[IFLA_GENEVE_REMOTE6]) {
  #if IS_ENABLED(CONFIG_IPV6)
-   info.mode = IP_TUNNEL_INFO_IPV6;
-   info.key.u.ipv6.dst =
+   if (changelink && (ip_tunnel_info_af(info) == AF_INET))
+   return -EOPNOTSUPP;
+
+   info->mode = IP_TUNNEL_INFO_IPV6;
+   info->key.u.ipv6.dst =
nla_get_in6_addr(data[IFLA_GENEVE_REMOTE6]);
 
-   if (ipv6_addr_type() &
+   if (ipv6_addr_type(>key.u.ipv6.dst) &
IPV6_ADDR_LINKLOCAL) {
netdev_dbg(dev, "link-local remote is unsupported\n");
return -EINVAL;
}
-   if (ipv6_addr_is_multicast()) {
+   if (ipv6_addr_is_multicast(>key.u.ipv6.dst)) {
netdev_dbg(dev, "multicast remote is 

Re: commit 16ecba59 breaks 82574L under heavy load.

2017-07-18 Thread Benjamin Poirier
On 2017/07/18 10:21, Lennart Sorensen wrote:
> Commit 16ecba59bc333d6282ee057fb02339f77a880beb has apparently broken
> at least the 82574L under heavy load (as in load heavy enough to cause
> packet drops).  In this case, when running in MSI-X mode, the Other
> Causes interrupt fires about 3000 times per second, but not due to link
> state changes.  Unfortunately this commit changed the driver to assume
> that the Other Causes interrupt can only mean link state change and

Thanks for the detailed analysis.

Referring to the original discussion around this patch series, it seemed like
the IMS bit for a condition had to be set for the Other interrupt to be raised
for that condition.

https://lkml.org/lkml/2015/11/4/683

In this case however, E1000_ICR_RXT0 is not set in IMS so Other shouldn't be
raised for Receiver Overrun. Apparently something is going on...

I can reproduce the spurious Other interrupts with a simple mdelay().
With the debugging patch at the end of the mail I see stuff like this
while blasting with UDP frames:
  -0 [086] d.h1 15338.742675: e1000_msix_other: got Other interrupt, count 15127
   <...>-54504 [086] d.h. 15338.742724: e1000_msix_other: got Other interrupt, count 1
   <...>-54504 [086] d.h. 15338.742774: e1000_msix_other: got Other interrupt, count 1
   <...>-54504 [086] d.h. 15338.742824: e1000_msix_other: got Other interrupt, count 1
  -0 [086] d.h1 15340.745123: e1000_msix_other: got Other interrupt, count 27584
   <...>-54504 [086] d.h. 15340.745172: e1000_msix_other: got Other interrupt, count 1
   <...>-54504 [086] d.h. 15340.745222: e1000_msix_other: got Other interrupt, count 1
   <...>-54504 [086] d.h. 15340.745272: e1000_msix_other: got Other interrupt, count 1

> hence sets the flag that (unfortunately) means both link is down and link
> state should be checked.  Since this now happens 3000 times per second,
> the chances of it happening while the watchdog_task is checking the link
> state becomes pretty high, and if it does happen to coincide, then the
> watchdog_task will reset the adapter, which causes a real loss of link.

Through which path does watchdog_task reset the adapter? I didn't
reproduce that.

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index b3679728caac..689ad76d0d12 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -46,6 +46,8 @@
 
 #include "e1000.h"
 
+DEFINE_RATELIMIT_STATE(e1000e_ratelimit_state, 2 * HZ, 4);
+
 #define DRV_EXTRAVERSION "-k"
 
 #define DRV_VERSION "3.2.6" DRV_EXTRAVERSION
@@ -937,6 +939,8 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, 
int *work_done,
bool cleaned = false;
unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 
+   mdelay(10);
+
i = rx_ring->next_to_clean;
rx_desc = E1000_RX_DESC_EXT(*rx_ring, i);
staterr = le32_to_cpu(rx_desc->wb.upper.status_error);
@@ -1067,6 +1071,13 @@ static bool e1000_clean_rx_irq(struct e1000_ring 
*rx_ring, int *work_done,
 
adapter->total_rx_bytes += total_rx_bytes;
adapter->total_rx_packets += total_rx_packets;
+
+   if (__ratelimit(_ratelimit_state)) {
+   static unsigned int max;
+   max = max(max, total_rx_packets);
+   trace_printk("received %u max %u\n", total_rx_packets, max);
+   }
+
return cleaned;
 }
 
@@ -1904,9 +1915,16 @@ static irqreturn_t e1000_msix_other(int __always_unused 
irq, void *data)
struct net_device *netdev = data;
struct e1000_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = >hw;
+   static unsigned int count;
 
hw->mac.get_link_status = true;
 
+   count++;
+   if (__ratelimit(_ratelimit_state)) {
+   trace_printk("got Other interrupt, count %u\n", count);
+   count = 0;
+   }
+
/* guard against interrupt when we're going down */
if (!test_bit(__E1000_DOWN, >state)) {
mod_timer(>watchdog_timer, jiffies + 1);
@@ -7121,7 +7139,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
netdev->netdev_ops = _netdev_ops;
e1000e_set_ethtool_ops(netdev);
netdev->watchdog_timeo = 5 * HZ;
-   netif_napi_add(netdev, >napi, e1000e_poll, 64);
+   netif_napi_add(netdev, >napi, e1000e_poll, 500);
strlcpy(netdev->name, pci_name(pdev), sizeof(netdev->name));
 
netdev->mem_start = mmio_start;
@@ -7327,6 +7345,8 @@ static int e1000_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
if (err)
goto err_register;
 
+   ratelimit_set_flags(_ratelimit_state, RATELIMIT_MSG_ON_RELEASE);
+
/* carrier off reporting is important to ethtool even BEFORE open */
netif_carrier_off(netdev);
 


[PATCH net-next 1/3] bluetooth: 6lowpan dev_close never returns error

2017-07-18 Thread Stephen Hemminger
The function dev_close in the current kernel will never return an
error. Later changes will make it void.

Signed-off-by: Stephen Hemminger 
---
 net/bluetooth/6lowpan.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index ab3b654b05cc..e542b8959d88 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -621,9 +621,7 @@ static void ifdown(struct net_device *netdev)
int err;
 
rtnl_lock();
-   err = dev_close(netdev);
-   if (err < 0)
-   BT_INFO("iface %s cannot be closed (%d)", netdev->name, err);
+   dev_close(netdev);
rtnl_unlock();
 }
 
-- 
2.11.0



[PATCH net-next 3/3] net: make dev_close and related functions void

2017-07-18 Thread Stephen Hemminger
There is no useful return value from dev_close. All paths return 0.
Change dev_close and helper functions to void.

Signed-off-by: Stephen Hemminger 
---
 include/linux/netdevice.h |  4 ++--
 net/core/dev.c| 26 +++---
 2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c60351b84323..614642eb7eb7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2432,8 +2432,8 @@ struct net_device *dev_get_by_name_rcu(struct net *net, 
const char *name);
 struct net_device *__dev_get_by_name(struct net *net, const char *name);
 int dev_alloc_name(struct net_device *dev, const char *name);
 int dev_open(struct net_device *dev);
-int dev_close(struct net_device *dev);
-int dev_close_many(struct list_head *head, bool unlink);
+void dev_close(struct net_device *dev);
+void dev_close_many(struct list_head *head, bool unlink);
 void dev_disable_lro(struct net_device *dev);
 int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff 
*newskb);
 int dev_queue_xmit(struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 467420eda02e..d1b9c9b6c970 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1413,7 +1413,7 @@ int dev_open(struct net_device *dev)
 }
 EXPORT_SYMBOL(dev_open);
 
-static int __dev_close_many(struct list_head *head)
+static void __dev_close_many(struct list_head *head)
 {
struct net_device *dev;
 
@@ -1455,23 +1455,18 @@ static int __dev_close_many(struct list_head *head)
dev->flags &= ~IFF_UP;
netpoll_poll_enable(dev);
}
-
-   return 0;
 }
 
-static int __dev_close(struct net_device *dev)
+static void __dev_close(struct net_device *dev)
 {
-   int retval;
LIST_HEAD(single);
 
list_add(>close_list, );
-   retval = __dev_close_many();
+   __dev_close_many();
list_del();
-
-   return retval;
 }
 
-int dev_close_many(struct list_head *head, bool unlink)
+void dev_close_many(struct list_head *head, bool unlink)
 {
struct net_device *dev, *tmp;
 
@@ -1488,8 +1483,6 @@ int dev_close_many(struct list_head *head, bool unlink)
if (unlink)
list_del_init(>close_list);
}
-
-   return 0;
 }
 EXPORT_SYMBOL(dev_close_many);
 
@@ -1502,7 +1495,7 @@ EXPORT_SYMBOL(dev_close_many);
  * is then deactivated and finally a %NETDEV_DOWN is sent to the notifier
  * chain.
  */
-int dev_close(struct net_device *dev)
+void dev_close(struct net_device *dev)
 {
if (dev->flags & IFF_UP) {
LIST_HEAD(single);
@@ -1511,7 +1504,6 @@ int dev_close(struct net_device *dev)
dev_close_many(, true);
list_del();
}
-   return 0;
 }
 EXPORT_SYMBOL(dev_close);
 
@@ -6725,8 +6717,12 @@ int __dev_change_flags(struct net_device *dev, unsigned 
int flags)
 */
 
ret = 0;
-   if ((old_flags ^ flags) & IFF_UP)
-   ret = ((old_flags & IFF_UP) ? __dev_close : __dev_open)(dev);
+   if ((old_flags ^ flags) & IFF_UP) {
+   if (old_flags & IFF_UP)
+   __dev_close(dev);
+   else
+   ret = __dev_open(dev);
+   }
 
if ((flags ^ dev->gflags) & IFF_PROMISC) {
int inc = (flags & IFF_PROMISC) ? 1 : -1;
-- 
2.11.0
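
A note on the last hunk above: once the close helper returns void while the
open helper still returns int, the two functions no longer have compatible
types, so selecting one with ?: and assigning the result to ret is no longer
valid C. The userspace sketch below mirrors the if/else form from the patch.
All fake_* names and FAKE_IFF_UP are hypothetical stand-ins, not kernel
symbols.

```c
/* Hypothetical stand-ins for struct net_device and IFF_UP. */
struct fake_dev {
	unsigned int flags;
};

#define FAKE_IFF_UP 0x1u

/* Like the new __dev_close(): nothing useful to return. */
void fake_close(struct fake_dev *dev)
{
	dev->flags &= ~FAKE_IFF_UP;
}

/* Like __dev_open(): can still report an error (always 0 here). */
int fake_open(struct fake_dev *dev)
{
	dev->flags |= FAKE_IFF_UP;
	return 0;
}

/*
 * Mirrors the IFF_UP part of __dev_change_flags(). The older form,
 *   ret = ((old_flags & UP) ? close_fn : open_fn)(dev);
 * needs both functions to share one type, which no longer holds.
 */
int fake_change_up(struct fake_dev *dev, unsigned int flags)
{
	unsigned int old_flags = dev->flags;
	int ret = 0;

	if ((old_flags ^ flags) & FAKE_IFF_UP) {
		if (old_flags & FAKE_IFF_UP)
			fake_close(dev);
		else
			ret = fake_open(dev);
	}
	return ret;
}
```

Only the open path can set ret in this form, which is the behaviour the
patch description relies on when it says all close paths return 0.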



[PATCH net-next 2/3] hns: remove useless void cast

2017-07-18 Thread Stephen Hemminger
There is no need to cast away return value of dev_close.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c 
b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
index a8db27e86a11..78cb20c67aa6 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
@@ -595,7 +595,7 @@ static void hns_nic_self_test(struct net_device *ndev,
set_bit(NIC_STATE_TESTING, >state);
 
if (if_running)
-   (void)dev_close(ndev);
+   dev_close(ndev);
 
for (i = 0; i < SELF_TEST_TPYE_NUM; i++) {
if (!st_param[i][1])
-- 
2.11.0



[PATCH net-next 0/3] net: make dev_close void

2017-07-18 Thread Stephen Hemminger
Noticed while working on other changes: dev_close() returns int,
but it should be void.  ndo_close should also be changed to void,
but that requires more work and someone with more coccinelle
foo (smpl) than me.

Stephen Hemminger (3):
  bluetooth: 6lowpan dev_close never returns error
  hns: remove useless void cast
  net: make dev_close and related functions void

 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c |  2 +-
 include/linux/netdevice.h|  4 ++--
 net/bluetooth/6lowpan.c  |  4 +---
 net/core/dev.c   | 26 ++--
 4 files changed, 15 insertions(+), 21 deletions(-)

-- 
2.11.0



Re: [PATCH 0/5] Netfilter fixes for net

2017-07-18 Thread David Miller
From: Florian Westphal 
Date: Tue, 18 Jul 2017 23:11:57 +0200

> David Miller  wrote:
>> What about that change Eric Dumazet was talking about with Florian
>> that stopped instantiating conntrack by default in new namespaces?
> 
> Seems more appropriate for -next.  If you prefer net instead, let me know
> and I'll get to work.

Yeah it's more on the -next side, albeit annoying.

Ok, so nevermind :)


Re: [PATCH] net: Convert to using %pOF instead of full_name

2017-07-18 Thread David Miller
From: Rob Herring 
Date: Tue, 18 Jul 2017 16:43:19 -0500

> Now that we have a custom printf format specifier, convert users of
> full_name to use %pOF instead. This is preparation to remove storing
> of the full path string for each node.
> 
> Signed-off-by: Rob Herring 

Acked-by: David S. Miller 


[PATCH] net: Convert to using %pOF instead of full_name

2017-07-18 Thread Rob Herring
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.

Signed-off-by: Rob Herring 
Cc: Andrew Lunn 
Cc: Vivien Didelot 
Cc: Florian Fainelli 
Cc: Madalin Bucur 
Cc: Douglas Miller 
Cc: Grygorii Strashko 
Cc: Michal Simek 
Cc: "Sören Brinkmann" 
Cc: netdev@vger.kernel.org
Cc: linux-o...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/net/dsa/mv88e6xxx/chip.c|  2 +-
 drivers/net/ethernet/apple/mace.c   |  8 ++--
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c  |  4 +-
 drivers/net/ethernet/freescale/fec_mpc52xx.c|  4 +-
 drivers/net/ethernet/freescale/fman/fman.c  | 12 +++---
 drivers/net/ethernet/freescale/fman/fman_port.c |  4 +-
 drivers/net/ethernet/freescale/fman/mac.c   | 50 +++
 drivers/net/ethernet/freescale/fsl_pq_mdio.c| 20 +-
 drivers/net/ethernet/ibm/ehea/ehea_main.c   |  5 +--
 drivers/net/ethernet/ibm/emac/core.c| 53 +++--
 drivers/net/ethernet/ibm/emac/debug.h   |  2 +-
 drivers/net/ethernet/ibm/emac/mal.c |  8 ++--
 drivers/net/ethernet/ibm/emac/rgmii.c   | 18 -
 drivers/net/ethernet/ibm/emac/tah.c | 12 ++
 drivers/net/ethernet/ibm/emac/zmii.c| 17 
 drivers/net/ethernet/sun/niu.c  | 24 +--
 drivers/net/ethernet/ti/cpsw.c  |  8 ++--
 drivers/net/ethernet/ti/davinci_emac.c  |  4 +-
 drivers/net/ethernet/xilinx/ll_temac_main.c |  2 +-
 drivers/net/phy/mdio-mux-mmioreg.c  | 18 -
 drivers/net/phy/mdio-mux.c  | 16 
 21 files changed, 134 insertions(+), 157 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 53b088166c28..f7f6526baedb 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -2236,7 +2236,7 @@ static int mv88e6xxx_mdio_register(struct mv88e6xxx_chip 
*chip,

if (np) {
bus->name = np->full_name;
-   snprintf(bus->id, MII_BUS_ID_SIZE, "%s", np->full_name);
+   snprintf(bus->id, MII_BUS_ID_SIZE, "%pOF", np);
} else {
bus->name = "mv88e6xxx SMI";
snprintf(bus->id, MII_BUS_ID_SIZE, "mv88e6xxx-%d", index++);
diff --git a/drivers/net/ethernet/apple/mace.c 
b/drivers/net/ethernet/apple/mace.c
index 96dd5300e0e5..e58b157b7d7c 100644
--- a/drivers/net/ethernet/apple/mace.c
+++ b/drivers/net/ethernet/apple/mace.c
@@ -114,8 +114,8 @@ static int mace_probe(struct macio_dev *mdev, const struct 
of_device_id *match)
int j, rev, rc = -EBUSY;

if (macio_resource_count(mdev) != 3 || macio_irq_count(mdev) != 3) {
-   printk(KERN_ERR "can't use MACE %s: need 3 addrs and 3 irqs\n",
-  mace->full_name);
+   printk(KERN_ERR "can't use MACE %pOF: need 3 addrs and 3 irqs\n",
+  mace);
return -ENODEV;
}

@@ -123,8 +123,8 @@ static int mace_probe(struct macio_dev *mdev, const struct 
of_device_id *match)
if (addr == NULL) {
addr = of_get_property(mace, "local-mac-address", NULL);
if (addr == NULL) {
-   printk(KERN_ERR "Can't get mac-address for MACE %s\n",
-  mace->full_name);
+   printk(KERN_ERR "Can't get mac-address for MACE %pOF\n",
+  mace);
return -ENODEV;
}
}
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 757b873735a5..550ea1ec7b6c 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -398,8 +398,8 @@ static struct mac_device *dpaa_mac_dev_get(struct 
platform_device *pdev)

of_dev = of_find_device_by_node(mac_node);
if (!of_dev) {
-   dev_err(dpaa_dev, "of_find_device_by_node(%s) failed\n",
-   mac_node->full_name);
+   dev_err(dpaa_dev, "of_find_device_by_node(%pOF) failed\n",
+   mac_node);
of_node_put(mac_node);
return ERR_PTR(-EINVAL);
}
diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c 
b/drivers/net/ethernet/freescale/fec_mpc52xx.c
index aa8cf5d2a53c..6d7269d87a85 100644
--- a/drivers/net/ethernet/freescale/fec_mpc52xx.c
+++ b/drivers/net/ethernet/freescale/fec_mpc52xx.c
@@ -960,8 +960,8 @@ static int mpc52xx_fec_probe(struct platform_device *op)

/* 

A buggy behavior for Linux TCP Reno and HTCP

2017-07-18 Thread Wei Sun
Hi there,

We find a buggy behavior when using Linux TCP Reno and HTCP in low
bandwidth or highly congested network environments.

In short, their undo functions may mistakenly double the cwnd,
leading to more aggressive behavior in a highly congested scenario.


The detailed reason:

The current reno undo function assumes cwnd halving (and thus doubles
ssthresh to recover the cwnd), but it doesn't consider the corner case
in which ssthresh has been clamped to its minimum of 2.

e.g.,
                    cwnd  ssthresh
An initial state:      2         5
A spurious loss:       1         2
Undo:                  4         5

Here the cwnd after undo is twice what it was before the loss. Attached
is a simple script to reproduce it.

HTCP has a similar problem, so we recommend storing the cwnd on loss
in the .ssthresh implementation and restoring it in .undo_cwnd for both
the TCP Reno and HTCP implementations.
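
The numbers in the table can be reproduced with a small userspace model. This
is only a sketch under stated assumptions: struct cc_state and the function
names are made up for this example, the loss step sets cwnd to 1 as in the
table, and undo_from_saved() shows the recommended approach rather than any
existing kernel code.

```c
#include <stdio.h>

/* Hypothetical, simplified congestion control state. */
struct cc_state {
	unsigned int cwnd;
	unsigned int ssthresh;
	unsigned int prior_cwnd;	/* cwnd saved at loss time (proposed) */
	unsigned int prior_ssthresh;	/* ssthresh saved at loss time */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Loss: remember the old values, halve cwnd into ssthresh (floor of 2). */
static void on_loss(struct cc_state *s)
{
	s->prior_cwnd = s->cwnd;
	s->prior_ssthresh = s->ssthresh;
	s->ssthresh = max_u(s->cwnd >> 1, 2U);
	s->cwnd = 1;
}

/* Undo that assumes ssthresh is exactly half of the old cwnd. */
static void undo_by_doubling(struct cc_state *s)
{
	s->cwnd = max_u(s->cwnd, s->ssthresh << 1);
	s->ssthresh = s->prior_ssthresh;
}

/* Undo that restores the cwnd recorded when the loss was detected. */
static void undo_from_saved(struct cc_state *s)
{
	s->cwnd = max_u(s->cwnd, s->prior_cwnd);
	s->ssthresh = s->prior_ssthresh;
}
```

Starting from cwnd = 2 and ssthresh = 5, on_loss() followed by
undo_by_doubling() ends with cwnd = 4, while undo_from_saved() returns to
cwnd = 2.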

Thanks


undo-2-1-4.pkt
Description: Binary data


[PATCH] liquidio: lio_vf_main: remove unnecessary static in setup_io_queues()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
Such variables are initialized before being used, on every execution
path throughout the function. The static has no benefit, and removing
it reduces the object file size.
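
A minimal sketch of the pattern in question, using made-up function names
(scale_static and scale_auto are not from the driver): when a local is
assigned on every path before it is read, the static and automatic versions
return the same result, but the static one also reserves storage that is
shared by all callers.

```c
#include <stdio.h>

/* static local: gets permanent storage even though the value is never
 * carried over from one call to the next.
 */
static int scale_static(int x)
{
	static int factor;

	factor = 3;		/* always assigned before use */
	return x * factor;
}

/* automatic local: same result, no permanent storage, and no state
 * shared between concurrent callers.
 */
static int scale_auto(int x)
{
	int factor;

	factor = 3;
	return x * factor;
}
```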

This issue was detected using Coccinelle and the following semantic patch:

@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
 when strict
?x = e;

In the following log you can see a significant difference in the object
file size. Also, there is a significant difference in the bss segment.
This log is the output of the size command, before and after the code
change:

before:
   text    data     bss     dec     hex filename
  55656   10680     576   66912   10560 drivers/net/ethernet/cavium/liquidio/lio_vf_main.o

after:
   text    data     bss     dec     hex filename
  55796   10536     448   66780   104dc drivers/net/ethernet/cavium/liquidio/lio_vf_main.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index 9b24710..935ff29 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -1663,10 +1663,10 @@ static int setup_io_queues(struct octeon_device 
*octeon_dev, int ifidx)
 {
struct octeon_droq_ops droq_ops;
struct net_device *netdev;
-   static int cpu_id_modulus;
+   int cpu_id_modulus;
struct octeon_droq *droq;
struct napi_struct *napi;
-   static int cpu_id;
+   int cpu_id;
int num_tx_descs;
struct lio *lio;
int retval = 0;
-- 
2.5.0



Re: [PATCH 0/5] Netfilter fixes for net

2017-07-18 Thread Florian Westphal
David Miller  wrote:
> What about that change Eric Dumazet was talking about with Florian
> that stopped instantiating conntrack by default in new namespaces?

Seems more appropriate for -next.  If you prefer net instead, let me know
and I'll get to work.


[PATCH] net: ethernet: mediatek: remove useless code in mtk_poll_tx()

2017-07-18 Thread Gustavo A. R. Silva
Remove useless local variable _condition_ and the related code.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index b3d0c2e..7e95cf5 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1027,7 +1027,6 @@ static int mtk_poll_tx(struct mtk_eth *eth, int budget)
unsigned int done[MTK_MAX_DEVS];
unsigned int bytes[MTK_MAX_DEVS];
u32 cpu, dma;
-   static int condition;
int total = 0, i;
 
memset(done, 0, sizeof(done));
@@ -1051,10 +1050,8 @@ static int mtk_poll_tx(struct mtk_eth *eth, int budget)
mac = 1;
 
skb = tx_buf->skb;
-   if (!skb) {
-   condition = 1;
+   if (!skb)
break;
-   }
 
if (skb != (struct sk_buff *)MTK_DMA_DUMMY_DESC) {
bytes[mac] += skb->len;
-- 
2.5.0



[PATCH] qlcnic: remove unnecessary static in qlcnic_dump_fw()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variable fw_dump_ops.
Such variable is initialized before being used, on every
execution path throughout the function. The static has no
benefit, and removing it reduces the object file size.

This issue was detected using Coccinelle and the following semantic patch:

@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
 when strict
?x = e;

In the following log you can see a difference in the object file size.
This log is the output of the size command, before and after the code
change:

before:
   text    data     bss     dec     hex filename
  19032    2136      64   21232    52f0 drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.o

after:
   text    data     bss     dec     hex filename
  19020    2048       0   21068    524c drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
index 0844b7c..afa10a1 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
@@ -1285,7 +1285,7 @@ int qlcnic_fw_cmd_get_minidump_temp(struct qlcnic_adapter 
*adapter)
 int qlcnic_dump_fw(struct qlcnic_adapter *adapter)
 {
struct qlcnic_fw_dump *fw_dump = &adapter->ahw->fw_dump;
-   static const struct qlcnic_dump_operations *fw_dump_ops;
+   const struct qlcnic_dump_operations *fw_dump_ops;
struct qlcnic_83xx_dump_template_hdr *hdr_83xx;
u32 entry_offset, dump, no_entries, buf_offset = 0;
int i, k, ops_cnt, ops_index, dump_size = 0;
-- 
2.5.0



Re: [PATCH] liquidio: lio_vf_main: remove unnecessary static in setup_io_queues()

2017-07-18 Thread Felix Manlunas
On Tue, Jul 18, 2017 at 03:50:15PM -0500, Gustavo A. R. Silva wrote:
> Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
> Such variables are initialized before being used, on every execution
> path throughout the function. The static has no benefit and, removing
> it reduces the object file size.
> 
> This issue was detected using Coccinelle and the following semantic patch:
> 
> @bad exists@
> position p;
> identifier x;
> type T;
> @@
> 
> static T x@p;
> ...
> x = <+...x...+>
> 
> @@
> identifier x;
> expression e;
> type T;
> position p != bad.p;
> @@
> 
> -static
>  T x@p;
>  ... when != x
>  when strict
> ?x = e;
> 
> In the following log you can see a significant difference in the object
> file size. Also, there is a significant difference in the bss segment.
> This log is the output of the size command, before and after the code
> change:
> 
> before:
>    text    data     bss     dec     hex filename
>   55656   10680     576   66912   10560 drivers/net/ethernet/cavium/liquidio/lio_vf_main.o
> 
> after:
>    text    data     bss     dec     hex filename
>   55796   10536     448   66780   104dc drivers/net/ethernet/cavium/liquidio/lio_vf_main.o
> 
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c 
> b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
> index 9b24710..935ff29 100644
> --- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
> +++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
> @@ -1663,10 +1663,10 @@ static int setup_io_queues(struct octeon_device 
> *octeon_dev, int ifidx)
>  {
>   struct octeon_droq_ops droq_ops;
>   struct net_device *netdev;
> - static int cpu_id_modulus;
> + int cpu_id_modulus;
>   struct octeon_droq *droq;
>   struct napi_struct *napi;
> - static int cpu_id;
> + int cpu_id;
>   int num_tx_descs;
>   struct lio *lio;
>   int retval = 0;
> -- 
> 2.5.0
> 

Thanks.

Acked-by: Felix Manlunas 


[PATCH] net: tulip: remove useless code in tulip_init_one()

2017-07-18 Thread Gustavo A. R. Silva
Remove useless local variable multiport_cnt and the related code.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/ethernet/dec/tulip/tulip_core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/dec/tulip/tulip_core.c 
b/drivers/net/ethernet/dec/tulip/tulip_core.c
index 17e566a..84394b4 100644
--- a/drivers/net/ethernet/dec/tulip/tulip_core.c
+++ b/drivers/net/ethernet/dec/tulip/tulip_core.c
@@ -1303,7 +1303,6 @@ static int tulip_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
0x00, 'L', 'i', 'n', 'u', 'x'
};
static int last_irq;
-   static int multiport_cnt;   /* For four-port boards w/one EEPROM */
int i, irq;
unsigned short sum;
unsigned char *ee_data;
@@ -1557,7 +1556,6 @@ static int tulip_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
} else if (ee_data[0] == 0xff  &&  ee_data[1] == 0xff &&
   ee_data[2] == 0) {
sa_offset = 2;  /* Grrr, damn Matrox boards. */
-   multiport_cnt = 4;
}
 #ifdef CONFIG_MIPS_COBALT
if ((pdev->bus->number == 0) &&
-- 
2.5.0



Re: [PATCH] liquidio: lio_main: remove unnecessary static in setup_io_queues()

2017-07-18 Thread Felix Manlunas
On Tue, Jul 18, 2017 at 03:53:48PM -0500, Gustavo A. R. Silva wrote:
> Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
> Such variables are initialized before being used, on every execution
> path throughout the function. The static has no benefit and, removing
> it reduces the object file size.
> 
> This issue was detected using Coccinelle and the following semantic patch:
> 
> @bad exists@
> position p;
> identifier x;
> type T;
> @@
> 
> static T x@p;
> ...
> x = <+...x...+>
> 
> @@
> identifier x;
> expression e;
> type T;
> position p != bad.p;
> @@
> 
> -static
>  T x@p;
>  ... when != x
>  when strict
> ?x = e;
> 
> In the following log you can see a significant difference in the object
> file size. Also, there is a significant difference in the bss segment.
> This log is the output of the size command, before and after the code
> change:
> 
> before:
>    text    data     bss     dec     hex filename
>   78689   15272   27808  121769   1dba9 drivers/net/ethernet/cavium/liquidio/lio_main.o
> 
> after:
>    text    data     bss     dec     hex filename
>   78667   15128   27680  121475   1da83 drivers/net/ethernet/cavium/liquidio/lio_main.o
> 
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  drivers/net/ethernet/cavium/liquidio/lio_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
> b/drivers/net/ethernet/cavium/liquidio/lio_main.c
> index 51583ae..1d8fefa 100644
> --- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
> +++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
> @@ -2544,8 +2544,8 @@ static inline int setup_io_queues(struct octeon_device 
> *octeon_dev,
>  {
>   struct octeon_droq_ops droq_ops;
>   struct net_device *netdev;
> - static int cpu_id;
> - static int cpu_id_modulus;
> + int cpu_id;
> + int cpu_id_modulus;
>   struct octeon_droq *droq;
>   struct napi_struct *napi;
>   int q, q_no, retval = 0;
> -- 
> 2.5.0
> 

Thanks.

Acked-by: Felix Manlunas 


[PATCH] rtlwifi: remove useless code

2017-07-18 Thread Gustavo A. R. Silva
Remove useless local variables last_read_point and last_txw_point and
the related code.

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/wireless/realtek/rtlwifi/rtl8192ee/trx.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/trx.c 
b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/trx.c
index 55f238a..c58393e 100644
--- a/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/trx.c
+++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192ee/trx.c
@@ -478,7 +478,6 @@ u16 rtl92ee_rx_desc_buff_remained_cnt(struct ieee80211_hw 
*hw, u8 queue_index)
struct rtl_priv *rtlpriv = rtl_priv(hw);
u16 read_point = 0, write_point = 0, remind_cnt = 0;
u32 tmp_4byte = 0;
-   static u16 last_read_point;
static bool start_rx;
 
tmp_4byte = rtl_read_dword(rtlpriv, REG_RXQ_TXBD_IDX);
@@ -506,7 +505,6 @@ u16 rtl92ee_rx_desc_buff_remained_cnt(struct ieee80211_hw 
*hw, u8 queue_index)
 
rtlpci->rx_ring[queue_index].next_rx_rp = write_point;
 
-   last_read_point = read_point;
return remind_cnt;
 }
 
@@ -917,7 +915,6 @@ void rtl92ee_set_desc(struct ieee80211_hw *hw, u8 *pdesc, 
bool istx,
struct rtl_priv *rtlpriv = rtl_priv(hw);
u16 cur_tx_rp = 0;
u16 cur_tx_wp = 0;
-   static u16 last_txw_point;
static bool over_run;
u32 tmp = 0;
u8 q_idx = *val;
@@ -951,9 +948,6 @@ void rtl92ee_set_desc(struct ieee80211_hw *hw, u8 *pdesc, 
bool istx,
rtl_write_word(rtlpriv,
   get_desc_addr_fr_q_idx(q_idx),
   ring->cur_tx_wp);
-
-   if (q_idx == 1)
-   last_txw_point = cur_tx_wp;
}
 
if (ring->avl_desc < (max_tx_desc - 15)) {
-- 
2.5.0



[PATCH] wireless: airo: remove unnecessary static in writerids()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local function pointer _writer_.
Such pointer is initialized before being used, on every
execution path throughout the function. The static has no
benefit, and removing it reduces the object file size.

This issue was detected using Coccinelle and the following semantic patch:

@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
 when strict
?x = e;

In the following log you can see a significant difference in the object
file size. This log is the output of the size command, before and after
the code change:

before:
   text    data     bss     dec     hex filename
 113797   19152    1216  134165   20c15 drivers/net/wireless/cisco/airo.o

after:
   text    data     bss     dec     hex filename
 113881   19096    1152  134129   20bf1 drivers/net/wireless/cisco/airo.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/wireless/cisco/airo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/cisco/airo.c 
b/drivers/net/wireless/cisco/airo.c
index 84143a0..1066d84 100644
--- a/drivers/net/wireless/cisco/airo.c
+++ b/drivers/net/wireless/cisco/airo.c
@@ -7837,7 +7837,7 @@ static int writerids(struct net_device *dev, 
aironet_ioctl *comp) {
struct airo_info *ai = dev->ml_priv;
int  ridcode;
 int  enabled;
-   static int (* writer)(struct airo_info *, u16 rid, const void *, int, int);
+   int (*writer)(struct airo_info *, u16 rid, const void *, int, int);
unsigned char *iobuf;
 
/* Only super-user can write RIDs */
-- 
2.5.0



[PATCH] liquidio: lio_main: remove unnecessary static in setup_io_queues()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
Such variables are initialized before being used, on every execution
path throughout the function. The static has no benefit, and removing
it reduces the object file size.

This issue was detected using Coccinelle and the following semantic patch:

@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
 when strict
?x = e;

In the following log you can see a significant difference in the object
file size. Also, there is a significant difference in the bss segment.
This log is the output of the size command, before and after the code
change:

before:
   text    data     bss     dec     hex filename
  78689   15272   27808  121769   1dba9 drivers/net/ethernet/cavium/liquidio/lio_main.o

after:
   text    data     bss     dec     hex filename
  78667   15128   27680  121475   1da83 drivers/net/ethernet/cavium/liquidio/lio_main.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 51583ae..1d8fefa 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -2544,8 +2544,8 @@ static inline int setup_io_queues(struct octeon_device 
*octeon_dev,
 {
struct octeon_droq_ops droq_ops;
struct net_device *netdev;
-   static int cpu_id;
-   static int cpu_id_modulus;
+   int cpu_id;
+   int cpu_id_modulus;
struct octeon_droq *droq;
struct napi_struct *napi;
int q, q_no, retval = 0;
-- 
2.5.0



[PATCH net-next] net: dsa: unexport dsa_is_port_initialized

2017-07-18 Thread Vivien Didelot
The dsa_is_port_initialized helper is only used by dsa_switch_resume and
dsa_switch_suspend, if CONFIG_PM_SLEEP is enabled. Make it static to
dsa.c.

Signed-off-by: Vivien Didelot 
---
 include/net/dsa.h | 5 -
 net/dsa/dsa.c | 5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 58969b9a090c..88da272d20d0 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -256,11 +256,6 @@ static inline bool dsa_is_normal_port(struct dsa_switch 
*ds, int p)
return !dsa_is_cpu_port(ds, p) && !dsa_is_dsa_port(ds, p);
 }
 
-static inline bool dsa_is_port_initialized(struct dsa_switch *ds, int p)
-{
-   return ds->enabled_port_mask & (1 << p) && ds->ports[p].netdev;
-}
-
 static inline u8 dsa_upstream_port(struct dsa_switch *ds)
 {
struct dsa_switch_tree *dst = ds->dst;
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 416ac4ef9ba9..a55e2e4087a4 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -220,6 +220,11 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct 
net_device *dev,
 }
 
 #ifdef CONFIG_PM_SLEEP
+static bool dsa_is_port_initialized(struct dsa_switch *ds, int p)
+{
+   return ds->enabled_port_mask & (1 << p) && ds->ports[p].netdev;
+}
+
 int dsa_switch_suspend(struct dsa_switch *ds)
 {
int i, ret = 0;
-- 
2.13.3



Re: [PATCH net-next 0/5] refine virtio-net XDP

2017-07-18 Thread Michael S. Tsirkin
On Mon, Jul 17, 2017 at 08:43:56PM +0800, Jason Wang wrote:
> Hi:
> 
> This series brings two optimizations for virtio-net XDP:
> 
> - avoid reset during XDP set
> - turn off offloads on demand

I'm glad to see this take shape - this can be
extended to optimize virtnet_get_headroom so we don't
waste room if adjust_head is enabled.

I see a couple of issues, responded to individual patches.


> Please review.
> 
> Thanks
> 
> Jason Wang (5):
>   virtio_ring: allow to store zero as the ctx
>   virtio-net: pack headroom into ctx for mergeable buffer
>   virtio-net: switch to use new ctx API for small buffer
>   virtio-net: do not reset during XDP set
>   virtio-net: switch off offloads on demand if possible on XDP set
> 
>  drivers/net/virtio_net.c | 325 
> +--
>  drivers/virtio/virtio_ring.c |   2 +-
>  2 files changed, 194 insertions(+), 133 deletions(-)
> 
> -- 
> 2.7.4


Re: [PATCH net-next 5/5] virtio-net: switch off offloads on demand if possible on XDP set

2017-07-18 Thread Michael S. Tsirkin
On Mon, Jul 17, 2017 at 08:44:01PM +0800, Jason Wang wrote:
> Current XDP implementation want guest offloads feature to be disabled

s/want/wants/

> on qemu cli.

on the device.

> This is inconvenient and means guest can't benefit from
> offloads if XDP is not used. This patch tries to address this
> limitation by disable

disabling

> the offloads on demand through control guest
> offloads. Guest offloads will be disabled and enabled on demand on XDP
> set.
> 
> Signed-off-by: Jason Wang 

In fact, since we no longer reset when XDP is set, the device might
have offloads enabled and buffers used but not yet consumed at the
point where XDP is set.

This can result in
- packet scattered across multiple buffers
  (handled correctly but need to update the comment)
- packet may have VIRTIO_NET_HDR_F_NEEDS_CSUM, in that case
  the spec says "The checksum on the packet is incomplete".
  (probably needs to be handled by calculating the checksum).
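
For the second case, the arithmetic involved is the 16-bit ones' complement
Internet checksum. The following is a minimal userspace sketch of that sum
only; inet_checksum is a name made up for this example, and the sketch does
not deal with csum_start/csum_offset or pseudo-headers, which a real fix in
the driver would have to take into account.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over a byte buffer, treating the data
 * as big-endian 16-bit words. Intended for packet-sized inputs; the
 * 32-bit accumulator is not guarded against overflow on huge buffers.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	if (len & 1)			/* pad a trailing odd byte with zero */
		sum += (uint32_t)data[len - 1] << 8;

	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```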


Ideas for follow-up patches:

- skip looking at packet data completely
  won't work if you play with checksums dynamically
  but can be done if disabled on device
- allow ethtool to tweak offloads from userspace as well


> ---
>  drivers/net/virtio_net.c | 70 
> 
>  1 file changed, 65 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e732bd6..d970c2d 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -57,6 +57,11 @@ DECLARE_EWMA(pkt_len, 0, 64)
>  
>  #define VIRTNET_DRIVER_VERSION "1.0.0"
>  
> +const unsigned long guest_offloads[] = { VIRTIO_NET_F_GUEST_TSO4,
> +  VIRTIO_NET_F_GUEST_TSO6,
> +  VIRTIO_NET_F_GUEST_ECN,
> +  VIRTIO_NET_F_GUEST_UFO };
> +
>  struct virtnet_stats {
>   struct u64_stats_sync tx_syncp;
>   struct u64_stats_sync rx_syncp;
> @@ -164,10 +169,13 @@ struct virtnet_info {
>   u8 ctrl_promisc;
>   u8 ctrl_allmulti;
>   u16 ctrl_vid;
> + u64 ctrl_offloads;
>  
>   /* Ethtool settings */
>   u8 duplex;
>   u32 speed;
> +
> + unsigned long guest_offloads;
>  };
>  
>  struct padded_vnet_hdr {
> @@ -1889,6 +1897,47 @@ static int virtnet_restore_up(struct virtio_device 
> *vdev)
>   return err;
>  }
>  
> +static int virtnet_set_guest_offloads(struct virtnet_info *vi, u64 offloads)
> +{
> + struct scatterlist sg;
> + vi->ctrl_offloads = cpu_to_virtio64(vi->vdev, offloads);
> +
> + sg_init_one(&sg, &vi->ctrl_offloads, sizeof(vi->ctrl_offloads));
> +
> + if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_GUEST_OFFLOADS,
> +   VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET, &sg)) {
> + dev_warn(&vi->dev->dev, "Fail to set guest offload. \n");
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static int virtnet_clear_guest_offloads(struct virtnet_info *vi)
> +{
> + u64 offloads = 0;
> +
> + if (!vi->guest_offloads)
> + return 0;
> +
> + if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
> + offloads = 1ULL << VIRTIO_NET_F_GUEST_CSUM;
> +
> + return virtnet_set_guest_offloads(vi, offloads);
> +}
> +
> +static int virtnet_restore_guest_offloads(struct virtnet_info *vi)
> +{
> + u64 offloads = vi->guest_offloads;
> +
> + if (!vi->guest_offloads)
> + return 0;
> + if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_CSUM))
> + offloads |= 1ULL << VIRTIO_NET_F_GUEST_CSUM;
> +
> + return virtnet_set_guest_offloads(vi, offloads);
> +}
> +
>  static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  struct netlink_ext_ack *extack)
>  {
> @@ -1898,10 +1947,11 @@ static int virtnet_xdp_set(struct net_device *dev, 
> struct bpf_prog *prog,
>   u16 xdp_qp = 0, curr_qp;
>   int i, err;
>  
> - if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> - virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
> - virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
> - virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO)) {
> + if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_GUEST_OFFLOADS)
> + && (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
> + virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
> + virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
> + virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO))) {
>   NL_SET_ERR_MSG_MOD(extack, "Can't set XDP while host is implementing LRO, disable LRO first");
>   return -EOPNOTSUPP;
>   }
> @@ -1950,6 +2000,12 @@ static int virtnet_xdp_set(struct net_device *dev, 
> struct bpf_prog *prog,
>   for (i = 0; i < vi->max_queue_pairs; i++) {
>   old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
>

Re: [PATCH net-next 11/12] net: dsa: mv88e6xxx: add Energy Detect ops

2017-07-18 Thread Andrew Lunn
> I know this looks boring, I do not particularly enjoy it myself, but I
> think this is also important. I don't mind fixing the poking function as
> well in the near future.

It would be great if you do. It could be as simple as using
phy_ethtool_get_eee() and phy_ethtool_set_eee().

  Andrew


Re: [PATCH net-next 4/5] virtio-net: do not reset during XDP set

2017-07-18 Thread Michael S. Tsirkin
On Mon, Jul 17, 2017 at 08:44:00PM +0800, Jason Wang wrote:
> We used to reset during XDP set, the main reason is we need allocate
> extra headroom for header adjustment but there's no way to know the
> headroom of exist receive buffer. This works buy maybe complex and may
> cause the network down for a while which is bad for user
> experience. So this patch tries to avoid this by:
> 
> - packing headroom into receive buffer ctx
> - check the headroom during XDP, and if it was not sufficient, copy
>   the packet into a location which has a large enough headroom

The packing is actually done by previous patches. Here is a
corrected version:

We currently reset the device during XDP set. The main reason is
that we allocate more headroom with XDP (for header adjustment).

This works but causes network downtime for users.

Previous patches encoded the headroom in the buffer context;
this makes it possible to detect the case where a buffer
with headroom insufficient for XDP is added to the queue and
XDP is enabled afterwards.

Upon detection, we handle this case by copying the packet
(slow, but it's a temporary condition).
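
The detection step described above can be sketched in userspace as follows.
Everything here is an assumption made for illustration: the helper names, the
encoding of the headroom directly as the ctx pointer value, and the
XDP_HEADROOM_SKETCH constant are not taken from the patch itself.

```c
#include <stdint.h>

/* Assumed headroom requirement for this sketch only. */
#define XDP_HEADROOM_SKETCH 256u

/* Store the headroom a buffer was allocated with in its opaque ctx. */
static void *headroom_to_ctx(unsigned int headroom)
{
	return (void *)(uintptr_t)headroom;
}

/* Recover the headroom when the buffer comes back from the device. */
static unsigned int ctx_to_headroom(void *ctx)
{
	return (unsigned int)(uintptr_t)ctx;
}

/* A buffer posted before XDP was enabled has too little headroom and must
 * be copied; buffers posted afterwards can be used in place.
 */
static int needs_copy(void *ctx, int xdp_enabled)
{
	return xdp_enabled && ctx_to_headroom(ctx) < XDP_HEADROOM_SKETCH;
}
```

Because the headroom travels with each buffer, the receive path can decide
per packet and does not need to know when XDP was switched on.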



> Signed-off-by: Jason Wang 
> ---
>  drivers/net/virtio_net.c | 230 
> ++-
>  1 file changed, 105 insertions(+), 125 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e31b5b2..e732bd6 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -407,6 +407,67 @@ static unsigned int virtnet_get_headroom(struct 
> virtnet_info *vi)
>   return vi->xdp_queue_pairs ? VIRTIO_XDP_HEADROOM : 0;
>  }
>  
> +/* We copy and linearize packet in the following cases:
> + *
> + * 1) Packet across multiple buffers, this happens normally when rx
> + *buffer size is underestimated. Rarely, since spec does not
> + *forbid using more than one buffer even if a single buffer is
> + *sufficient for the packet, we should also deal with this case.

Latest SVN of the spec actually forbids this. See:
net: clarify device rules for mergeable buffers


> + * 2) The header room is smaller than what XDP required. In this case
> + *we should copy the packet and reserve enough headroom for this.
> + *This would be slow but we at most we can copy times of queue
> + *size, this is acceptable. What's more important, this help to
> + *avoid resetting.

Last part of the comment applies to both cases. So

+/* We copy the packet for XDP in the following cases:
+ *
+ * 1) Packet is scattered across multiple rx buffers.
+ * 2) Headroom space is insufficient.
+ *
+ * This is inefficient but it's a temporary condition that
+ * we hit right after XDP is enabled and until queue is refilled
+ * with large buffers with sufficient headroom - so it should affect
+ * at most queue size packets.

+ * Afterwards, the conditions to enable
+ * XDP should preclude the underlying device from sending packets
+ * across multiple buffers (num_buf > 1), and we make sure buffers
+ * have enough headroom.
+ */
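
A simplified userspace sketch of the copy itself may make the two cases
easier to follow. The struct frag type, the linearize() helper and
SKETCH_PAGE_SIZE are hypothetical and only model the idea of gathering
scattered buffers behind a reserved headroom with a bounds check; they are
not the driver's xdp_linearize_page().

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical description of one received buffer. */
struct frag {
	const unsigned char *data;
	size_t len;
};

/* Copy all fragments back to back into page, leaving headroom bytes free
 * at the front. Returns the packet length (headroom excluded), or -1 if
 * the data would not fit into one page.
 */
static long linearize(unsigned char *page, size_t headroom,
		      const struct frag *frags, size_t nfrags)
{
	size_t off = headroom;
	size_t i;

	if (headroom > SKETCH_PAGE_SIZE)
		return -1;

	for (i = 0; i < nfrags; i++) {
		/* guard against data larger than what a page can hold */
		if (frags[i].len > SKETCH_PAGE_SIZE - off)
			return -1;
		memcpy(page + off, frags[i].data, frags[i].len);
		off += frags[i].len;
	}

	return (long)(off - headroom);
}
```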



> + * 2) The header room is smaller than what XDP required. In this case
> + *we should copy the packet and reserve enough headroom for this.
> + *This would be slow but we at most we can copy times of queue
> + *size, this is acceptable. What's more important, this help to
> + *avoid resetting.



> + */
> +static struct page *xdp_linearize_page(struct receive_queue *rq,
> +u16 *num_buf,
> +struct page *p,
> +int offset,
> +int page_off,
> +unsigned int *len)
> +{
> + struct page *page = alloc_page(GFP_ATOMIC);
> +
> + if (!page)
> + return NULL;
> +
> + memcpy(page_address(page) + page_off, page_address(p) + offset, *len);
> + page_off += *len;
> +
> + while (--*num_buf) {
> + unsigned int buflen;
> + void *buf;
> + int off;
> +
> + buf = virtqueue_get_buf(rq->vq, &buflen);
> + if (unlikely(!buf))
> + goto err_buf;
> +
> + p = virt_to_head_page(buf);
> + off = buf - page_address(p);
> +
> + /* guard against a misconfigured or uncooperative backend that
> +  * is sending packet larger than the MTU.
> +  */
> + if ((page_off + buflen) > PAGE_SIZE) {
> + put_page(p);
> + goto err_buf;
> + }
> +
> + memcpy(page_address(page) + page_off,
> +page_address(p) + off, buflen);
> + page_off += buflen;
> + put_page(p);
> + }
> +
> + /* Headroom does not contribute to packet length */
> + *len = page_off - VIRTIO_XDP_HEADROOM;
> + return page;
> +err_buf:
> + __free_pages(page, 0);
> + return NULL;
> 

[PATCH net-next] net/packet: remove unused PGV_FROM_VMALLOC definition.

2017-07-18 Thread Rami Rosen
This patch removes the definition of PGV_FROM_VMALLOC from af_packet.c.
The PGV_FROM_VMALLOC definition was already removed by 
commit 441c793a5650 ("net: cleanup unused macros in net directory"),
and its usage was removed even before by commit c56b4d90123b 
("af_packet: remove pgv.flags"); but it was added back by mistake later on,
in commit f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation").

Signed-off-by: Rami Rosen 
---
 net/packet/af_packet.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index e3beb28203eb..ee035cbe5621 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -177,8 +177,6 @@ static int packet_set_ring(struct sock *sk, union 
tpacket_req_u *req_u,
 #define BLK_PLUS_PRIV(sz_of_priv) \
(BLK_HDR_LEN + ALIGN((sz_of_priv), V3_ALIGNMENT))
 
-#define PGV_FROM_VMALLOC 1
-
 #define BLOCK_STATUS(x)((x)->hdr.bh1.block_status)
 #define BLOCK_NUM_PKTS(x)  ((x)->hdr.bh1.num_pkts)
 #define BLOCK_O2FP(x)  ((x)->hdr.bh1.offset_to_first_pkt)
-- 
2.7.4



Re: [PATCH net-next 3/5] virtio-net: switch to use new ctx API for small buffer

2017-07-18 Thread Michael S. Tsirkin
what's needed is ability to store the headroom there.

virtio-net: switch to use ctx API for small buffers

Use ctx API to store headroom for small buffers.
Following patches will retrieve this info and use it for XDP.

On Mon, Jul 17, 2017 at 08:43:59PM +0800, Jason Wang wrote:
> Switch to use ctx API for small buffer; this is needed to avoid a
> reset on XDP.
> 
> Signed-off-by: Jason Wang 
> ---
>  drivers/net/virtio_net.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 8fae9a8..e31b5b2 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -410,7 +410,8 @@ static unsigned int virtnet_get_headroom(struct 
> virtnet_info *vi)
>  static struct sk_buff *receive_small(struct net_device *dev,
>struct virtnet_info *vi,
>struct receive_queue *rq,
> -  void *buf, unsigned int len)
> +  void *buf, void *ctx,
> +  unsigned int len)
>  {
>   struct sk_buff *skb;
>   struct bpf_prog *xdp_prog;
> @@ -773,7 +774,7 @@ static int receive_buf(struct virtnet_info *vi, struct 
> receive_queue *rq,
>   else if (vi->big_packets)
>   skb = receive_big(dev, vi, rq, buf, len);
>   else
> - skb = receive_small(dev, vi, rq, buf, len);
> + skb = receive_small(dev, vi, rq, buf, ctx, len);
>  
>   if (unlikely(!skb))
>   return 0;
> @@ -812,6 +813,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, 
> struct receive_queue *rq,

Let's document that ctx API is used a bit differently here:

/* Unlike mergeable buffers, all buffers are allocated to the same size,
 * except for the headroom. For this reason we do not need to use
 * mergeable_len_to_ctx here - it is enough to store the headroom as the
 * context ignoring the truesize.
 */

as an alternative, reuse the same format as mergeable buffers.

>   struct page_frag *alloc_frag = &rq->alloc_frag;
>   char *buf;
>   unsigned int xdp_headroom = virtnet_get_headroom(vi);
> + void *ctx = (void *)(unsigned long)xdp_headroom;
>   int len = vi->hdr_len + VIRTNET_RX_PAD + GOOD_PACKET_LEN + xdp_headroom;
>   int err;
>  
> @@ -825,7 +827,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, 
> struct receive_queue *rq,
>   alloc_frag->offset += len;
>   sg_init_one(rq->sg, buf + VIRTNET_RX_PAD + xdp_headroom,
>   vi->hdr_len + GOOD_PACKET_LEN);
> - err = virtqueue_add_inbuf(rq->vq, rq->sg, 1, buf, gfp);
> + err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
>   if (err < 0)
>   put_page(virt_to_head_page(buf));
>  
> @@ -1034,7 +1036,7 @@ static int virtnet_receive(struct receive_queue *rq, 
> int budget)
>   void *buf;
>   struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
>  
> - if (vi->mergeable_rx_bufs) {
> + if (!vi->big_packets || vi->mergeable_rx_bufs) {
>   void *ctx;
>  
>   while (received < budget &&
> @@ -2198,7 +2200,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   names = kmalloc(total_vqs * sizeof(*names), GFP_KERNEL);
>   if (!names)
>   goto err_names;
> - if (vi->mergeable_rx_bufs) {
> + if (!vi->big_packets || vi->mergeable_rx_bufs) {
>   ctx = kzalloc(total_vqs * sizeof(*ctx), GFP_KERNEL);
>   if (!ctx)
>   goto err_ctx;
> -- 
> 2.7.4


Re: [PATCH v3 00/10] constify net attribute_group structures.

2017-07-18 Thread David Miller
From: Arvind Yadav 
Date: Tue, 18 Jul 2017 15:13:44 +0530

> attribute_group are not supposed to change at runtime. All functions
> working with attribute_group provided by  work with const
> attribute_group. So mark the non-const structs as const.

Series applied, thanks.


Re: [PATCH 0/5] Netfilter fixes for net

2017-07-18 Thread David Miller
From: Pablo Neira Ayuso 
Date: Tue, 18 Jul 2017 12:13:54 +0200

> The following patchset contains Netfilter fixes for your net tree,
> they are:
> 
> 1) Missing netlink message sanity check in nfnetlink, patch from
>Mateusz Jurczyk.
> 
> 2) We now have netfilter per-netns hooks, so let's kill global hook
>infrastructure, this infrastructure is known to be racy with netns.
>We don't care about out of tree modules. Patch from Florian Westphal.
> 
> 3) find_appropriate_src() is buggy when collisions happen after the
>conversion of the nat bysource to rhashtable. Also from Florian.
> 
> 4) Remove forward chain in nf_tables arp family, it's useless and it is
>causing quite a bit of confusion, from Florian Westphal.
> 
> 5) nf_ct_remove_expect() is called with the wrong parameter, causing
>kernel oops, patch from Florian Westphal.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks a lot.

What about that change Eric Dumazet was talking about with Florian
that stopped instantiating conntrack by default in new namespaces?

Just curious.


[PATCH V2 net] net: fix tcp reset packet flowlabel for ipv6

2017-07-18 Thread Shaohua Li
From: Shaohua Li 

Please see below tcpdump output:
21:00:48.109122 IP6 (flowlabel 0x43304, hlim 64, next-header TCP (6) payload 
length: 40) fec0::5054:ff:fe12:3456.55804 > fec0::5054:ff:fe12:3456.: Flags 
[S], cksum 0x0529 (incorrect -> 0xf56c), seq 3282214508, win 43690, options 
[mss 65476,sackOK,TS val 2500903437 ecr 0,nop,wscale 7], length 0
21:00:48.109381 IP6 (flowlabel 0xd827f, hlim 64, next-header TCP (6) payload 
length: 40) fec0::5054:ff:fe12:3456. > fec0::5054:ff:fe12:3456.55804: Flags 
[S.], cksum 0x0529 (incorrect -> 0x49ad), seq 1923801573, ack 3282214509, win 
43690, options [mss 65476,sackOK,TS val 2500903437 ecr 2500903437,nop,wscale 
7], length 0
21:00:48.109548 IP6 (flowlabel 0x43304, hlim 64, next-header TCP (6) payload 
length: 32) fec0::5054:ff:fe12:3456.55804 > fec0::5054:ff:fe12:3456.: Flags 
[.], cksum 0x0521 (incorrect -> 0x1bdf), seq 1, ack 1, win 342, options 
[nop,nop,TS val 2500903437 ecr 2500903437], length 0
21:00:48.109823 IP6 (flowlabel 0x43304, hlim 64, next-header TCP (6) payload 
length: 62) fec0::5054:ff:fe12:3456.55804 > fec0::5054:ff:fe12:3456.: Flags 
[P.], cksum 0x053f (incorrect -> 0xb8b1), seq 1:31, ack 1, win 342, options 
[nop,nop,TS val 2500903437 ecr 2500903437], length 30
21:00:48.109910 IP6 (flowlabel 0xd827f, hlim 64, next-header TCP (6) payload 
length: 32) fec0::5054:ff:fe12:3456. > fec0::5054:ff:fe12:3456.55804: Flags 
[.], cksum 0x0521 (incorrect -> 0x1bc1), seq 1, ack 31, win 342, options 
[nop,nop,TS val 2500903437 ecr 2500903437], length 0
21:00:48.110043 IP6 (flowlabel 0xd827f, hlim 64, next-header TCP (6) payload 
length: 56) fec0::5054:ff:fe12:3456. > fec0::5054:ff:fe12:3456.55804: Flags 
[P.], cksum 0x0539 (incorrect -> 0xb726), seq 1:25, ack 31, win 342, options 
[nop,nop,TS val 2500903438 ecr 2500903437], length 24
21:00:48.110173 IP6 (flowlabel 0x43304, hlim 64, next-header TCP (6) payload 
length: 32) fec0::5054:ff:fe12:3456.55804 > fec0::5054:ff:fe12:3456.: Flags 
[.], cksum 0x0521 (incorrect -> 0x1ba7), seq 31, ack 25, win 342, options 
[nop,nop,TS val 2500903438 ecr 2500903438], length 0
21:00:48.110211 IP6 (flowlabel 0xd827f, hlim 64, next-header TCP (6) payload 
length: 32) fec0::5054:ff:fe12:3456. > fec0::5054:ff:fe12:3456.55804: Flags 
[F.], cksum 0x0521 (incorrect -> 0x1ba7), seq 25, ack 31, win 342, options 
[nop,nop,TS val 2500903438 ecr 2500903437], length 0
21:00:48.151099 IP6 (flowlabel 0x43304, hlim 64, next-header TCP (6) payload 
length: 32) fec0::5054:ff:fe12:3456.55804 > fec0::5054:ff:fe12:3456.: Flags 
[.], cksum 0x0521 (incorrect -> 0x1ba6), seq 31, ack 26, win 342, options 
[nop,nop,TS val 2500903438 ecr 2500903438], length 0
21:00:49.110524 IP6 (flowlabel 0x43304, hlim 64, next-header TCP (6) payload 
length: 56) fec0::5054:ff:fe12:3456.55804 > fec0::5054:ff:fe12:3456.: Flags 
[P.], cksum 0x0539 (incorrect -> 0xb324), seq 31:55, ack 26, win 342, options 
[nop,nop,TS val 2500904438 ecr 2500903438], length 24
21:00:49.110637 IP6 (flowlabel 0xb34d5, hlim 64, next-header TCP (6) payload 
length: 20) fec0::5054:ff:fe12:3456. > fec0::5054:ff:fe12:3456.55804: Flags 
[R], cksum 0x0515 (incorrect -> 0x668c), seq 1923801599, win 0, length 0

The tcp reset packet has a different flowlabel, which prevents our
router from correctly closing the tcp connection. The reason is that a
normal packet gets its skb->hash from sk->sk_txhash, which is generated
randomly. ip6_make_flowlabel then uses the hash to create a flowlabel.
The reset packet doesn't get assigned a hash, so its flowlabel is
calculated from flowi6.

Since the user can't change the flowlabel of a timewait sock, we create
a flowlabel for the timewait socket from the randomly generated hash
(sk->sk_txhash), then use it in the reset packet. This way, the reset
packet will have the same flowlabel as normal packets.

This also fixes the flowlabel of the reset packet when the user
configures a flowlabel, which was previously ignored.
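
The idea of deriving the label from a stored hash can be sketched in
userspace C. This is a minimal sketch, not the kernel helper: the
function names are made up, byte order and the auto_flowlabels sysctl
handling are reduced to a single 'enabled' flag, and the rotate step is
an illustrative choice. The only hard fact relied on is that an IPv6
flow label is 20 bits wide.

```c
#include <stdint.h>

#define SKETCH_FLOWLABEL_MASK 0x000FFFFFu /* IPv6 flow label: 20 bits */

/* Rotate left; 's' must be in 1..31 for this sketch. */
static uint32_t sketch_rol32(uint32_t v, unsigned int s)
{
	return (v << s) | (v >> (32 - s));
}

/*
 * Derive a flow label from an already known hash. The same hash always
 * yields the same label, which is what lets a reset packet reuse the
 * label of the earlier packets of the connection.
 */
static uint32_t flowlabel_from_hash(uint32_t hash, int enabled)
{
	if (!enabled)
		return 0;

	/* mix the bits a little, then keep only the low 20 bits */
	hash = sketch_rol32(hash, 16);
	return hash & SKETCH_FLOWLABEL_MASK;
}
```

If the timewait socket keeps the label (or the hash) of the original
socket, the RST can carry the same label as the data packets instead of
one recomputed from flowi6.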

Cc: Eric Dumazet 
Cc: Florent Fourcot 
Signed-off-by: Shaohua Li 
---
 include/net/ipv6.h   | 25 +
 net/ipv4/tcp_minisocks.c |  8 +++-
 net/ipv6/tcp_ipv6.c  | 18 +-
 3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index f2a1ddb..0a21c17 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -804,6 +804,31 @@ static inline __be32 ip6_make_flowlabel(struct net *net, 
struct sk_buff *skb,
return flowlabel;
 }
 
+/* Like ip6_make_flowlabel, but already has hash */
+static inline __be32 ip6_make_flowlabel_from_hash(struct net *net,
+ bool autolabel, u32 hash)
+{
+   __be32 flowlabel;
+
+   if (net->ipv6.sysctl.auto_flowlabels == IP6_AUTO_FLOW_LABEL_OFF ||
+   (!autolabel &&
+net->ipv6.sysctl.auto_flowlabels != IP6_AUTO_FLOW_LABEL_FORCED))
+   return 0;
+
+   /* Since 

Re: [PATCH net-next 11/12] net: dsa: mv88e6xxx: add Energy Detect ops

2017-07-18 Thread Vivien Didelot
Hi David,

David Miller  writes:

> However, in this particular case, this issue was brought to Vivien's
> attention multiple times in the past.
>
> And I think the direct PHY poking issue is much more important than
> these seemingly endless reorganizations of the driver that Vivien is
> doing.
>
> So I personally share Andrew's serious frustration that we are doing
> constant reorgs but not addressing directly the specific issues that
> one has been made clearly aware of.

We support 26 Marvell Ethernet switch chips, so I am often comparing the
documentation of many of them to make sure the driver stops writing
arbitrary registers, ending up with many inconsistencies (like Remote
Management being currently enabled on all chips with an RMU, fix coming)
mainly due to poor documentation and device setup.

I know this looks boring, I do not particularly enjoy it myself, but I
think this is also important. I don't mind fixing the poking function as
well in the near future.


Thanks,

Vivien


Re: [PATCH net v2] udp: preserve skb->dst if required for IP options processing

2017-07-18 Thread David Miller
From: Paolo Abeni 
Date: Tue, 18 Jul 2017 11:57:55 +0200

> Eric noticed that in udp_recvmsg() we still need to access
> skb->dst while processing the IP options.
> Since commit 0a463c78d25b ("udp: avoid a cache miss on dequeue")
> skb->dst is no longer available at recvmsg() time and bad things
> will happen if we enter the relevant code path.
> 
> This commit addresses the issue by not clearing skb->dst if
> any IP options are present in the relevant skb.
> Since the IP CB is contained in the first skb cacheline, we can
> test it to decide to leverage the consume_stateless_skb()
> optimization, without measurable additional cost in the faster
> path.
> 
> v1 -> v2: updated commit message tags
> 
> Fixes: 0a463c78d25b ("udp: avoid a cache miss on dequeue")
> Reported-by: Andrey Konovalov 
> Reported-by: Eric Dumazet 
> Signed-off-by: Paolo Abeni 

Applied, thank you.


Re: [PATCH net-next 2/5] virtio-net: pack headroom into ctx for mergeable buffer

2017-07-18 Thread Michael S. Tsirkin
On Mon, Jul 17, 2017 at 08:43:58PM +0800, Jason Wang wrote:
> Pack headroom into ctx, then during XDP set, we could know the size of
> headroom and copy if needed. This is required for avoiding reset on
> XDP.

Not really when XDP is set - it's when buffers are used.

virtio-net: pack headroom into ctx for mergeable buffers

Pack headroom into ctx - this way when we get a buffer we can figure out
the actual headroom that was allocated for the buffer. Will be helpful
to optimize switching between XDP and non-XDP modes which have different
headroom requirements.


> 
> Signed-off-by: Jason Wang 
> ---
>  drivers/net/virtio_net.c | 29 -
>  1 file changed, 24 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 1f8c15c..8fae9a8 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -270,6 +270,23 @@ static void skb_xmit_done(struct virtqueue *vq)
>   netif_wake_subqueue(vi->dev, vq2txq(vq));
>  }
>  
> +#define MRG_CTX_HEADER_SHIFT 22
> +static void *mergeable_len_to_ctx(unsigned int truesize,
> +   unsigned int headroom)
> +{
> + return (void *)(unsigned long)((headroom << MRG_CTX_HEADER_SHIFT) | 
> truesize);
> +}
> +
> +static unsigned int mergeable_ctx_to_headroom(void *mrg_ctx)
> +{
> + return (unsigned long)mrg_ctx >> MRG_CTX_HEADER_SHIFT;
> +}
> +
> +static unsigned int mergeable_ctx_to_truesize(void *mrg_ctx)
> +{
> + return (unsigned long)mrg_ctx & ((1 << MRG_CTX_HEADER_SHIFT) - 1);
> +}
> +
>  /* Called from bottom half context */
>  static struct sk_buff *page_to_skb(struct virtnet_info *vi,
>  struct receive_queue *rq,
> @@ -639,13 +656,14 @@ static struct sk_buff *receive_mergeable(struct 
> net_device *dev,
>   }
>   rcu_read_unlock();
>  
> - if (unlikely(len > (unsigned long)ctx)) {
> + truesize = mergeable_ctx_to_truesize(ctx);
> + if (unlikely(len > truesize)) {
>   pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
>dev->name, len, (unsigned long)ctx);
>   dev->stats.rx_length_errors++;
>   goto err_skb;
>   }
> - truesize = (unsigned long)ctx;
> +
>   head_skb = page_to_skb(vi, rq, page, offset, len, truesize);
>   curr_skb = head_skb;
>  
> @@ -665,13 +683,14 @@ static struct sk_buff *receive_mergeable(struct 
> net_device *dev,
>   }
>  
>   page = virt_to_head_page(buf);
> - if (unlikely(len > (unsigned long)ctx)) {
> +
> + truesize = mergeable_ctx_to_truesize(ctx);
> + if (unlikely(len > truesize)) {
>   pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
>dev->name, len, (unsigned long)ctx);
>   dev->stats.rx_length_errors++;
>   goto err_skb;
>   }
> - truesize = (unsigned long)ctx;
>  
>   num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
>   if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
> @@ -889,7 +908,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
>  
>   buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset;
>   buf += headroom; /* advance address leaving hole at front of pkt */
> - ctx = (void *)(unsigned long)len;
> + ctx = mergeable_len_to_ctx(len, headroom);
>   get_page(alloc_frag->page);
>   alloc_frag->offset += len + headroom;
>   hole = alloc_frag->size - alloc_frag->offset;
> -- 
> 2.7.4
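
The packing scheme in the quoted helpers can be tried out in userspace.
The following is a sketch under assumptions: the function names are
made up, uintptr_t stands in for the kernel's unsigned long, and the
shift of 22 is taken from MRG_CTX_HEADER_SHIFT in the quoted patch.

```c
#include <stdint.h>

/* Assumed to mirror MRG_CTX_HEADER_SHIFT from the quoted patch. */
#define SKETCH_CTX_SHIFT 22

/* Headroom goes into the high bits, truesize into the low 22 bits. */
static void *sketch_len_to_ctx(unsigned int truesize, unsigned int headroom)
{
	uintptr_t v = ((uintptr_t)headroom << SKETCH_CTX_SHIFT) | truesize;

	return (void *)v;
}

static unsigned int sketch_ctx_to_headroom(void *ctx)
{
	return (unsigned int)((uintptr_t)ctx >> SKETCH_CTX_SHIFT);
}

static unsigned int sketch_ctx_to_truesize(void *ctx)
{
	uintptr_t mask = ((uintptr_t)1 << SKETCH_CTX_SHIFT) - 1;

	return (unsigned int)((uintptr_t)ctx & mask);
}
```

Note that the round trip only works while truesize stays below
1 << 22 (4 MiB); a larger value would spill into the headroom bits.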


Re: [patch net-next 00/22] mlxsw: Preparations for IPv6 UC router

2017-07-18 Thread David Miller
From: Jiri Pirko 
Date: Tue, 18 Jul 2017 10:10:08 +0200

> From: Jiri Pirko 
> 
> Ido says:
> 
> The purpose of this set is to prepare the driver for the introduction of
> IPv6 FIB offload. It's mainly composed of small and non-functional
> changes, that either add the IPv6 equivalent of existing IPv4 code or
> aimed at making the introduction of IPv6-specific code easier.
> 
> The first five patches enable IPv6 forwarding in the device and allow us
> to configure router interfaces (RIFs) based on inet6addr notifications.
> 
> The next six patches add support for programming IPv6 neighbours into
> the device's table as well as dumping their activity and updating the
> kernel accordingly.
> 
> The last 11 patches extend current infrastructure to allow us to program
> IPv6 routes, set catch-all IPv6 trap in case of abort and make the code
> more receptive towards up-coming changes.

A lot of small, easy to understand, changes.  I like.

Series applied, thanks.


Re: [RFC net 1/2] net: set skb hash for IP6 TCP reset packet

2017-07-18 Thread Shaohua Li
On Mon, Jul 17, 2017 at 09:02:57PM -0700, Eric Dumazet wrote:
> On Mon, 2017-07-17 at 14:53 -0700, Shaohua Li wrote:
> > On Mon, Jul 17, 2017 at 01:51:51AM -0700, Eric Dumazet wrote:
> > > On Thu, 2017-07-13 at 10:56 -0700, Shaohua Li wrote:
> > > > From: Shaohua Li 
> > > > 
> > > > Please see below tcpdump output:
> > > 
> > > > > The tcp reset packet has a different flowlabel, which prevents our router
> > > > > from correctly closing the tcp connection.
> > > 
> > > This looks a bug in your router, because (IPv6 only) flowlabel is not
> > > part of the tuple identifying a TCP flow.
> > 
> > Actually it's for load balance between several routers.
> 
> What happens then when flowlabel changes as I described ?
> 
> See commit 3acf3ec3f4b0 ("tcp: Change txhash on every SYN and RTO
> retransmit")

Frankly I have no idea. People in the team do think this is a problem in some
corner cases. Didn't get any report yet though.
 
> > > 
> > > >   The reason is the normal packet
> > > > gets the skb->hash from sk->sk_txhash, which is generated randomly.
> > > > ip6_make_flowlabel then uses the hash to create a flowlabel. The reset
> > > > packet doesn't get assigned a hash, so the flowlabel is calculated with
> > > > flowi6.
> > > > 
> > > > The solution is to save the hash value for timeout sock and use it for
> > > > reset packet.
> > > 
> > > I am a bit unsure why we need to add yet another field in TCP timewait
> > > structure, since :
> > > 
> > > 1) flowlabel can vary during a TCP flow lifetime.
> > > > 2) flowlabel is different under synflood (each syncookie gets a random
> > > flowlabel), and if 3rd packet comes back from the client to finish 3WHS,
> > > the flowlabel will again be different from the one that SYNACK used.
> > 
> > Is it acceptable we reuse tw_flowlabel as Florent Fourcot suggested? It 
> > makes
> > no sense to change flowlabel for no reason.
> 
> Sure, if you can find a way to keep storage as small as possible.
> 
> Current size is dangerously approaching 256 bytes, so we might soon use
> one additional cache line (64 bytes)

Will send a new patch.

Thanks,
Shaohua


Re: [PATCH net-next 2/2] liquidio: Add support to create management interface

2017-07-18 Thread Jakub Kicinski
On Mon, 17 Jul 2017 12:52:17 -0700, Felix Manlunas wrote:
> From: VSR Burru 
> 
> This patch adds support to create a virtual ethernet interface to
> communicate with Linux on LiquidIO adapter for management.
> 
> Signed-off-by: VSR Burru 
> Signed-off-by: Srinivasa Jampala 
> Signed-off-by: Satanand Burla 
> Signed-off-by: Raghu Vatsavayi 
> Signed-off-by: Felix Manlunas 

Not my call, but I have mixed feelings about this one.  Is there any
precedent under drivers/net/ethernet of exposing special communication
channels with FW like this?  It's irrelevant to me that you're running
SSH, arbitrary communication with FW from userspace is not something
netdev community usually accepts.  And I'm afraid what the effects will
be of this getting accepted.  I'm pretty sure most modern network
adapters have management CPU cores perfectly capable of running Linux.
I know NFP does, here is the out-of-tree code equivalent to this patch:

https://github.com/Netronome/nfp-drv-kmods/blob/master/src/nfpcore/nfp_net_vnic.c

I'm not looking forward to a world where I have to ssh into my NIC and
run vendor commands to configure things.


Re: [PATCH net-next 0/5] refine virtio-net XDP

2017-07-18 Thread Michael S. Tsirkin
On Tue, Jul 18, 2017 at 11:24:42AM -0700, David Miller wrote:
> From: Jason Wang 
> Date: Mon, 17 Jul 2017 20:43:56 +0800
> 
> > This series brings two optimizations for virtio-net XDP:
> > 
> > - avoid reset during XDP set
> > - turn off offloads on demand
> > 
> > Please review.
> 
> Michael, please review Jason's changes.
> 
> Thanks.

Doing that, thanks for the reminder.

-- 
MST


Re: [PATCH] atm: zatm: Fix an error handling path in 'zatm_init_one()'

2017-07-18 Thread David Miller
From: Christophe JAILLET 
Date: Mon, 17 Jul 2017 19:42:41 +0200

> If 'dma_set_mask_and_coherent()' fails, we must undo the previous
> 'pci_request_regions()' call.
> Adjust corresponding 'goto' to jump at the right place of the error
> handling path.
> 
> Signed-off-by: Christophe JAILLET 

Applied, thank you.


Re: [PATCH net-next 0/5] refine virtio-net XDP

2017-07-18 Thread David Miller
From: Jason Wang 
Date: Mon, 17 Jul 2017 20:43:56 +0800

> This series brings two optimizations for virtio-net XDP:
> 
> - avoid reset during XDP set
> - turn off offloads on demand
> 
> Please review.

Michael, please review Jason's changes.

Thanks.


Re: [PATCH] ipv4: ipv6: initialize treq->txhash in cookie_v[46]_check()

2017-07-18 Thread David Miller
From: Alexander Potapenko 
Date: Mon, 17 Jul 2017 12:35:58 +0200

> KMSAN reported use of uninitialized memory in skb_set_hash_from_sk(),
> which originated from the TCP request socket created in
> cookie_v6_check():
 ...
> Similar error is reported for cookie_v4_check().
> 
> Signed-off-by: Alexander Potapenko 
> Fixes: 58d607d3e52f ("tcp: provide skb->hash to synack packets")

Applied and queued up for -stable, thanks!


Re: [PATCH] ppp: Fix false xmit recursion detect with two ppp devices

2017-07-18 Thread David Miller
From: gfree.w...@vip.163.com
Date: Mon, 17 Jul 2017 18:34:42 +0800

> From: Gao Feng 
> 
> The global percpu variable ppp_xmit_recursion is used to detect ppp
> xmit recursion and avoid the deadlock caused by one CPU trying to take
> the xmit lock twice. But it reports false recursion when one CPU sends
> an skb through two different PPP devices, e.g. L2TP over PPPoE. That is
> actually a normal case.
> 
> Now use a percpu member of struct ppp instead of the global variable to
> detect xmit recursion on a single ppp device.
> 
> Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit")
> Signed-off-by: Gao Feng 
> Signed-off-by: Liu Jianying 

Indeed, for a per-ppp lock recursion check we need a per-ppp counter.
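
The difference between the two counters can be shown with a small
single-threaded model. This is only a sketch under assumptions: struct
ppp_dev, xmit_global() and xmit_per_dev() are hypothetical names, a
single thread stands in for one CPU, and 'lower' models a device that
transmits through another one (such as L2TP on top of PPPoE).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a ppp device stacked on another device. */
struct ppp_dev {
	int xmit_depth;        /* per-device recursion counter */
	struct ppp_dev *lower; /* device we transmit through, or NULL */
};

static int global_depth; /* models the old global percpu counter */

/* Global counter: any nested xmit on this CPU looks like recursion. */
static bool xmit_global(struct ppp_dev *dev)
{
	bool ok = true;

	if (global_depth > 0)
		return false; /* reported as recursion */
	global_depth++;
	if (dev->lower != NULL)
		ok = xmit_global(dev->lower);
	global_depth--;
	return ok;
}

/* Per-device counter: only re-entering the same device is flagged. */
static bool xmit_per_dev(struct ppp_dev *dev)
{
	bool ok = true;

	if (dev->xmit_depth > 0)
		return false; /* real recursion on this device */
	dev->xmit_depth++;
	if (dev->lower != NULL)
		ok = xmit_per_dev(dev->lower);
	dev->xmit_depth--;
	return ok;
}
```

With two stacked devices the global variant refuses the inner transmit
even though no lock would be taken twice, while the per-device variant
still catches a device that loops back into itself.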

Applied, thanks!


Re: [PATCH net-next 0/10] xfrm: remove flow cache

2017-07-18 Thread David Miller
From: Florian Westphal 
Date: Mon, 17 Jul 2017 13:57:17 +0200

> After RCU-ification of ipsec packet path there are no major scalability
> issues anymore without flow cache.
> 
> We still incur a performance hit, which comes mostly from the extra xfrm
> dst allocation/freeing.
> The last patch in the series adds a simple percpu cache to avoid the
> extra allocation if a packet matched the same policies as last one.
> 
> The main concern with this is that we will see performance drops,
> especially with large numbers of policies/SAs.
> 
> However, during hallway discussions at nfws 2017 it seemed the issues
> with flow caching outweigh the removal downsides, and that it
> might be best to just 'remove it' and see where the practical issues
> (if any) will appear.
> 
> It should now be possible to also remove the genid member in the policies
> as we don't hold bundles for prolonged time anymore, but I think
> this change is controversial (and intrusive) enough as-is, so defer
> that to a later point in time.
> 
> Changes since last rfc:
> 
> - fix build failures due to implicit interrupt.h includes
> - rework last patch (pcpu cache):
>  * avoid xchg()
>  * check policies for walk.dead = 1 instead of more costly bundle_ok().
>  * flush pcpu bundles when sa/policies get removed, to allow module
>references to go away (suggested by Ilan Tayari)

Steffen, I know you have some level of trepidation about this because
there is obviously some performance cost immediately for removing this
DoS problem.

Like the routing cache removal, most of the low hanging fruit will be
fixed shortly, and over time the bulk of the loss will be repaired one
way or another.

And, to me more importantly, killing it now gives a real incentive to
do the work for fixing that stuff now rather than later.

Therefore I have applied this series to net-next, thanks everyone.


Re: [PATCH net-next 11/11] net: switchdev: Remove bridge bypass support from switchdev

2017-07-18 Thread Vivien Didelot
Arkadi Sharshevsky  writes:

> Currently the bridge port flags, vlans, FDBs and MDBs can be offloaded
> through the bridge code, making switchdev's SELF bridge bypass
> implementation redundant. This implies several changes:
> - No need for dump infra in switchdev, DSA's special case is handled
>   privately.
> - Remove obj_dump from switchdev_ops.
> - FDBs are removed from obj_add/del routines, due to the fact that they
>   are offloaded through the bridge notifcation chain.

 notification*
 
> - The switchdev_port_bridge_xx() and switchdev_port_fdb_xx() functions
>   can be removed.
>
> Signed-off-by: Arkadi Sharshevsky 

Reviewed-by: Vivien Didelot 


Re: [PATCH net-next 00/12] net: dsa: mv88e6xxx: cleanup capabilities

2017-07-18 Thread David Miller
From: Vivien Didelot 
Date: Mon, 17 Jul 2017 13:03:34 -0400

> This patch series removes the remaining capabilities as well as the
> flags bitmap in the info structures. Most of them are turned into ops,
> or new info members.
> 
> There is no mv88e6xxx_cap enum or bitmap flags anymore, only
> mv88e6xxx_info and mv88e6xxx_ops structures.
> 
> While reviewing and documenting the related G2 registers, fix a few
> inconsistencies: 88E6185 has no interrupt in G2 and 88E6390 has a POT.
> 
> Except for these two adjustments, there are no functional changes.

Series applied, thanks Vivien.


Re: [PATCH net-next 10/11] net: bridge: Remove FDB deletion through switchdev object

2017-07-18 Thread Vivien Didelot
Arkadi Sharshevsky  writes:

> At this point no driver supports FDB add/del through switchdev object
> but rather via notification chain, thus, it is removed.
>
> Signed-off-by: Arkadi Sharshevsky 

Reviewed-by: Vivien Didelot 


Re: [PATCH net-next 09/11] net: dsa: Move FDB dump implementation inside DSA

2017-07-18 Thread Vivien Didelot
Hi Arkadi,

Arkadi Sharshevsky  writes:

> +typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid,
> +   u16 ndm_state, void *data);

Can I ask you to change u16 ndm_state for bool is_static at the same
time? Ethernet switches do not need to report more than that.

> +static int
> +dsa_slave_port_fdb_do_dump(const unsigned char *addr, u16 vid,
> +u16 ndm_state, void *data)
> +{
> + struct dsa_slave_dump_ctx *dump = data;
> + u32 portid = NETLINK_CB(dump->cb->skb).portid;
> + u32 seq = dump->cb->nlh->nlmsg_seq;
> + struct nlmsghdr *nlh;
> + struct ndmsg *ndm;
> +
> + if (dump->idx < dump->cb->args[2])
> + goto skip;
> +
> + nlh = nlmsg_put(dump->skb, portid, seq, RTM_NEWNEIGH,
> + sizeof(*ndm), NLM_F_MULTI);
> + if (!nlh)
> + return -EMSGSIZE;
> +
> + ndm = nlmsg_data(nlh);
> + ndm->ndm_family  = AF_BRIDGE;
> + ndm->ndm_pad1= 0;
> + ndm->ndm_pad2= 0;
> + ndm->ndm_flags   = NTF_SELF;
> + ndm->ndm_type= 0;
> + ndm->ndm_ifindex = dump->dev->ifindex;
> + ndm->ndm_state   = ndm_state;

So we can simply scope this here:

ndm->ndm_state = is_static ? NUD_NOARP : NUD_REACHABLE;

> +
> + if (nla_put(dump->skb, NDA_LLADDR, ETH_ALEN, addr))
> + goto nla_put_failure;
> +
> + if (vid && nla_put_u16(dump->skb, NDA_VLAN, vid))
> + goto nla_put_failure;
> +
> + nlmsg_end(dump->skb, nlh);
> +
> +skip:
> + dump->idx++;
> + return 0;
> +
> +nla_put_failure:
> + nlmsg_cancel(dump->skb, nlh);
> + return -EMSGSIZE;
> +}

Other than that, LGTM.
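
The is_static suggestion above can be sketched as a standalone
userspace example. Assumptions: the callback typedef and helper names
below are illustrative, and the two NUD_* values are redefined locally
(with a SKETCH_ prefix) using the values from the uapi header
linux/neighbour.h so that the sketch builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as NUD_REACHABLE and NUD_NOARP in linux/neighbour.h. */
#define SKETCH_NUD_REACHABLE 0x02
#define SKETCH_NUD_NOARP     0x40

/* Callback shape with a bool instead of a full ndm_state. */
typedef int sketch_fdb_dump_cb_t(const unsigned char *addr, uint16_t vid,
				 bool is_static, void *data);

/* The driver reports static vs. learned; the core picks the state. */
static uint16_t sketch_fdb_state(bool is_static)
{
	return is_static ? SKETCH_NUD_NOARP : SKETCH_NUD_REACHABLE;
}

/* Example callback: store the resulting state in *data. */
static int sketch_record_state(const unsigned char *addr, uint16_t vid,
			       bool is_static, void *data)
{
	(void)addr;
	(void)vid;
	*(uint16_t *)data = sketch_fdb_state(is_static);
	return 0;
}
```

This keeps the netlink state encoding in one place in the core, so the
switch drivers only need to say whether an entry is static.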


Thanks,

Vivien


Re: [PATCH net-next 11/12] net: dsa: mv88e6xxx: add Energy Detect ops

2017-07-18 Thread David Miller
From: Florian Fainelli 
Date: Tue, 18 Jul 2017 09:01:01 -0700

> On 07/17/2017 02:10 PM, David Miller wrote:
>> From: Andrew Lunn 
>> Date: Mon, 17 Jul 2017 23:04:05 +0200
>> 
>>> On Mon, Jul 17, 2017 at 01:45:49PM -0700, David Miller wrote:
>>>> From: Vivien Didelot 
>>>> Date: Mon, 17 Jul 2017 15:32:52 -0400
>>>>
>>>>> Hi Andrew,
>>>>>
>>>>> Andrew Lunn  writes:
>>>>>
>>>>>> I never liked this. I think it is architecturally wrong for the switch
>>>>>> to be poking around in the PHY. It should ask the PHY driver. This is
>>>>>> especially true for external PHYs which might not be a Marvell PHY.
>>>>>
>>>>> I share the same concern. However this patch is just isolating the
>>>>> existing code so that we get rid of the last caps and flags and stop
>>>>> writing (without reading them first) arbitrary registers.
>>>>>
>>>>> Once this portion is moved to the PHY driver, one can remove it from
>>>>> mv88e6xxx.
>>>>
>>>> Seems a reasonable plan of action.
>>>>
>>>> Andrew, do you agree?
>>>
>>> Hi David
>>>
>>> I just fear it will not get fixed, just put into a corner to
>>> fester. Having to fix it properly before these patches are merged
>>> provides some incentive.
>> 
>> If Vivien doesn't make good on his promises to do so, tell me and
>> I will revert all of these changes.
>> 
>> Ok?
> 
> This seems to be completely unfair to Vivien, there is nothing wrong
> with his patch series per-se other than he was unfortunate enough he
> highlighted something that needs fixing. This was not a serious enough
> problem before and it cannot possibly be one now either with just a code
> move.
> 
> On a general note, we cannot have whoever was the last one to touch a
> piece of code that makes us see that this or that said piece of code is
> less than ideal be selected as the random victim for doing that cleanup,
> this just does not work. I know this is standard practice in Linux and
> other open source software (been there before with the USB maintainers),
> but this creates only one thing: making you want to run away and scream
> lalalalala.
> 
> So let's be pragmatic and maintain a public TODO list for this driver
> that people can pick items to fix/cleanup/change that have been
> identified as candidates for patches.

However, in this particular case, this issue was brought to Vivien's
attention multiple times in the past.

And I think the direct PHY poking issue is much more important than
these seemingly endless reorganizations of the driver that Vivien is
doing.

So I personally share Andrew's serious frustration that we are doing
constant reorgs but not addressing directly the specific issues that
one has been made clearly aware of.

Thanks.


Re: Use sock_diag instead of procfs for new address families?

2017-07-18 Thread David Miller
From: Stefan Hajnoczi 
Date: Tue, 18 Jul 2017 17:18:06 +0100

> I am implementing userspace access to socket information for AF_VSOCK.
> A few hours into writing and testing a /proc/net/vsock seq_file I
> noticed that ss(8) prefers NETLINK_SOCK_DIAG over procfs.
> 
> Before potentially wasting time implementing a legacy interface that
> won't be accepted, I thought it might be good to ask :).
> 
> Which approach is preferred?
> 1. New address families must implement only sock_diag.
> 2. Both sock_diag and procfs must be implemented.
> 3. Implement whichever interface you prefer.

Do not use procfs, that is for sure.


Re: [PATCH net-next 08/11] net: dsa: Remove redundant MDB dump support

2017-07-18 Thread Vivien Didelot
Hi Arkadi,

Arkadi Sharshevsky  writes:

> Currently the MDB HW database is synced with the bridge's one, thus
> there is no need to support special dump functionality.
>
> Signed-off-by: Arkadi Sharshevsky 
> ---
>  drivers/net/dsa/microchip/ksz_common.c |  9 -
>  drivers/net/dsa/mv88e6xxx/chip.c   | 24 
>  include/net/dsa.h  |  4 
>  net/dsa/dsa_priv.h |  2 --
>  net/dsa/port.c | 11 ---
>  net/dsa/slave.c|  3 ---

Same request as in the VLAN dump deletion patch, please.


Thank you,

Vivien


Re: [PATCH net-next 07/11] net: dsa: Remove support for bypass bridge port attributes/vlan set

2017-07-18 Thread Vivien Didelot
Hi Arkadi,

Arkadi Sharshevsky  writes:

> The bridge port attributes/vlan for DSA devices should be set only
> from bridge code. Furthermore, The vlans are synced totally with the
> bridge so there is no need for special dump support.
>
> Signed-off-by: Arkadi Sharshevsky 
> ---
>  drivers/net/dsa/b53/b53_common.c   | 44 --
>  drivers/net/dsa/b53/b53_priv.h |  3 --
>  drivers/net/dsa/bcm_sf2.c  |  1 -
>  drivers/net/dsa/dsa_loop.c | 38 ---
>  drivers/net/dsa/microchip/ksz_common.c | 41 -
>  drivers/net/dsa/mv88e6xxx/chip.c   | 56 
> --
>  include/net/dsa.h  |  4 ---
>  net/dsa/dsa_priv.h |  4 ---
>  net/dsa/port.c | 12 
>  net/dsa/slave.c|  6 

Regarding this massive deletion, can you please split it in two patches,
one deleting first the DSA core usage of .port_vlan_dump, i.e. in:

net/dsa/dsa_priv.h
net/dsa/port.c
net/dsa/slave.c

Then a second patch which deletes the .port_vlan_dump implementations?

This may sound useless but it will actually make it easy for us to
restore the VLAN dump support in drivers once we introduce an
alternative way to query the hardware.


Thanks,

Vivien


Re: [PATCH net-next 06/11] net: dsa: Add support for querying supported bridge flags

2017-07-18 Thread Vivien Didelot
Arkadi Sharshevsky  writes:

> The DSA drivers do not support bridge flags offload. Yet, this attribute
> should be added in order for the bridge to fail when one tries set a
> flag on the port, as explained in commit dc0ecabd6231 ("net: switchdev:
> Add support for querying supported bridge flags by hardware").
>
> Signed-off-by: Arkadi Sharshevsky 

Reviewed-by: Vivien Didelot 


Re: [PATCH net-next 05/11] net: dsa: Remove support for FDB add/del via SELF

2017-07-18 Thread Vivien Didelot
Arkadi Sharshevsky  writes:

> FDB add/del can be added via switchdev notification chain. Thus the support
> for configuration via switchdev objects can be removed.
>
> Signed-off-by: Arkadi Sharshevsky 

Reviewed-by: Vivien Didelot 


Re: [PATCH net-next 04/11] net: dsa: Add support for learning FDB through notification

2017-07-18 Thread Vivien Didelot
Hi Arkadi,

Arkadi Sharshevsky  writes:

> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -451,6 +451,7 @@ void unregister_switch_driver(struct dsa_switch_driver 
> *type);
>  struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev);
>  
>  struct net_device *dsa_dev_to_net_device(struct device *dev);
> +bool dsa_schedule_work(struct work_struct *work);

You forgot to move this declaration to net/dsa/dsa_priv.h, since this is
private to DSA core and does not need to be exposed to drivers ;-)

> + err = unregister_netdevice_notifier(&dsa_slave_switchdev_notifier);
> + if (err)
> + pr_err("DSA: failed to unregister switchdev notifier (%d)\n", 
> err);

I think you meant unregister_switchdev_notifier() here.

Thanks,

Vivien


Re: [PATCH v1 1/1] dt-binding: ptp: Add SoC compatibility strings for dte ptp clock

2017-07-18 Thread Arun Parameswaran
Hi David,

On 17-07-10 06:44 AM, Rob Herring wrote:
> On Thu, Jul 06, 2017 at 10:37:57AM -0700, Arun Parameswaran wrote:
>> Add SoC specific compatibility strings to the Broadcom DTE
>> based PTP clock binding document.
>>
>> Fixed the document heading and node name.
>>
>> Fixes: 80d6076140b2 ("dt-binding: ptp: add bindings document for dte based 
>> ptp clock")
>> Signed-off-by: Arun Parameswaran 
>> ---
>>  Documentation/devicetree/bindings/ptp/brcm,ptp-dte.txt | 15 +++
>>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> Acked-by: Rob Herring 
>
Will you be picking up this change ?

Thanks
Arun


Re: Use sock_diag instead of procfs for new address families?

2017-07-18 Thread Stephen Hemminger
On Tue, 18 Jul 2017 17:18:06 +0100
Stefan Hajnoczi  wrote:

> I am implementing userspace access to socket information for AF_VSOCK.
> A few hours into writing and testing a /proc/net/vsock seq_file I
> noticed that ss(8) prefers NETLINK_SOCK_DIAG over procfs.
> 
> Before potentially wasting time implementing a legacy interface that
> won't be accepted, I thought it might be good to ask :).
> 
> Which approach is preferred?
> 1. New address families must implement only sock_diag.
> 2. Both sock_diag and procfs must be implemented.
> 3. Implement whichever interface you prefer.
> 
> Thanks,
> Stefan

You are correct, I am unlikely to take any new code using /proc
in ss.


Use sock_diag instead of procfs for new address families?

2017-07-18 Thread Stefan Hajnoczi
I am implementing userspace access to socket information for AF_VSOCK.
A few hours into writing and testing a /proc/net/vsock seq_file I
noticed that ss(8) prefers NETLINK_SOCK_DIAG over procfs.

Before potentially wasting time implementing a legacy interface that
won't be accepted, I thought it might be good to ask :).

Which approach is preferred?
1. New address families must implement only sock_diag.
2. Both sock_diag and procfs must be implemented.
3. Implement whichever interface you prefer.

Thanks,
Stefan


Re: [PATCH net-next 11/12] net: dsa: mv88e6xxx: add Energy Detect ops

2017-07-18 Thread Florian Fainelli
On 07/17/2017 02:10 PM, David Miller wrote:
> From: Andrew Lunn 
> Date: Mon, 17 Jul 2017 23:04:05 +0200
> 
>> On Mon, Jul 17, 2017 at 01:45:49PM -0700, David Miller wrote:
>>> From: Vivien Didelot 
>>> Date: Mon, 17 Jul 2017 15:32:52 -0400
>>>
>>>> Hi Andrew,
>>>>
>>>> Andrew Lunn  writes:
>>>>
>>>>> I never liked this. I think it is architecturally wrong for the switch
>>>>> to be poking around in the PHY. It should ask the PHY driver. This is
>>>>> especially true for external PHYs which might not be a Marvell PHY.
>>>>
>>>> I share the same concern. However this patch is just isolating the
>>>> existing code so that we get rid of the last caps and flags and stop
>>>> writing (without reading them first) arbitrary registers.
>>>>
>>>> Once this portion is moved to the PHY driver, one can remove it from
>>>> mv88e6xxx.
>>>
>>> Seems a reasonable plan of action.
>>>
>>> Andrew, do you agree?
>>
>> Hi David
>>
>> I just fear it will not get fixed, just put into a corner to
>> fester. Having to fix it properly before these patches are merged
>> provides some incentive.
> 
> If Vivien doesn't make good on his promises to do so, tell me and
> I will revert all of these changes.
> 
> Ok?

This seems to be completely unfair to Vivien, there is nothing wrong
with his patch series per se, other than that he was unfortunate enough to
highlight something that needs fixing. This was not a serious enough
problem before and it cannot possibly be one now either with just a code
move.

On a general note, we cannot have whoever was the last one to touch a
piece of code that makes us see that this or that said piece of code is
less than ideal be selected as the random victim for doing that cleanup,
this just does not work. I know this is standard practice in Linux and
other open source software (been there before with the USB maintainers),
but this creates only one thing: making you want to run away and scream
lalalalala.

So let's be pragmatic and maintain a public TODO list for this driver
that people can pick items to fix/cleanup/change that have been
identified as candidates for patches.
-- 
Florian


[Resend, PATCH v1] ISDN: eicon: switch to use native bitmaps

2017-07-18 Thread Andy Shevchenko
Two arrays are clearly bit maps, so, make that explicit by converting to
bitmap API and remove custom helpers.

Note that sig_ind() uses an out-of-bounds bit, apparently to protect against
potential bitmap_empty() checks for the same bitmap.

This patch removes that since:
1) that didn't guarantee atomicity anyway;
2) the first operation inside the for-loop is set bit in the bitmap
   (which effectively makes it non-empty);
3) group_optimization() doesn't utilize possible emptiness of the bitmap
   in question.

Thus, if there is a protection needed it should be implemented properly.
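
As a rough illustration of what the open-coded helpers being removed amount to, here is a minimal userspace sketch of a fixed-size bitmap. It is not the driver code: struct appl_mask, the mask_*() names and the value 240 for MAX_APPL are assumptions made for the example. In the kernel, bitmap_zero(), set_bit(), clear_bit(), test_bit() and bitmap_empty() cover the same operations.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_APPL 240 /* assumed value, for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((MAX_APPL + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for a DECLARE_BITMAP(name, MAX_APPL) member. */
struct appl_mask {
	unsigned long bits[MASK_WORDS];
};

/* Clear every bit (kernel analogue: bitmap_zero()). */
static inline void mask_zero(struct appl_mask *m)
{
	memset(m->bits, 0, sizeof(m->bits));
}

/* Set bit b (kernel analogue: set_bit() / __set_bit()). */
static inline void mask_set(struct appl_mask *m, unsigned int b)
{
	assert(b < MAX_APPL);
	m->bits[b / BITS_PER_WORD] |= 1UL << (b % BITS_PER_WORD);
}

/* Clear bit b (kernel analogue: clear_bit() / __clear_bit()). */
static inline void mask_clear(struct appl_mask *m, unsigned int b)
{
	assert(b < MAX_APPL);
	m->bits[b / BITS_PER_WORD] &= ~(1UL << (b % BITS_PER_WORD));
}

/* Test bit b (kernel analogue: test_bit()). */
static inline bool mask_test(const struct appl_mask *m, unsigned int b)
{
	assert(b < MAX_APPL);
	return (m->bits[b / BITS_PER_WORD] >> (b % BITS_PER_WORD)) & 1UL;
}

/* True when no bit is set (kernel analogue: bitmap_empty()). */
static inline bool mask_empty(const struct appl_mask *m)
{
	size_t i;

	for (i = 0; i < MASK_WORDS; i++)
		if (m->bits[i])
			return false;
	return true;
}
```

Unlike the old C_IND_MASK_DWORDS sizing of (MAX_APPL + 32) >> 5, this sketch reserves no spare bit past MAX_APPL, which matches the point above about dropping the out-of-bounds marker bit.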

Signed-off-by: Andy Shevchenko 
---
- resend after v4.13-rc1 is out
 drivers/isdn/hardware/eicon/divacapi.h |  16 +--
 drivers/isdn/hardware/eicon/message.c  | 247 -
 2 files changed, 58 insertions(+), 205 deletions(-)

diff --git a/drivers/isdn/hardware/eicon/divacapi.h 
b/drivers/isdn/hardware/eicon/divacapi.h
index a315a2914d70..c4868a0d82f4 100644
--- a/drivers/isdn/hardware/eicon/divacapi.h
+++ b/drivers/isdn/hardware/eicon/divacapi.h
@@ -26,15 +26,7 @@
 
 /*#define DEBUG */
 
-
-
-
-
-
-
-
-
-
+#include <linux/bitmap.h>
 
 #define IMPLEMENT_DTMF 1
 #define IMPLEMENT_LINE_INTERCONNECT2 1
@@ -82,8 +74,6 @@
 #define CODEC_PERMANENT0x02
 #define ADV_VOICE  0x03
 #define MAX_CIP_TYPES  5  /* kind of CIP types for group optimization */
-#define C_IND_MASK_DWORDS  ((MAX_APPL + 32) >> 5)
-
 
 #define FAX_CONNECT_INFO_BUFFER_SIZE  256
 #define NCPI_BUFFER_SIZE  256
@@ -265,8 +255,8 @@ struct _PLCI {
word  ncci_ring_list;
byte  inc_dis_ncci_table[MAX_CHANNELS_PER_PLCI];
t_std_internal_command 
internal_command_queue[MAX_INTERNAL_COMMAND_LEVELS];
-   dword c_ind_mask_table[C_IND_MASK_DWORDS];
-   dword group_optimization_mask_table[C_IND_MASK_DWORDS];
+   DECLARE_BITMAP(c_ind_mask_table, MAX_APPL);
+   DECLARE_BITMAP(group_optimization_mask_table, MAX_APPL);
byte  RBuffer[200];
dword msg_in_queue[MSG_IN_QUEUE_SIZE/sizeof(dword)];
API_SAVE  saved_msg;
diff --git a/drivers/isdn/hardware/eicon/message.c 
b/drivers/isdn/hardware/eicon/message.c
index 3b11422b1cce..eadd1ed1e014 100644
--- a/drivers/isdn/hardware/eicon/message.c
+++ b/drivers/isdn/hardware/eicon/message.c
@@ -23,9 +23,7 @@
  *
  */
 
-
-
-
+#include <linux/bitmap.h>
 
 #include "platform.h"
 #include "di_defs.h"
@@ -35,19 +33,9 @@
 #include "mdm_msg.h"
 #include "divasync.h"
 
-
-
 #define FILE_ "MESSAGE.C"
 #define dprintf
 
-
-
-
-
-
-
-
-
 /*--*/
 /* This is options supported for all adapters that are server by*/
 /* XDI driver. Allo it is not necessary to ask it from every adapter*/
@@ -72,9 +60,6 @@ static dword diva_xdi_extended_features = 0;
 /*--*/
 
 static void group_optimization(DIVA_CAPI_ADAPTER *a, PLCI *plci);
-static void set_group_ind_mask(PLCI *plci);
-static void clear_group_ind_mask_bit(PLCI *plci, word b);
-static byte test_group_ind_mask_bit(PLCI *plci, word b);
 void AutomaticLaw(DIVA_CAPI_ADAPTER *);
 word CapiRelease(word);
 word CapiRegister(word);
@@ -1087,106 +1072,6 @@ static void plci_remove(PLCI *plci)
 }
 
 /*--*/
-/* Application Group function helpers   */
-/*--*/
-
-static void set_group_ind_mask(PLCI *plci)
-{
-   word i;
-
-   for (i = 0; i < C_IND_MASK_DWORDS; i++)
-   plci->group_optimization_mask_table[i] = 0xffffffffL;
-}
-
-static void clear_group_ind_mask_bit(PLCI *plci, word b)
-{
-   plci->group_optimization_mask_table[b >> 5] &= ~(1L << (b & 0x1f));
-}
-
-static byte test_group_ind_mask_bit(PLCI *plci, word b)
-{
-   return ((plci->group_optimization_mask_table[b >> 5] & (1L << (b & 
0x1f))) != 0);
-}
-
-/*--*/
-/* c_ind_mask operations for arbitrary MAX_APPL */
-/*--*/
-
-static void clear_c_ind_mask(PLCI *plci)
-{
-   word i;
-
-   for (i = 0; i < C_IND_MASK_DWORDS; i++)
-   plci->c_ind_mask_table[i] = 0;
-}
-
-static byte c_ind_mask_empty(PLCI *plci)
-{
-   word i;
-
-   i = 0;
-   while ((i < C_IND_MASK_DWORDS) && (plci->c_ind_mask_table[i] == 0))
-   i++;
-   return (i == C_IND_MASK_DWORDS);
-}
-
-static void set_c_ind_mask_bit(PLCI *plci, word b)
-{
-   plci->c_ind_mask_table[b >> 5] |= (1L << (b & 0x1f));
-}
-
-static void clear_c_ind_mask_bit(PLCI *plci, word b)
-{
-   plci->c_ind_mask_table[b >> 5] &= ~(1L << (b & 0x1f));
-}
-
-static byte test_c_ind_mask_bit(PLCI *plci, word b)
-{
-   return ((plci->c_ind_mask_table[b >> 5] & (1L 

[PATCH net-next] sfc: Add ethtool -m support for QSFP modules

2017-07-18 Thread Martin Habets
This also adds support for non-QSFP modules attached to QSFP.

Signed-off-by: Martin Habets 
---
 drivers/net/ethernet/sfc/mcdi_port.c |  224 +++---
 1 file changed, 181 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi_port.c 
b/drivers/net/ethernet/sfc/mcdi_port.c
index c905971c5f3a..d3f96a8f743b 100644
--- a/drivers/net/ethernet/sfc/mcdi_port.c
+++ b/drivers/net/ethernet/sfc/mcdi_port.c
@@ -746,59 +746,171 @@ static const char *efx_mcdi_phy_test_name(struct efx_nic 
*efx,
return NULL;
 }
 
-#define SFP_PAGE_SIZE  128
-#define SFP_NUM_PAGES  2
-static int efx_mcdi_phy_get_module_eeprom(struct efx_nic *efx,
- struct ethtool_eeprom *ee, u8 *data)
+#define SFP_PAGE_SIZE  128
+#define SFF_DIAG_TYPE_OFFSET   92
+#define SFF_DIAG_ADDR_CHANGE   BIT(2)
+#define SFF_8079_NUM_PAGES 2
+#define SFF_8472_NUM_PAGES 4
+#define SFF_8436_NUM_PAGES 5
+#define SFF_DMT_LEVEL_OFFSET   94
+
+/** efx_mcdi_phy_get_module_eeprom_page() - Get a single page of module eeprom
+ * @efx:   NIC context
+ * @page:  EEPROM page number
+ * @data:  Destination data pointer
+ * @offset:Offset in page to copy from in to data
+ * @space: Space available in data
+ *
+ * Return:
+ *   >=0 - amount of data copied
+ *   <0  - error
+ */
+static int efx_mcdi_phy_get_module_eeprom_page(struct efx_nic *efx,
+  unsigned int page,
+  u8 *data, ssize_t offset,
+  ssize_t space)
 {
MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_PHY_MEDIA_INFO_OUT_LENMAX);
MCDI_DECLARE_BUF(inbuf, MC_CMD_GET_PHY_MEDIA_INFO_IN_LEN);
size_t outlen;
-   int rc;
unsigned int payload_len;
-   unsigned int space_remaining = ee->len;
-   unsigned int page;
-   unsigned int page_off;
unsigned int to_copy;
-   u8 *user_data = data;
+   int rc;
 
-   BUILD_BUG_ON(SFP_PAGE_SIZE * SFP_NUM_PAGES != ETH_MODULE_SFF_8079_LEN);
+   if (offset > SFP_PAGE_SIZE)
+   return -EINVAL;
 
-   page_off = ee->offset % SFP_PAGE_SIZE;
-   page = ee->offset / SFP_PAGE_SIZE;
+   to_copy = min(space, SFP_PAGE_SIZE - offset);
 
-   while (space_remaining && (page < SFP_NUM_PAGES)) {
-   MCDI_SET_DWORD(inbuf, GET_PHY_MEDIA_INFO_IN_PAGE, page);
+   MCDI_SET_DWORD(inbuf, GET_PHY_MEDIA_INFO_IN_PAGE, page);
+   rc = efx_mcdi_rpc_quiet(efx, MC_CMD_GET_PHY_MEDIA_INFO,
+   inbuf, sizeof(inbuf),
+   outbuf, sizeof(outbuf),
+   &outlen);
 
-   rc = efx_mcdi_rpc(efx, MC_CMD_GET_PHY_MEDIA_INFO,
- inbuf, sizeof(inbuf),
- outbuf, sizeof(outbuf),
- &outlen);
-   if (rc)
-   return rc;
+   if (rc)
+   return rc;
+
+   if (outlen < (MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_OFST +
+   SFP_PAGE_SIZE))
+   return -EIO;
+
+   payload_len = MCDI_DWORD(outbuf, GET_PHY_MEDIA_INFO_OUT_DATALEN);
+   if (payload_len != SFP_PAGE_SIZE)
+   return -EIO;
 
-   if (outlen < (MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_OFST +
- SFP_PAGE_SIZE))
-   return -EIO;
+   memcpy(data, MCDI_PTR(outbuf, GET_PHY_MEDIA_INFO_OUT_DATA) + offset,
+  to_copy);
 
-   payload_len = MCDI_DWORD(outbuf,
-GET_PHY_MEDIA_INFO_OUT_DATALEN);
-   if (payload_len != SFP_PAGE_SIZE)
-   return -EIO;
+   return to_copy;
+}
 
-   /* Copy as much as we can into data */
-   payload_len -= page_off;
-   to_copy = (space_remaining < payload_len) ?
-   space_remaining : payload_len;
+static int efx_mcdi_phy_get_module_eeprom_byte(struct efx_nic *efx,
+  unsigned int page,
+  u8 byte)
+{
+   int rc;
+   u8 data;
 
-   memcpy(user_data,
-  MCDI_PTR(outbuf, GET_PHY_MEDIA_INFO_OUT_DATA) + page_off,
-  to_copy);
+   rc = efx_mcdi_phy_get_module_eeprom_page(efx, page, &data, byte, 1);
+   if (rc == 1)
+   return data;
+
+   return rc;
+}
+
+static int efx_mcdi_phy_diag_type(struct efx_nic *efx)
+{
+   /* Page zero of the EEPROM includes the diagnostic type at byte 92. */
+   return efx_mcdi_phy_get_module_eeprom_byte(efx, 0,
+  SFF_DIAG_TYPE_OFFSET);
+}
 
-   space_remaining -= to_copy;
-   user_data += to_copy;
-   page_off = 0;
-   

RE: [PATCH v2 net-next 0/3] liquidio: avoid vm low memory crashes

2017-07-18 Thread Ricardo Farrington
My apologies Leon - I did not infer that the subject line should have been 
changed from your previous correspondence.  I will correct it.

Rick

-Original Message-
From: Leon Romanovsky [mailto:l...@kernel.org] 
Sent: Monday, July 17, 2017 11:23 PM
To: Manlunas, Felix 
Cc: da...@davemloft.net; netdev@vger.kernel.org; Vatsavayi, Raghu 
; Chickles, Derek ; 
Burla, Satananda ; Ricardo Farrington 

Subject: Re: [PATCH v2 net-next 0/3] liquidio: avoid vm low memory crashes

On Mon, Jul 17, 2017 at 05:49:20PM -0700, Felix Manlunas wrote:
> From: Rick Farrington 
>
> This patchset addresses issues brought about by low memory conditions 
> in a VM.  These conditions were not seen when the driver was exercised 
> normally.  Rather, they were brought about through manual fault injection.
> They are being included in the interest of hardening the driver 
> against unforeseen circumstances.
>
> 1. Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'.
>This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry
>in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.
> 2. Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().
> 3. For defensive programming, zero the allocated block 'oct->droq[0]' in
>octeon_setup_output_queues() and 'oct->instr_queue[0]' in
>octeon_setup_instr_queues().
>
> change log:
> V1 -> V2:
> 1. Corrected syntax in 'Subject' lines; no functional or code changes.
>
> Rick Farrington (3):
>   liquidio: lowmem: init allocated memory to 0
>   liquidio: lowmem: do not dereference null ptr
>   liquidio: lowmem: init allocated memory to 0

I'm feeling déjà vu here. We already discussed that zero allocated arrays have 
nothing to do with low memory conditions. Why are you continuing to use this 
misleading term here?

>
>  drivers/net/ethernet/cavium/liquidio/octeon_device.c | 8 
>  drivers/net/ethernet/cavium/liquidio/octeon_droq.c   | 6 --
>  2 files changed, 8 insertions(+), 6 deletions(-)
>
> --
> 2.9.0
>


commit 16ecba59 breaks 82574L under heavy load.

2017-07-18 Thread Lennart Sorensen
Commit 16ecba59bc333d6282ee057fb02339f77a880beb has apparently broken
at least the 82574L under heavy load (as in load heavy enough to cause
packet drops).  In this case, when running in MSI-X mode, the Other
Causes interrupt fires about 3000 times per second, but not due to link
state changes.  Unfortunately this commit changed the driver to assume
that the Other Causes interrupt can only mean link state change and
hence sets the flag that (unfortunately) means both link is down and link
state should be checked.  Since this now happens 3000 times per second,
the chances of it happening while the watchdog_task is checking the link
state becomes pretty high, and if it does happen to coincide, then the
watchdog_task will reset the adapter, which causes a real loss of link.

Reverting the commit makes everything work fine again (of course packets
are still dropped, but at least the link stays up, the adapter isn't
reset, and most packets make it through).

I tried checking what the bits in the ICR actually were under these
conditions, and it would appear that the only bit set is 24 (the Other
Causes interrupt bit).  So I don't know what the real cause is although
rx buffer overrun would be my guess, and in fact I see nothing in the
datasheet indicating that you can actually disable the rx buffer overrun
from generating an interrupt.

Prior to this commit, the interrupt handler explicitly checked that the
interrupt was caused by a link state change and only then did it trigger
a recheck which worked fine and did not cause incorrect adapter resets,
although it of course still had lots of undesired interrupts to deal with.
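
To make the difference concrete, here is a minimal sketch of the pre-commit behaviour described above, written as plain userspace C rather than driver code. struct adapter_state, handle_other_irq() and the spurious counter are hypothetical names. ICR_OTHER is bit 24 as noted above; ICR_LSC as bit 2 is taken from memory of the e1000e register definitions and should be checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICR_LSC   (1u << 2)	/* link status change (assumed bit position) */
#define ICR_OTHER (1u << 24)	/* "Other Causes" summary bit */

/* Hypothetical stand-in for the relevant adapter state. */
struct adapter_state {
	bool get_link_status;		/* ask the watchdog to recheck the link */
	unsigned int spurious_other;	/* Other Causes interrupts without LSC */
};

/*
 * Handle an "Other Causes" interrupt given the value read from ICR.
 * Only a genuine link status change requests a link recheck; anything
 * else is counted and otherwise ignored, so the watchdog never sees a
 * false "link down" hint.
 */
static inline void handle_other_irq(struct adapter_state *st, uint32_t icr)
{
	if (icr & ICR_LSC) {
		st->get_link_status = true;
		return;
	}
	st->spurious_other++;
}
```

With this shape, the roughly 3000 interrupts per second that carry only bit 24 would bump the counter and leave get_link_status untouched.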

Of course ideally there would be a way to make these 3000 pointless
interrupts per second not happen, but unless there is a way to determine
that, I think this commit needs reverting, since it apparently causes
link failures on actual hardware that exists.

The ports are onboard intel 82574L on a Supermicro X7SPA-HF-D525 with
1.2a BIOS (upgrading to 1.2b to check if it makes a difference is not
an option unfortunately).

-- 
Len Sorensen


[PATCH net-next 03/11] net: dsa: Remove switchdev dependency from DSA switch notifier chain

2017-07-18 Thread Arkadi Sharshevsky
Currently, the switchdev objects are embedded inside the DSA notifier
info. This patch removes this dependency. This is done as a preparation
stage before adding support for learning FDB through the switchdev
notification chain.

Signed-off-by: Arkadi Sharshevsky 
Reviewed-by: Florian Fainelli 
Reviewed-by: Vivien Didelot 
---
 net/dsa/dsa_priv.h | 11 ++-
 net/dsa/port.c | 15 +--
 net/dsa/slave.c|  6 --
 net/dsa/switch.c   | 11 ---
 4 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 428402f..2b2f124 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -43,9 +43,10 @@ struct dsa_notifier_bridge_info {
 
 /* DSA_NOTIFIER_FDB_* */
 struct dsa_notifier_fdb_info {
-   const struct switchdev_obj_port_fdb *fdb;
int sw_index;
int port;
+   const unsigned char *addr;
+   u16 vid;
 };
 
 /* DSA_NOTIFIER_MDB_* */
@@ -119,10 +120,10 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool 
vlan_filtering,
struct switchdev_trans *trans);
 int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock,
 struct switchdev_trans *trans);
-int dsa_port_fdb_add(struct dsa_port *dp,
-const struct switchdev_obj_port_fdb *fdb);
-int dsa_port_fdb_del(struct dsa_port *dp,
-const struct switchdev_obj_port_fdb *fdb);
+int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
+u16 vid);
+int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
+u16 vid);
 int dsa_port_fdb_dump(struct dsa_port *dp, struct switchdev_obj_port_fdb *fdb,
  switchdev_obj_dump_cb_t *cb);
 int dsa_port_mdb_add(struct dsa_port *dp,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index bd271b9..86e0585 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -146,25 +146,28 @@ int dsa_port_ageing_time(struct dsa_port *dp, clock_t 
ageing_clock,
return dsa_port_notify(dp, DSA_NOTIFIER_AGEING_TIME, &info);
 }
 
-int dsa_port_fdb_add(struct dsa_port *dp,
-const struct switchdev_obj_port_fdb *fdb)
+int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
+u16 vid)
 {
struct dsa_notifier_fdb_info info = {
.sw_index = dp->ds->index,
.port = dp->index,
-   .fdb = fdb,
+   .addr = addr,
+   .vid = vid,
};
 
return dsa_port_notify(dp, DSA_NOTIFIER_FDB_ADD, &info);
 }
 
-int dsa_port_fdb_del(struct dsa_port *dp,
-const struct switchdev_obj_port_fdb *fdb)
+int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
+u16 vid)
 {
struct dsa_notifier_fdb_info info = {
.sw_index = dp->ds->index,
.port = dp->index,
-   .fdb = fdb,
+   .addr = addr,
+   .vid = vid,
+
};
 
return dsa_port_notify(dp, DSA_NOTIFIER_FDB_DEL, &info);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index b4e68b2..19395cc 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -253,7 +253,8 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
case SWITCHDEV_OBJ_ID_PORT_FDB:
if (switchdev_trans_ph_prepare(trans))
return 0;
-   err = dsa_port_fdb_add(dp, SWITCHDEV_OBJ_PORT_FDB(obj));
+   err = dsa_port_fdb_add(dp, SWITCHDEV_OBJ_PORT_FDB(obj)->addr,
+  SWITCHDEV_OBJ_PORT_FDB(obj)->vid);
break;
case SWITCHDEV_OBJ_ID_PORT_MDB:
err = dsa_port_mdb_add(dp, SWITCHDEV_OBJ_PORT_MDB(obj), trans);
@@ -279,7 +280,8 @@ static int dsa_slave_port_obj_del(struct net_device *dev,
 
switch (obj->id) {
case SWITCHDEV_OBJ_ID_PORT_FDB:
-   err = dsa_port_fdb_del(dp, SWITCHDEV_OBJ_PORT_FDB(obj));
+   err = dsa_port_fdb_del(dp, SWITCHDEV_OBJ_PORT_FDB(obj)->addr,
+  SWITCHDEV_OBJ_PORT_FDB(obj)->vid);
break;
case SWITCHDEV_OBJ_ID_PORT_MDB:
err = dsa_port_mdb_del(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index eb20e0f..e6c06aa 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -83,8 +83,6 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 static int dsa_switch_fdb_add(struct dsa_switch *ds,
  struct dsa_notifier_fdb_info *info)
 {
-   const struct switchdev_obj_port_fdb *fdb = info->fdb;
-
/* Do not care yet about other switch chips of the fabric */
if (ds->index != info->sw_index)
return 0;
@@ -92,14 +90,13 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds,
if 

[PATCH net-next 08/11] net: dsa: Remove redundant MDB dump support

2017-07-18 Thread Arkadi Sharshevsky
Currently the MDB HW database is synced with the bridge's one, thus
there is no need to support special dump functionality.

Signed-off-by: Arkadi Sharshevsky 
---
 drivers/net/dsa/microchip/ksz_common.c |  9 -
 drivers/net/dsa/mv88e6xxx/chip.c   | 24 
 include/net/dsa.h  |  4 
 net/dsa/dsa_priv.h |  2 --
 net/dsa/port.c | 11 ---
 net/dsa/slave.c|  3 ---
 6 files changed, 53 deletions(-)

diff --git a/drivers/net/dsa/microchip/ksz_common.c 
b/drivers/net/dsa/microchip/ksz_common.c
index a53ce59..4de9d90 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -1020,14 +1020,6 @@ static int ksz_port_mdb_del(struct dsa_switch *ds, int 
port,
return ret;
 }
 
-static int ksz_port_mdb_dump(struct dsa_switch *ds, int port,
-struct switchdev_obj_port_mdb *mdb,
-switchdev_obj_dump_cb_t *cb)
-{
-   /* this is not called by switch layer */
-   return 0;
-}
-
 static int ksz_port_mirror_add(struct dsa_switch *ds, int port,
   struct dsa_mall_mirror_tc_entry *mirror,
   bool ingress)
@@ -1090,7 +1082,6 @@ static const struct dsa_switch_ops ksz_switch_ops = {
.port_mdb_prepare   = ksz_port_mdb_prepare,
.port_mdb_add   = ksz_port_mdb_add,
.port_mdb_del   = ksz_port_mdb_del,
-   .port_mdb_dump  = ksz_port_mdb_dump,
.port_mirror_add= ksz_port_mirror_add,
.port_mirror_del= ksz_port_mirror_del,
 };
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 9cc6269..97b77b9 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1443,15 +1443,6 @@ static int mv88e6xxx_port_db_dump_fid(struct 
mv88e6xxx_chip *chip,
fdb->ndm_state = NUD_NOARP;
else
fdb->ndm_state = NUD_REACHABLE;
-   } else if (obj->id == SWITCHDEV_OBJ_ID_PORT_MDB) {
-   struct switchdev_obj_port_mdb *mdb;
-
-   if (!is_multicast_ether_addr(addr.mac))
-   continue;
-
-   mdb = SWITCHDEV_OBJ_PORT_MDB(obj);
-   mdb->vid = vid;
-   ether_addr_copy(mdb->addr, addr.mac);
} else {
return -EOPNOTSUPP;
}
@@ -3762,20 +3753,6 @@ static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, 
int port,
return err;
 }
 
-static int mv88e6xxx_port_mdb_dump(struct dsa_switch *ds, int port,
-  struct switchdev_obj_port_mdb *mdb,
-  switchdev_obj_dump_cb_t *cb)
-{
-   struct mv88e6xxx_chip *chip = ds->priv;
-   int err;
-
-   mutex_lock(&chip->reg_lock);
-   err = mv88e6xxx_port_db_dump(chip, port, &mdb->obj, cb);
-   mutex_unlock(&chip->reg_lock);
-
-   return err;
-}
-
 static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
.probe  = mv88e6xxx_drv_probe,
.get_tag_protocol   = mv88e6xxx_get_tag_protocol,
@@ -3809,7 +3786,6 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = 
{
.port_mdb_prepare   = mv88e6xxx_port_mdb_prepare,
.port_mdb_add   = mv88e6xxx_port_mdb_add,
.port_mdb_del   = mv88e6xxx_port_mdb_del,
-   .port_mdb_dump  = mv88e6xxx_port_mdb_dump,
.crosschip_bridge_join  = mv88e6xxx_crosschip_bridge_join,
.crosschip_bridge_leave = mv88e6xxx_crosschip_bridge_leave,
 };
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 9171b11..2d85ad2 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -406,10 +406,6 @@ struct dsa_switch_ops {
struct switchdev_trans *trans);
int (*port_mdb_del)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb);
-   int (*port_mdb_dump)(struct dsa_switch *ds, int port,
-struct switchdev_obj_port_mdb *mdb,
- switchdev_obj_dump_cb_t *cb);
-
/*
 * RXNFC
 */
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 421df4f..85f53a0 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -131,8 +131,6 @@ int dsa_port_mdb_add(struct dsa_port *dp,
 struct switchdev_trans *trans);
 int dsa_port_mdb_del(struct dsa_port *dp,
 const struct switchdev_obj_port_mdb *mdb);
-int dsa_port_mdb_dump(struct dsa_port *dp, struct switchdev_obj_port_mdb *mdb,
- switchdev_obj_dump_cb_t *cb);
 int dsa_port_vlan_add(struct dsa_port *dp,
  const 

[PATCH net-next 02/11] net: dsa: Remove prepare phase for FDB

2017-07-18 Thread Arkadi Sharshevsky
The prepare phase for FDB add is unneeded because most DSA devices
can have failures during bus transactions (SPI, I2C, etc.), so the
prepare phase cannot guarantee success of the commit stage.

The support for learning FDB through notification chain, which will be
introduced in the following patches, will provide the ability to notify
back the bridge about successful offload.

Signed-off-by: Arkadi Sharshevsky 
Reviewed-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_common.c   | 17 +++--
 drivers/net/dsa/b53/b53_priv.h |  6 ++
 drivers/net/dsa/bcm_sf2.c  |  1 -
 drivers/net/dsa/microchip/ksz_common.c | 24 ++--
 drivers/net/dsa/mt7530.c   | 25 -
 drivers/net/dsa/mv88e6xxx/chip.c   | 23 +++
 drivers/net/dsa/qca8k.c| 18 +-
 include/net/dsa.h  |  4 +---
 net/dsa/dsa_priv.h |  4 +---
 net/dsa/port.c |  4 +---
 net/dsa/slave.c|  4 +++-
 net/dsa/switch.c   | 14 +++---
 12 files changed, 36 insertions(+), 108 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index d0156dc..c414b43 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1213,8 +1213,8 @@ static int b53_arl_op(struct b53_device *dev, int op, int 
port,
return b53_arl_rw_op(dev, 0);
 }
 
-int b53_fdb_prepare(struct dsa_switch *ds, int port,
-   const unsigned char *addr, u16 vid)
+int b53_fdb_add(struct dsa_switch *ds, int port,
+   const unsigned char *addr, u16 vid)
 {
struct b53_device *priv = ds->priv;
 
@@ -1224,17 +1224,7 @@ int b53_fdb_prepare(struct dsa_switch *ds, int port,
if (is5325(priv) || is5365(priv))
return -EOPNOTSUPP;
 
-   return 0;
-}
-EXPORT_SYMBOL(b53_fdb_prepare);
-
-void b53_fdb_add(struct dsa_switch *ds, int port,
-const unsigned char *addr, u16 vid)
-{
-   struct b53_device *priv = ds->priv;
-
-   if (b53_arl_op(priv, 0, port, addr, vid, true))
-   pr_err("%s: failed to add MAC address\n", __func__);
+   return b53_arl_op(priv, 0, port, addr, vid, true);
 }
 EXPORT_SYMBOL(b53_fdb_add);
 
@@ -1563,7 +1553,6 @@ static const struct dsa_switch_ops b53_switch_ops = {
.port_vlan_add  = b53_vlan_add,
.port_vlan_del  = b53_vlan_del,
.port_vlan_dump = b53_vlan_dump,
-   .port_fdb_prepare   = b53_fdb_prepare,
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index d417bca..f29c892 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -396,10 +396,8 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
 int b53_vlan_dump(struct dsa_switch *ds, int port,
  struct switchdev_obj_port_vlan *vlan,
  switchdev_obj_dump_cb_t *cb);
-int b53_fdb_prepare(struct dsa_switch *ds, int port,
-   const unsigned char *addr, u16 vid);
-void b53_fdb_add(struct dsa_switch *ds, int port,
-const unsigned char *addr, u16 vid);
+int b53_fdb_add(struct dsa_switch *ds, int port,
+   const unsigned char *addr, u16 vid);
 int b53_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
 int b53_fdb_dump(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 648f91b..a26e99d 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1034,7 +1034,6 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_vlan_add  = b53_vlan_add,
.port_vlan_del  = b53_vlan_del,
.port_vlan_dump = b53_vlan_dump,
-   .port_fdb_prepare   = b53_fdb_prepare,
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
diff --git a/drivers/net/dsa/microchip/ksz_common.c 
b/drivers/net/dsa/microchip/ksz_common.c
index db82808..b55f364 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -678,14 +678,6 @@ static int ksz_port_vlan_dump(struct dsa_switch *ds, int 
port,
return err;
 }
 
-static int ksz_port_fdb_prepare(struct dsa_switch *ds, int port,
-   const unsigned char *addr, u16 vid)
-{
-   /* nothing needed */
-
-   return 0;
-}
-
 struct alu_struct {
/* entry 1 */
u8  is_static:1;
@@ -705,12 +697,13 @@ struct alu_struct {
u8  

[PATCH net-next 04/11] net: dsa: Add support for learning FDB through notification

2017-07-18 Thread Arkadi Sharshevsky
Add support for learning FDB through notification. The driver defers
the hardware update via an ordered workqueue. On a successful FDB add,
a notification is sent back to the bridge.

If a hardware FDB del fails, the static FDB entry will still be deleted
from the bridge, so the interface is moved to the down state to
indicate the inconsistent situation.
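
The flow described above can be sketched in user space as follows. This is only a model under stated assumptions, not the kernel implementation: the FIFO stands in for the ordered workqueue, hw_err simulates the result of the hardware operation, and all type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum fdb_event { FDB_ADD, FDB_DEL };

struct fdb_work {
	enum fdb_event event;
	int hw_err;		/* simulated result of the hardware operation */
	struct fdb_work *next;
};

struct port_state {
	int offloaded;		/* "offloaded" notifications sent to the bridge */
	bool up;		/* interface state */
};

/* Single FIFO: items run one at a time, in submission order. */
struct ordered_queue {
	struct fdb_work *head, *tail;
};

static void queue_fdb_work(struct ordered_queue *q, struct fdb_work *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

static void run_queue(struct ordered_queue *q, struct port_state *p)
{
	struct fdb_work *w;

	while ((w = q->head) != NULL) {
		q->head = w->next;
		if (!q->head)
			q->tail = NULL;

		if (w->event == FDB_ADD) {
			if (!w->hw_err)
				p->offloaded++;	/* notify the bridge on success */
		} else if (w->hw_err) {
			p->up = false;		/* del failed: flag the inconsistency */
		}
	}
}
```

A failed add is simply not reported as offloaded, while a failed del takes the port down, matching the two cases in the commit message.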

Signed-off-by: Arkadi Sharshevsky 
---
 include/net/dsa.h |   1 +
 net/dsa/dsa.c |  13 ++
 net/dsa/slave.c   | 127 +-
 3 files changed, 139 insertions(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index f054d41..4835b0e 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -451,6 +451,7 @@ void unregister_switch_driver(struct dsa_switch_driver 
*type);
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev);
 
 struct net_device *dsa_dev_to_net_device(struct device *dev);
+bool dsa_schedule_work(struct work_struct *work);
 
 /* Keep inline for faster access in hot path */
 static inline bool netdev_uses_dsa(struct net_device *dev)
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 416ac4e..9abe6dc 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -271,10 +271,22 @@ static struct packet_type dsa_pack_type __read_mostly = {
.func   = dsa_switch_rcv,
 };
 
+static struct workqueue_struct *dsa_owq;
+
+bool dsa_schedule_work(struct work_struct *work)
+{
+   return queue_work(dsa_owq, work);
+}
+
 static int __init dsa_init_module(void)
 {
int rc;
 
+   dsa_owq = alloc_ordered_workqueue("dsa_ordered",
+ WQ_MEM_RECLAIM);
+   if (!dsa_owq)
+   return -ENOMEM;
+
rc = dsa_slave_register_notifier();
if (rc)
return rc;
@@ -294,6 +306,7 @@ static void __exit dsa_cleanup_module(void)
dsa_slave_unregister_notifier();
dev_remove_pack(&dsa_pack_type);
dsa_legacy_unregister();
+   destroy_workqueue(dsa_owq);
 }
 module_exit(dsa_cleanup_module);
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 19395cc..8278d08 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1263,19 +1263,142 @@ static int dsa_slave_netdevice_event(struct 
notifier_block *nb,
return NOTIFY_DONE;
 }
 
+struct dsa_switchdev_event_work {
+   struct work_struct work;
+   struct switchdev_notifier_fdb_info fdb_info;
+   struct net_device *dev;
+   unsigned long event;
+};
+
+static void dsa_slave_switchdev_event_work(struct work_struct *work)
+{
+   struct dsa_switchdev_event_work *switchdev_work =
+   container_of(work, struct dsa_switchdev_event_work, work);
+   struct net_device *dev = switchdev_work->dev;
+   struct switchdev_notifier_fdb_info *fdb_info;
+   struct dsa_slave_priv *p = netdev_priv(dev);
+   int err;
+
+   rtnl_lock();
+   switch (switchdev_work->event) {
+   case SWITCHDEV_FDB_ADD_TO_DEVICE:
+   fdb_info = &switchdev_work->fdb_info;
+   err = dsa_port_fdb_add(p->dp, fdb_info->addr, fdb_info->vid);
+   if (err) {
+   netdev_dbg(dev, "fdb add failed err=%d\n", err);
+   break;
+   }
+   call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED, dev,
+&fdb_info->info);
+   break;
+
+   case SWITCHDEV_FDB_DEL_TO_DEVICE:
+   fdb_info = &switchdev_work->fdb_info;
+   err = dsa_port_fdb_del(p->dp, fdb_info->addr, fdb_info->vid);
+   if (err) {
+   netdev_dbg(dev, "fdb del failed err=%d\n", err);
+   dev_close(dev);
+   }
+   break;
+   }
+   rtnl_unlock();
+
+   kfree(switchdev_work->fdb_info.addr);
+   kfree(switchdev_work);
+   dev_put(dev);
+}
+
+static int
+dsa_slave_switchdev_fdb_work_init(struct dsa_switchdev_event_work *
+ switchdev_work,
+ const struct switchdev_notifier_fdb_info *
+ fdb_info)
+{
+   memcpy(&switchdev_work->fdb_info, fdb_info,
+  sizeof(switchdev_work->fdb_info));
+   switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
+   if (!switchdev_work->fdb_info.addr)
+   return -ENOMEM;
+   ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
+   fdb_info->addr);
+   return 0;
+}
+
+/* Called under rcu_read_lock() */
+static int dsa_slave_switchdev_event(struct notifier_block *unused,
+unsigned long event, void *ptr)
+{
+   struct net_device *dev = switchdev_notifier_info_to_dev(ptr);
+   struct dsa_switchdev_event_work *switchdev_work;
+
+   if (!dsa_slave_dev_check(dev))
+   return NOTIFY_DONE;
+
+   switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+   if (!switchdev_work)
+ 

[PATCH net-next 06/11] net: dsa: Add support for querying supported bridge flags

2017-07-18 Thread Arkadi Sharshevsky
The DSA drivers do not support bridge flags offload. Yet, this attribute
should be added in order for the bridge to fail when one tries to set a
flag on the port, as explained in commit dc0ecabd6231 ("net: switchdev:
Add support for querying supported bridge flags by hardware").

Signed-off-by: Arkadi Sharshevsky 
---
 net/dsa/slave.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index a083287..bf93fbc 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -324,6 +324,9 @@ static int dsa_slave_port_attr_get(struct net_device *dev,
attr->u.ppid.id_len = sizeof(ds->index);
memcpy(&attr->u.ppid.id, &ds->index, attr->u.ppid.id_len);
break;
+   case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
+   attr->u.brport_flags_support = 0;
+   break;
default:
return -EOPNOTSUPP;
}
-- 
2.4.11



[PATCH net-next 01/11] net: dsa: Change DSA slave FDB API to be switchdev independent

2017-07-18 Thread Arkadi Sharshevsky
To support FDB add/del on a notifier chain, the slave
API needs to be changed to be switchdev independent.

Signed-off-by: Arkadi Sharshevsky 
Reviewed-by: Vivien Didelot 
Reviewed-by: Florian Fainelli 
---
 drivers/net/dsa/b53/b53_common.c   | 12 +---
 drivers/net/dsa/b53/b53_priv.h |  8 +++-
 drivers/net/dsa/microchip/ksz_common.c | 34 --
 drivers/net/dsa/mt7530.c   | 14 ++
 drivers/net/dsa/mv88e6xxx/chip.c   | 12 +---
 drivers/net/dsa/qca8k.c| 15 ++-
 include/net/dsa.h  |  8 +++-
 net/dsa/switch.c   |  8 +---
 8 files changed, 49 insertions(+), 62 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index e68d368..d0156dc 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1214,8 +1214,7 @@ static int b53_arl_op(struct b53_device *dev, int op, int 
port,
 }
 
 int b53_fdb_prepare(struct dsa_switch *ds, int port,
-   const struct switchdev_obj_port_fdb *fdb,
-   struct switchdev_trans *trans)
+   const unsigned char *addr, u16 vid)
 {
struct b53_device *priv = ds->priv;
 
@@ -1230,22 +1229,21 @@ int b53_fdb_prepare(struct dsa_switch *ds, int port,
 EXPORT_SYMBOL(b53_fdb_prepare);
 
 void b53_fdb_add(struct dsa_switch *ds, int port,
-const struct switchdev_obj_port_fdb *fdb,
-struct switchdev_trans *trans)
+const unsigned char *addr, u16 vid)
 {
struct b53_device *priv = ds->priv;
 
-   if (b53_arl_op(priv, 0, port, fdb->addr, fdb->vid, true))
+   if (b53_arl_op(priv, 0, port, addr, vid, true))
pr_err("%s: failed to add MAC address\n", __func__);
 }
 EXPORT_SYMBOL(b53_fdb_add);
 
 int b53_fdb_del(struct dsa_switch *ds, int port,
-   const struct switchdev_obj_port_fdb *fdb)
+   const unsigned char *addr, u16 vid)
 {
struct b53_device *priv = ds->priv;
 
-   return b53_arl_op(priv, 0, port, fdb->addr, fdb->vid, false);
+   return b53_arl_op(priv, 0, port, addr, vid, false);
 }
 EXPORT_SYMBOL(b53_fdb_del);
 
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 155a9c4..d417bca 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -397,13 +397,11 @@ int b53_vlan_dump(struct dsa_switch *ds, int port,
  struct switchdev_obj_port_vlan *vlan,
  switchdev_obj_dump_cb_t *cb);
 int b53_fdb_prepare(struct dsa_switch *ds, int port,
-   const struct switchdev_obj_port_fdb *fdb,
-   struct switchdev_trans *trans);
+   const unsigned char *addr, u16 vid);
 void b53_fdb_add(struct dsa_switch *ds, int port,
-const struct switchdev_obj_port_fdb *fdb,
-struct switchdev_trans *trans);
+const unsigned char *addr, u16 vid);
 int b53_fdb_del(struct dsa_switch *ds, int port,
-   const struct switchdev_obj_port_fdb *fdb);
+   const unsigned char *addr, u16 vid);
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_fdb *fdb,
 switchdev_obj_dump_cb_t *cb);
diff --git a/drivers/net/dsa/microchip/ksz_common.c 
b/drivers/net/dsa/microchip/ksz_common.c
index b313ecd..db82808 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -679,8 +679,7 @@ static int ksz_port_vlan_dump(struct dsa_switch *ds, int 
port,
 }
 
 static int ksz_port_fdb_prepare(struct dsa_switch *ds, int port,
-   const struct switchdev_obj_port_fdb *fdb,
-   struct switchdev_trans *trans)
+   const unsigned char *addr, u16 vid)
 {
/* nothing needed */
 
@@ -707,8 +706,7 @@ struct alu_struct {
 };
 
 static void ksz_port_fdb_add(struct dsa_switch *ds, int port,
-const struct switchdev_obj_port_fdb *fdb,
-struct switchdev_trans *trans)
+const unsigned char *addr, u16 vid)
 {
struct ksz_device *dev = ds->priv;
u32 alu_table[4];
@@ -717,12 +715,12 @@ static void ksz_port_fdb_add(struct dsa_switch *ds, int 
port,
mutex_lock(&dev->alu_mutex);
 
/* find any entry with mac & vid */
-   data = fdb->vid << ALU_FID_INDEX_S;
-   data |= ((fdb->addr[0] << 8) | fdb->addr[1]);
+   data = vid << ALU_FID_INDEX_S;
+   data |= ((addr[0] << 8) | addr[1]);
ksz_write32(dev, REG_SW_ALU_INDEX_0, data);
 
-   data = ((fdb->addr[2] << 24) | (fdb->addr[3] << 16));
-   data |= ((fdb->addr[4] << 8) | fdb->addr[5]);
+   data = ((addr[2] << 24) | 

[PATCH net-next 10/11] net: bridge: Remove FDB deletion through switchdev object

2017-07-18 Thread Arkadi Sharshevsky
At this point no driver supports FDB add/del through a switchdev object;
all of them use the notification chain instead, so this path is removed.

Signed-off-by: Arkadi Sharshevsky 
---
 net/bridge/br_fdb.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index a5e4a73..a79b648 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -169,29 +169,11 @@ static void fdb_del_hw_addr(struct net_bridge *br, const 
unsigned char *addr)
}
 }
 
-static void fdb_del_external_learn(struct net_bridge_fdb_entry *f)
-{
-   struct switchdev_obj_port_fdb fdb = {
-   .obj = {
-   .orig_dev = f->dst->dev,
-   .id = SWITCHDEV_OBJ_ID_PORT_FDB,
-   .flags = SWITCHDEV_F_DEFER,
-   },
-   .vid = f->vlan_id,
-   };
-
-   ether_addr_copy(fdb.addr, f->addr.addr);
-   switchdev_port_obj_del(f->dst->dev, &fdb.obj);
-}
-
 static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f)
 {
if (f->is_static)
fdb_del_hw_addr(br, f->addr.addr);
 
-   if (f->added_by_external_learn)
-   fdb_del_external_learn(f);
-
hlist_del_init_rcu(&f->hlist);
fdb_notify(br, f, RTM_DELNEIGH);
call_rcu(&f->rcu, fdb_rcu_free);
-- 
2.4.11



[PATCH net-next 11/11] net: switchdev: Remove bridge bypass support from switchdev

2017-07-18 Thread Arkadi Sharshevsky
Currently the bridge port flags, VLANs, FDBs and MDBs can be offloaded
through the bridge code, making switchdev's SELF bridge bypass
implementation redundant. This implies several changes:
- No need for dump infra in switchdev, DSA's special case is handled
  privately.
- Remove obj_dump from switchdev_ops.
- FDBs are removed from obj_add/del routines, due to the fact that they
  are offloaded through the bridge notification chain.
- The switchdev_port_bridge_xx() and switchdev_port_fdb_xx() functions
  can be removed.

Signed-off-by: Arkadi Sharshevsky 
---
 include/net/switchdev.h   |  75 
 net/switchdev/switchdev.c | 435 --
 2 files changed, 510 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index d2637a6..d767b79 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -74,7 +74,6 @@ struct switchdev_attr {
 enum switchdev_obj_id {
SWITCHDEV_OBJ_ID_UNDEFINED,
SWITCHDEV_OBJ_ID_PORT_VLAN,
-   SWITCHDEV_OBJ_ID_PORT_FDB,
SWITCHDEV_OBJ_ID_PORT_MDB,
 };
 
@@ -97,17 +96,6 @@ struct switchdev_obj_port_vlan {
 #define SWITCHDEV_OBJ_PORT_VLAN(obj) \
container_of(obj, struct switchdev_obj_port_vlan, obj)
 
-/* SWITCHDEV_OBJ_ID_PORT_FDB */
-struct switchdev_obj_port_fdb {
-   struct switchdev_obj obj;
-   unsigned char addr[ETH_ALEN];
-   u16 vid;
-   u16 ndm_state;
-};
-
-#define SWITCHDEV_OBJ_PORT_FDB(obj) \
-   container_of(obj, struct switchdev_obj_port_fdb, obj)
-
 /* SWITCHDEV_OBJ_ID_PORT_MDB */
 struct switchdev_obj_port_mdb {
struct switchdev_obj obj;
@@ -135,8 +123,6 @@ typedef int switchdev_obj_dump_cb_t(struct switchdev_obj 
*obj);
  * @switchdev_port_obj_add: Add an object to port (see switchdev_obj_*).
  *
  * @switchdev_port_obj_del: Delete an object from port (see switchdev_obj_*).
- *
- * @switchdev_port_obj_dump: Dump port objects (see switchdev_obj_*).
  */
 struct switchdev_ops {
int (*switchdev_port_attr_get)(struct net_device *dev,
@@ -149,9 +135,6 @@ struct switchdev_ops {
  struct switchdev_trans *trans);
int (*switchdev_port_obj_del)(struct net_device *dev,
  const struct switchdev_obj *obj);
-   int (*switchdev_port_obj_dump)(struct net_device *dev,
-  struct switchdev_obj *obj,
-  switchdev_obj_dump_cb_t *cb);
 };
 
 enum switchdev_notifier_type {
@@ -189,25 +172,10 @@ int switchdev_port_obj_add(struct net_device *dev,
   const struct switchdev_obj *obj);
 int switchdev_port_obj_del(struct net_device *dev,
   const struct switchdev_obj *obj);
-int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj *obj,
-   switchdev_obj_dump_cb_t *cb);
 int register_switchdev_notifier(struct notifier_block *nb);
 int unregister_switchdev_notifier(struct notifier_block *nb);
 int call_switchdev_notifiers(unsigned long val, struct net_device *dev,
 struct switchdev_notifier_info *info);
-int switchdev_port_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
- struct net_device *dev, u32 filter_mask,
- int nlflags);
-int switchdev_port_bridge_setlink(struct net_device *dev,
- struct nlmsghdr *nlh, u16 flags);
-int switchdev_port_bridge_dellink(struct net_device *dev,
- struct nlmsghdr *nlh, u16 flags);
-int switchdev_port_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
-  struct net_device *dev, const unsigned char *addr,
-  u16 vid, u16 nlm_flags);
-int switchdev_port_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
-  struct net_device *dev, const unsigned char *addr,
-  u16 vid);
 void switchdev_port_fwd_mark_set(struct net_device *dev,
 struct net_device *group_dev,
 bool joining);
@@ -246,13 +214,6 @@ static inline int switchdev_port_obj_del(struct net_device 
*dev,
return -EOPNOTSUPP;
 }
 
-static inline int switchdev_port_obj_dump(struct net_device *dev,
- const struct switchdev_obj *obj,
- switchdev_obj_dump_cb_t *cb)
-{
-   return -EOPNOTSUPP;
-}
-
 static inline int register_switchdev_notifier(struct notifier_block *nb)
 {
return 0;
@@ -270,42 +231,6 @@ static inline int call_switchdev_notifiers(unsigned long 
val,
return NOTIFY_DONE;
 }
 
-static inline int switchdev_port_bridge_getlink(struct sk_buff *skb, u32 pid,
-   u32 seq, struct net_device *dev,
-   

[PATCH net-next 09/11] net: dsa: Move FDB dump implementation inside DSA

2017-07-18 Thread Arkadi Sharshevsky
Of all switchdev devices, only DSA requires a special FDB dump. This is due
to the lack of ability to sync the hardware-learned FDBs with the bridge.
The dump is therefore removed from switchdev and moved inside DSA.
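
The callback style this patch moves to can be sketched in user space as follows. It mirrors the cb(mac, vid, ndm_state, data) call shape visible in the diff below, but the typedef, the table layout and the state constants here are illustrative assumptions, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback receives plain fields plus an opaque pointer owned by the caller. */
typedef int fdb_dump_cb_t(const unsigned char *addr, uint16_t vid,
			  uint16_t ndm_state, void *data);

struct fdb_entry {
	unsigned char mac[6];
	uint16_t vid;
	int port;
	int is_static;
};

/* Illustrative stand-ins for NUD_NOARP / NUD_REACHABLE. */
enum { STATE_STATIC = 1, STATE_LEARNED = 2 };

/* Walk the table and hand every entry of 'port' to the callback. */
static int fdb_dump(const struct fdb_entry *tbl, size_t n, int port,
		    fdb_dump_cb_t *cb, void *data)
{
	size_t i;
	int err;

	for (i = 0; i < n; i++) {
		if (tbl[i].port != port)
			continue;
		err = cb(tbl[i].mac, tbl[i].vid,
			 tbl[i].is_static ? STATE_STATIC : STATE_LEARNED,
			 data);
		if (err)
			return err;	/* stop on the first callback error */
	}
	return 0;
}

/* Example consumer: counts entries; 'data' carries the counter. */
static int count_cb(const unsigned char *addr, uint16_t vid,
		    uint16_t ndm_state, void *data)
{
	(void)addr;
	(void)vid;
	(void)ndm_state;
	(*(int *)data)++;
	return 0;
}
```

Passing an opaque data pointer lets the driver stay unaware of how the caller stores or formats the dumped entries.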

Signed-off-by: Arkadi Sharshevsky 
---
 drivers/net/dsa/b53/b53_common.c   |  18 +++---
 drivers/net/dsa/b53/b53_priv.h |   3 +-
 drivers/net/dsa/microchip/ksz_common.c |  21 +++
 drivers/net/dsa/mt7530.c   |  13 ++---
 drivers/net/dsa/mv88e6xxx/chip.c   |  39 +
 drivers/net/dsa/qca8k.c|  13 ++---
 include/net/dsa.h  |   6 +-
 include/net/switchdev.h|  12 
 net/dsa/dsa_priv.h |   2 -
 net/dsa/port.c |  11 
 net/dsa/slave.c| 100 +
 net/switchdev/switchdev.c  |  84 ---
 12 files changed, 124 insertions(+), 198 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 6020e88..9db895e 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1227,25 +1227,23 @@ static void b53_arl_search_rd(struct b53_device *dev, 
u8 idx,
 }
 
 static int b53_fdb_copy(int port, const struct b53_arl_entry *ent,
-   struct switchdev_obj_port_fdb *fdb,
-   switchdev_obj_dump_cb_t *cb)
+   dsa_fdb_dump_cb_t *cb, void *data)
 {
+   u16 ndm_state;
+
if (!ent->is_valid)
return 0;
 
if (port != ent->port)
return 0;
 
-   ether_addr_copy(fdb->addr, ent->mac);
-   fdb->vid = ent->vid;
-   fdb->ndm_state = ent->is_static ? NUD_NOARP : NUD_REACHABLE;
+   ndm_state = ent->is_static ? NUD_NOARP : NUD_REACHABLE;
 
-   return cb(>obj);
+   return cb(ent->mac, ent->vid, ndm_state, data);
 }
 
 int b53_fdb_dump(struct dsa_switch *ds, int port,
-struct switchdev_obj_port_fdb *fdb,
-switchdev_obj_dump_cb_t *cb)
+dsa_fdb_dump_cb_t *cb, void *data)
 {
struct b53_device *priv = ds->priv;
struct b53_arl_entry results[2];
@@ -1263,13 +1261,13 @@ int b53_fdb_dump(struct dsa_switch *ds, int port,
return ret;
 
b53_arl_search_rd(priv, 0, &results[0]);
-   ret = b53_fdb_copy(port, &results[0], fdb, cb);
+   ret = b53_fdb_copy(port, &results[0], cb, data);
if (ret)
return ret;
 
if (priv->num_arl_entries > 2) {
b53_arl_search_rd(priv, 1, &results[1]);
-   ret = b53_fdb_copy(port, &results[1], fdb, cb);
+   ret = b53_fdb_copy(port, &results[1], cb, data);
if (ret)
return ret;
 
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index af5d6c1..01bd8cb 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -398,8 +398,7 @@ int b53_fdb_add(struct dsa_switch *ds, int port,
 int b53_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
 int b53_fdb_dump(struct dsa_switch *ds, int port,
-struct switchdev_obj_port_fdb *fdb,
-switchdev_obj_dump_cb_t *cb);
+dsa_fdb_dump_cb_t *cb, void *data);
 int b53_mirror_add(struct dsa_switch *ds, int port,
   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
 void b53_mirror_del(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/microchip/ksz_common.c 
b/drivers/net/dsa/microchip/ksz_common.c
index 4de9d90..702191d 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -805,15 +805,15 @@ static void convert_alu(struct alu_struct *alu, u32 
*alu_table)
 }
 
 static int ksz_port_fdb_dump(struct dsa_switch *ds, int port,
-struct switchdev_obj_port_fdb *fdb,
-switchdev_obj_dump_cb_t *cb)
+dsa_fdb_dump_cb_t *cb, void *data)
 {
struct ksz_device *dev = ds->priv;
int ret = 0;
-   u32 data;
+   u32 ksz_data;
u32 alu_table[4];
struct alu_struct alu;
int timeout;
+   u16 ndm_state;
 
mutex_lock(&dev->alu_mutex);
 
@@ -823,8 +823,8 @@ static int ksz_port_fdb_dump(struct dsa_switch *ds, int 
port,
do {
timeout = 1000;
do {
-   ksz_read32(dev, REG_SW_ALU_CTRL__4, &data);
-   if ((data & ALU_VALID) || !(data & ALU_START))
+   ksz_read32(dev, REG_SW_ALU_CTRL__4, &ksz_data);
+   if ((ksz_data & ALU_VALID) || !(ksz_data & ALU_START))
break;
usleep_range(1, 10);
} while (timeout-- > 0);

[PATCH net-next 07/11] net: dsa: Remove support for bypass bridge port attributes/vlan set

2017-07-18 Thread Arkadi Sharshevsky
The bridge port attributes/VLANs for DSA devices should be set only
from bridge code. Furthermore, the VLANs are fully synced with the
bridge, so there is no need for special dump support.

Signed-off-by: Arkadi Sharshevsky 
---
 drivers/net/dsa/b53/b53_common.c   | 44 --
 drivers/net/dsa/b53/b53_priv.h |  3 --
 drivers/net/dsa/bcm_sf2.c  |  1 -
 drivers/net/dsa/dsa_loop.c | 38 ---
 drivers/net/dsa/microchip/ksz_common.c | 41 -
 drivers/net/dsa/mv88e6xxx/chip.c   | 56 --
 include/net/dsa.h  |  4 ---
 net/dsa/dsa_priv.h |  4 ---
 net/dsa/port.c | 12 
 net/dsa/slave.c|  6 
 10 files changed, 209 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index c414b43..6020e88 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1053,49 +1053,6 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
 }
 EXPORT_SYMBOL(b53_vlan_del);
 
-int b53_vlan_dump(struct dsa_switch *ds, int port,
- struct switchdev_obj_port_vlan *vlan,
- switchdev_obj_dump_cb_t *cb)
-{
-   struct b53_device *dev = ds->priv;
-   u16 vid, vid_start = 0, pvid;
-   struct b53_vlan *vl;
-   int err = 0;
-
-   if (is5325(dev) || is5365(dev))
-   vid_start = 1;
-
-   b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &pvid);
-
-   /* Use our software cache for dumps, since we do not have any HW
-* operation returning only the used/valid VLANs
-*/
-   for (vid = vid_start; vid < dev->num_vlans; vid++) {
-   vl = &dev->vlans[vid];
-
-   if (!vl->valid)
-   continue;
-
-   if (!(vl->members & BIT(port)))
-   continue;
-
-   vlan->vid_begin = vlan->vid_end = vid;
-   vlan->flags = 0;
-
-   if (vl->untag & BIT(port))
-   vlan->flags |= BRIDGE_VLAN_INFO_UNTAGGED;
-   if (pvid == vid)
-   vlan->flags |= BRIDGE_VLAN_INFO_PVID;
-
-   err = cb(&vlan->obj);
-   if (err)
-   break;
-   }
-
-   return err;
-}
-EXPORT_SYMBOL(b53_vlan_dump);
-
 /* Address Resolution Logic routines */
 static int b53_arl_op_wait(struct b53_device *dev)
 {
@@ -1552,7 +1509,6 @@ static const struct dsa_switch_ops b53_switch_ops = {
.port_vlan_prepare  = b53_vlan_prepare,
.port_vlan_add  = b53_vlan_add,
.port_vlan_del  = b53_vlan_del,
-   .port_vlan_dump = b53_vlan_dump,
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index f29c892..af5d6c1 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -393,9 +393,6 @@ void b53_vlan_add(struct dsa_switch *ds, int port,
  struct switchdev_trans *trans);
 int b53_vlan_del(struct dsa_switch *ds, int port,
 const struct switchdev_obj_port_vlan *vlan);
-int b53_vlan_dump(struct dsa_switch *ds, int port,
- struct switchdev_obj_port_vlan *vlan,
- switchdev_obj_dump_cb_t *cb);
 int b53_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
 int b53_fdb_del(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index a26e99d..824a137 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1033,7 +1033,6 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_vlan_prepare  = b53_vlan_prepare,
.port_vlan_add  = b53_vlan_add,
.port_vlan_del  = b53_vlan_del,
-   .port_vlan_dump = b53_vlan_dump,
.port_fdb_dump  = b53_fdb_dump,
.port_fdb_add   = b53_fdb_add,
.port_fdb_del   = b53_fdb_del,
diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index fdd8f38..76d6660 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -257,43 +257,6 @@ static int dsa_loop_port_vlan_del(struct dsa_switch *ds, 
int port,
return 0;
 }
 
-static int dsa_loop_port_vlan_dump(struct dsa_switch *ds, int port,
-  struct switchdev_obj_port_vlan *vlan,
-  switchdev_obj_dump_cb_t *cb)
-{
-   struct dsa_loop_priv *ps = ds->priv;
-   struct mii_bus *bus = ps->bus;
-   struct dsa_loop_vlan *vl;
-   u16 vid, vid_start = 0;
-   int err = 0;
-
-   dev_dbg(ds->dev, "%s\n", __func__);
-
-   /* Just do a sleeping 

[PATCH net-next 05/11] net: dsa: Remove support for FDB add/del via SELF

2017-07-18 Thread Arkadi Sharshevsky
FDB add/del can be done via the switchdev notification chain, so support
for configuration via switchdev objects can be removed.

Signed-off-by: Arkadi Sharshevsky 
---
 net/dsa/slave.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 8278d08..a083287 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -250,12 +250,6 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
 */
 
switch (obj->id) {
-   case SWITCHDEV_OBJ_ID_PORT_FDB:
-   if (switchdev_trans_ph_prepare(trans))
-   return 0;
-   err = dsa_port_fdb_add(dp, SWITCHDEV_OBJ_PORT_FDB(obj)->addr,
-  SWITCHDEV_OBJ_PORT_FDB(obj)->vid);
-   break;
case SWITCHDEV_OBJ_ID_PORT_MDB:
err = dsa_port_mdb_add(dp, SWITCHDEV_OBJ_PORT_MDB(obj), trans);
break;
@@ -279,10 +273,6 @@ static int dsa_slave_port_obj_del(struct net_device *dev,
int err;
 
switch (obj->id) {
-   case SWITCHDEV_OBJ_ID_PORT_FDB:
-   err = dsa_port_fdb_del(dp, SWITCHDEV_OBJ_PORT_FDB(obj)->addr,
-  SWITCHDEV_OBJ_PORT_FDB(obj)->vid);
-   break;
case SWITCHDEV_OBJ_ID_PORT_MDB:
err = dsa_port_mdb_del(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
break;
@@ -925,8 +915,6 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
.ndo_change_rx_flags= dsa_slave_change_rx_flags,
.ndo_set_rx_mode= dsa_slave_set_rx_mode,
.ndo_set_mac_address= dsa_slave_set_mac_address,
-   .ndo_fdb_add= switchdev_port_fdb_add,
-   .ndo_fdb_del= switchdev_port_fdb_del,
.ndo_fdb_dump   = switchdev_port_fdb_dump,
.ndo_do_ioctl   = dsa_slave_ioctl,
.ndo_get_iflink = dsa_slave_get_iflink,
-- 
2.4.11



[PATCH net-next 00/11] Change DSA's FDB API and perform switchdev cleanup

2017-07-18 Thread Arkadi Sharshevsky
The patchset moves the DSA driver to learning static FDB entries via
the switchdev notification chain rather than by using the bridge bypass
SELF flag.

The DSA drivers cannot sync the software bridge with hardware learned
entries and use the switchdev's implementation of bypass FDB dumping.
Because they are the only ones using this functionality, the fdb_dump
implementation is moved from switchdev code into DSA.

Finally, after these changes a major cleanup in switchdev can be done.

Arkadi Sharshevsky (11):
  net: dsa: Change DSA slave FDB API to be switchdev independent
  net: dsa: Remove prepare phase for FDB
  net: dsa: Remove switchdev dependency from DSA switch notifier chain
  net: dsa: Add support for learning FDB through notification
  net: dsa: Remove support for FDB add/del via SELF
  net: dsa: Add support for querying supported bridge flags
  net: dsa: Remove support for bypass bridge port attributes/vlan set
  net: dsa: Remove redundant MDB dump support
  net: dsa: Move FDB dump implementation inside DSA
  net: bridge: Remove FDB deletion through switchdev object
  net: switchdev: Remove bridge bypass support from switchdev

 drivers/net/dsa/b53/b53_common.c   |  85 +-
 drivers/net/dsa/b53/b53_priv.h |  16 +-
 drivers/net/dsa/bcm_sf2.c  |   2 -
 drivers/net/dsa/dsa_loop.c |  38 ---
 drivers/net/dsa/microchip/ksz_common.c | 125 +++-
 drivers/net/dsa/mt7530.c   |  44 +--
 drivers/net/dsa/mv88e6xxx/chip.c   | 148 ++
 drivers/net/dsa/qca8k.c|  40 +--
 include/net/dsa.h  |  25 +-
 include/net/switchdev.h|  87 --
 net/bridge/br_fdb.c|  18 --
 net/dsa/dsa.c  |  13 +
 net/dsa/dsa_priv.h |  21 +-
 net/dsa/port.c |  51 +---
 net/dsa/slave.c| 247 +---
 net/dsa/switch.c   |  21 +-
 net/switchdev/switchdev.c  | 519 -
 17 files changed, 341 insertions(+), 1159 deletions(-)

-- 
2.4.11



[PATCH] NET: dwmac: Make dwmac reset unconditional

2017-07-18 Thread Eugeniy Paltsev
Unconditionally reset dwmac before HW init if a reset controller is present.

In the existing implementation we reset dwmac only on the second module
probe:
(module load -> unload -> load again [reset happens])

Now we reset dwmac at every module load:
(module load [reset happens] -> unload -> load again [reset happens])

Also, some reset controllers have only a reset callback instead of
an assert + deassert callback pair, so handle this case.
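
The fallback described above can be sketched in user space as follows. This is a simplified variation on the patch, not a copy of it (the diff below deasserts unconditionally and then falls back), and struct reset_ctl together with all function names is hypothetical. ENOTSUPP is a kernel-internal error code, so it is defined locally here.

```c
#include <errno.h>
#include <stddef.h>

#ifndef ENOTSUPP
#define ENOTSUPP 524	/* kernel-internal errno, absent from user-space <errno.h> */
#endif

/* Hypothetical reset controller: any callback may be missing (NULL). */
struct reset_ctl {
	int (*assert_line)(void);
	int (*deassert_line)(void);
	int (*reset)(void);
};

static int ctl_assert(const struct reset_ctl *c)
{
	return c->assert_line ? c->assert_line() : -ENOTSUPP;
}

static int ctl_deassert(const struct reset_ctl *c)
{
	return c->deassert_line ? c->deassert_line() : -ENOTSUPP;
}

static int ctl_reset(const struct reset_ctl *c)
{
	return c->reset ? c->reset() : -ENOTSUPP;
}

/* Try assert + deassert; fall back to a single pulse if assert is unsupported. */
static int reset_device(const struct reset_ctl *c)
{
	int ret = ctl_assert(c);

	if (ret == -ENOTSUPP)
		return ctl_reset(c);
	if (ret)
		return ret;
	return ctl_deassert(c);
}

/* Counters and callbacks used to observe which path was taken. */
static int pulses, asserts, deasserts;

static int do_pulse(void) { pulses++; return 0; }
static int do_assert(void) { asserts++; return 0; }
static int do_deassert(void) { deasserts++; return 0; }
```

A controller that only implements the pulse callback takes the fallback path, while one with both line callbacks never touches it.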

Signed-off-by: Eugeniy Paltsev 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 12236da..c7b3d0d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4096,8 +4096,15 @@ int stmmac_dvr_probe(struct device *device,
if ((phyaddr >= 0) && (phyaddr <= 31))
priv->plat->phy_addr = phyaddr;
 
-   if (priv->plat->stmmac_rst)
+   if (priv->plat->stmmac_rst) {
+   ret = reset_control_assert(priv->plat->stmmac_rst);
reset_control_deassert(priv->plat->stmmac_rst);
+   /* Some reset controllers have only reset callback instead of
+* assert + deassert callbacks pair.
+*/
+   if (ret == -ENOTSUPP)
+   reset_control_reset(priv->plat->stmmac_rst);
+   }
 
/* Init MAC and get the capabilities */
ret = stmmac_hw_init(priv);
-- 
2.9.3



Re: [PATCH net-next] mdio_bus: Remove unneeded gpiod NULL check

2017-07-18 Thread Andrew Lunn
On Tue, Jul 18, 2017 at 09:52:51AM -0300, Fabio Estevam wrote:
> On Tue, Jul 18, 2017 at 9:48 AM, Sergei Shtylyov
>  wrote:
> > On 07/18/2017 03:39 PM, Fabio Estevam wrote:
> >
> >>>Won't this result in kernel WARNING when GPIO is disabled?
> >
> >
> >GPIO support, I was going to type...
> >
> >> Not sure if I understood your point, but gpiod_set_value_cansleep() is
> >> a no-op when the gpiod is NULL.
> >
> >
> >Look at the stub in <linux/gpio/consumer.h>, it has WARN_ON(1).
> 
> This patch does not alter the behavior of the driver with respect to
> GPIO being disabled, so I still do not understand your concern.

http://elixir.free-electrons.com/linux/latest/source/include/linux/gpio/consumer.h#L345
static inline void gpiod_set_value_cansleep(struct gpio_desc *desc, int value)
{
/* GPIO can never have been requested */
WARN_ON(1);
}

But I would say this is a GPIO problem. If GPIO enabled does not care,
GPIO disabled should also not care.

Adding Linus Walleij.

 Andrew


Re: [PATCH net-next] mdio_bus: Remove unneeded gpiod NULL check

2017-07-18 Thread Sergei Shtylyov

On 07/18/2017 04:09 PM, Fabio Estevam wrote:


   No, it does -- devm_gpiod_get_optional() will return NULL in that case,
bus->reset_gpio will remain NULL, and you're removing the NULL checks
around the gpiod_set_value_cansleep() calls. Perhaps it's the problem in the
GPIO support though...


It is perfectly fine to call gpiod_set_value_cansleep() with a NULL
gpio descriptor.


   Depends on whether CONFIG_GPIOLIB is enabled or not.


Please take a look at drivers/gpio/gpiolib.c:


   If CONFIG_GPIOLIB=n, the stub from <linux/gpio/consumer.h>
gpiod_set_value_cansleep() calls VALIDATE_DESC_VOID

Then if you look at the definition of VALIDATE_DESC_VOID you will see
that it does a NULL check on desc and returns immediately if it is
NULL.


   Sure, I did see that.


This means we are safe here :-)


   Sigh...

MBR, Sergei


