date:20181115

Re: [RFC v1 2/3] vxlan: add support for underlay in non-default VRF

2018-11-15 Thread David Ahern

On 11/15/18 2:05 AM, Alexis Bauvin wrote:
> On 14 Nov 2018, at 20:58, David Ahern wrote:
>>
>> you are making this more specific than it needs to be 
>>
>> On 11/14/18 1:31 AM, Alexis Bauvin wrote:
>>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>>> index 27bd586b94b0..7477b5510a04 100644
>>> --- a/drivers/net/vxlan.c
>>> +++ b/drivers/net/vxlan.c
>>> @@ -208,11 +208,23 @@ static inline struct vxlan_rdst *first_remote_rtnl(struct vxlan_fdb *fdb)
>>>  	return list_first_entry(&fdb->remotes, struct vxlan_rdst, list);
>>>  }
>>>
>>> +static int vxlan_get_l3mdev(struct net *net, int ifindex)
>>> +{
>>> +	struct net_device *dev;
>>> +
>>> +	dev = __dev_get_by_index(net, ifindex);
>>> +	while (dev && !netif_is_l3_master(dev))
>>> +		dev = netdev_master_upper_dev_get(dev);
>>> +
>>> +	return dev ? dev->ifindex : 0;
>>> +}
>>
>> l3mdev_master_ifindex_by_index should work instead of defining this for
>> vxlan.
>>
>> But I do not believe you need this function.
> 
> l3mdev_master_ifindex_by_index does not recursively climb up the master chain.
> This means that if the l3mdev is not a direct master of the device, it will not
> be found.
> 
> E.G. Calling l3mdev_master_ifindex_by_index with the index of eth0 will
> return 0:
> 
> +------+     +-----+     +----------+
> |      |     |     |     |          |
> | eth0 +-----+ br0 +-----+ vrf-blue |
> |      |     |     |     |          |
> +------+     +-----+     +----------+
> 

eth0 is not the L3/router interface in this picture; br0 is. There
should not be a need for invoking l3mdev_master_ifindex_by_index on eth0.

What device stacking are you expecting to handle with vxlan devices?
vxlan on eth0 with vxlan devices in a VRF? vxlan devices into a bridge
with the bridge (or SVI) enslaved to a VRF?


> This is because the underlying l3mdev_master_dev_rcu function fetches the master
> (br0 in this case), checks whether it is an l3mdev (which it is not), and
> returns its index if so.
> 
> So if using l3mdev_master_dev_rcu, using eth0 as a lower device will still bind
> to no specific device, thus in the default VRF.
> 
> Maybe I should have patched l3mdev_master_dev_rcu to do a recursive resolution
> (as vxlan_get_l3mdev does), but I don’t know the impact of such a change.

no, that is definitely the wrong approach.
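
For readers following the thread, the difference between a one-level master lookup and a recursive walk can be sketched in userspace C. This is only an illustration: `struct toy_dev` and both helper functions are hypothetical stand-ins, not kernel APIs, and they model a device with a single `master` pointer and an `is_l3_master` flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for a net_device: one upper (master) link only. */
struct toy_dev {
	int ifindex;
	int is_l3_master;        /* non-zero for a VRF-like device */
	struct toy_dev *master;  /* NULL when the device is not enslaved */
};

/* One-level lookup: the device itself or its direct master is considered. */
static int direct_l3_master_ifindex(const struct toy_dev *dev)
{
	if (!dev)
		return 0;
	if (dev->is_l3_master)
		return dev->ifindex;
	if (dev->master && dev->master->is_l3_master)
		return dev->master->ifindex;
	return 0;
}

/* Recursive lookup: climb the master chain until an L3 master is found. */
static int recursive_l3_master_ifindex(const struct toy_dev *dev)
{
	while (dev && !dev->is_l3_master)
		dev = dev->master;
	return dev ? dev->ifindex : 0;
}
```

With a chain eth0 -> br0 -> vrf-blue, the one-level lookup on eth0 yields 0 while the recursive walk yields the ifindex of vrf-blue; on br0 both agree.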

Re: [PATCH] [PATCH net-next] tun: fix multiqueue rx

2018-11-15 Thread Jason Wang




On 2018/11/16 3:00 PM, Matthew Cover wrote:

When writing packets to a descriptor associated with a combined queue, the
packets should end up on that queue.

Before this change all packets written to any descriptor associated with a
tap interface end up on rx-0, even when the descriptor is associated with a
different queue.

The rx traffic can be generated by either of the following.
   1. a simple tap program which spins up multiple queues and writes packets
      to each of the file descriptors
   2. tx from a qemu vm with a tap multiqueue netdev

The queue for rx traffic can be observed by either of the following (done
on the hypervisor in the qemu case).
   1. a simple netmap program which opens and reads from per-queue
      descriptors
   2. configuring RPS and doing per-cpu captures with rxtxcpu

Alternatively, if you printk() the return value of skb_get_rx_queue() just
before each instance of netif_receive_skb() in tun.c, you will get 65535
for every skb.

Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
the association between descriptor and rx queue.

Signed-off-by: Matthew Cover 
---
  drivers/net/tun.c | 7 ++++++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a65779c6d72f..ce8620f3ea5e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 
 	if (!rx_batched || (!more && skb_queue_empty(queue))) {
 		local_bh_disable();
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 		return;
@@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 		struct sk_buff *nskb;
 
 		local_bh_disable();
-		while ((nskb = __skb_dequeue(&process_queue)))
+		while ((nskb = __skb_dequeue(&process_queue))) {
+			skb_record_rx_queue(nskb, tfile->queue_index);
 			netif_receive_skb(nskb);
+		}
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 	}
@@ -2452,6 +2456,7 @@ static int tun_xdp_one(struct tun_struct *tun,
 	    !tfile->detached)
 		rxhash = __skb_get_hash_symmetric(skb);
 
+	skb_record_rx_queue(skb, tfile->queue_index);
 	netif_receive_skb(skb);
 
 	stats = get_cpu_ptr(tun->pcpu_stats);



Acked-by: Jason Wang

[PATCH] [PATCH net-next] tun: fix multiqueue rx

2018-11-15 Thread Matthew Cover

When writing packets to a descriptor associated with a combined queue, the
packets should end up on that queue.

Before this change all packets written to any descriptor associated with a
tap interface end up on rx-0, even when the descriptor is associated with a
different queue.

The rx traffic can be generated by either of the following.
  1. a simple tap program which spins up multiple queues and writes packets
     to each of the file descriptors
  2. tx from a qemu vm with a tap multiqueue netdev

The queue for rx traffic can be observed by either of the following (done
on the hypervisor in the qemu case).
  1. a simple netmap program which opens and reads from per-queue
     descriptors
  2. configuring RPS and doing per-cpu captures with rxtxcpu

Alternatively, if you printk() the return value of skb_get_rx_queue() just
before each instance of netif_receive_skb() in tun.c, you will get 65535
for every skb.

Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
the association between descriptor and rx queue.

Signed-off-by: Matthew Cover 
---
 drivers/net/tun.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a65779c6d72f..ce8620f3ea5e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 
 	if (!rx_batched || (!more && skb_queue_empty(queue))) {
 		local_bh_disable();
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 		return;
@@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 		struct sk_buff *nskb;
 
 		local_bh_disable();
-		while ((nskb = __skb_dequeue(&process_queue)))
+		while ((nskb = __skb_dequeue(&process_queue))) {
+			skb_record_rx_queue(nskb, tfile->queue_index);
 			netif_receive_skb(nskb);
+		}
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 	}
@@ -2452,6 +2456,7 @@ static int tun_xdp_one(struct tun_struct *tun,
 	    !tfile->detached)
 		rxhash = __skb_get_hash_symmetric(skb);
 
+	skb_record_rx_queue(skb, tfile->queue_index);
 	netif_receive_skb(skb);
 
 	stats = get_cpu_ptr(tun->pcpu_stats);
-- 
2.15.2 (Apple Git-101.1)

Re: [PATCH net] sctp: not allow to set asoc prsctp_enable by sockopt

2018-11-15 Thread Xin Long

On Fri, Nov 16, 2018 at 2:22 AM Marcelo Ricardo Leitner wrote:
>
> On Thu, Nov 15, 2018 at 07:14:28PM +0800, Xin Long wrote:
> > As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
> >
> >    This socket option allows the enabling or disabling of the
> >    negotiation of PR-SCTP support for future associations.  For existing
> >    associations, it allows one to query whether or not PR-SCTP support
> >    was negotiated on a particular association.
> >
> > It means only sctp sock's prsctp_enable can be set.
> >
> > Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
> > add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
> > sctp in another patchset.
> >
> > Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
> > Reported-by: Ying Xu 
> > Signed-off-by: Xin Long 
> > ---
> >  net/sctp/socket.c | 13 +++----------
> >  1 file changed, 3 insertions(+), 10 deletions(-)
> >
> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > index 739f3e5..e9b8232 100644
> > --- a/net/sctp/socket.c
> > +++ b/net/sctp/socket.c
> > @@ -3940,7 +3940,6 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
> >  					unsigned int optlen)
> >  {
> >  	struct sctp_assoc_value params;
> > -	struct sctp_association *asoc;
> >  	int retval = -EINVAL;
> >
> >  	if (optlen != sizeof(params))
> > @@ -3951,16 +3950,10 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
> >  		goto out;
> >  	}
> >
> > -	asoc = sctp_id2assoc(sk, params.assoc_id);
> > -	if (asoc) {
> > -		asoc->prsctp_enable = !!params.assoc_value;
> > -	} else if (!params.assoc_id) {
> > -		struct sctp_sock *sp = sctp_sk(sk);
> > -
> > -		sp->ep->prsctp_enable = !!params.assoc_value;
> > -	} else {
> > +	if (sctp_style(sk, UDP) && sctp_id2assoc(sk, params.assoc_id))
I got these semantics from BSD's SCTP_PR_SUPPORTED sockopt:

	SCTP_FIND_STCB(inp, stcb, av->assoc_id);

	if (stcb) {
		SCTP_LTRACE_ERR_RET(...);
		error = EINVAL;
		SCTP_TCB_UNLOCK(stcb);
	} else {
		...
	}

>
> This would allow using a non-existent assoc id on UDP-style sockets to
> set it at the socket, which is not expected. It should be more like:
>
> +   if (sctp_style(sk, UDP) && params.assoc_id)
This way is stricter, but it seems reasonable.

When a user sets params.assoc_id on a UDP-style socket, it should be
taken to mean that they WANT to apply this to an association, which is
not allowed here.
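
The practical difference between the two candidate checks discussed above can be sketched in userspace C. This is an illustration only: the function names and boolean parameters are hypothetical, and `assoc_found` stands in for the result of looking up the association id on the socket.

```c
#include <stdbool.h>

/* First variant: reject only when the id resolves to an existing association. */
static bool rejects_when_assoc_exists(bool udp_style, bool assoc_found)
{
	return udp_style && assoc_found;
}

/* Stricter variant: reject any non-zero association id on a UDP-style socket. */
static bool rejects_when_assoc_id_set(bool udp_style, int assoc_id)
{
	return udp_style && assoc_id != 0;
}
```

For a non-zero id that matches no association on a UDP-style socket, the first variant lets the request through (so the value would be applied at the socket level), while the stricter variant rejects it.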

Re: [PATCH] [PATCH net-next] tun: fix multiqueue rx

2018-11-15 Thread Jason Wang




On 2018/11/16 12:10 PM, Matthew Cover wrote:

When writing packets to a descriptor associated with a combined queue, the
packets should end up on that queue.

Before this change all packets written to any descriptor associated with a
tap interface end up on rx-0, even when the descriptor is associated with a
different queue.

The rx traffic can be generated by either of the following.
   1. a simple tap program which spins up multiple queues and writes packets
      to each of the file descriptors
   2. tx from a qemu vm with a tap multiqueue netdev

The queue for rx traffic can be observed by either of the following (done
on the hypervisor in the qemu case).
   1. a simple netmap program which opens and reads from per-queue
      descriptors
   2. configuring RPS and doing per-cpu captures with rxtxcpu

Alternatively, if you printk() the return value of skb_get_rx_queue() just
before each instance of netif_receive_skb() in tun.c, you will get 65535
for every skb.

Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
the association between descriptor and rx queue.

Signed-off-by: Matthew Cover 
---
  drivers/net/tun.c | 6 +++++-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a65779c6d72f..4e306ff3501c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 
 	if (!rx_batched || (!more && skb_queue_empty(queue))) {
 		local_bh_disable();
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 		return;
@@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 		struct sk_buff *nskb;
 
 		local_bh_disable();
-		while ((nskb = __skb_dequeue(&process_queue)))
+		while ((nskb = __skb_dequeue(&process_queue))) {
+			skb_record_rx_queue(nskb, tfile->queue_index);
 			netif_receive_skb(nskb);
+		}
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 	}



Thanks for the fix. Actually, there's another path which needs to be
fixed as well, in tun_xdp_one(). This path is used by vhost to pass a
batch of packets.
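
Note: the 65535 reading mentioned in the commit message can be reproduced with a small userspace model. This is a sketch, assuming the kernel helpers store the queue index biased by one in a 16-bit field so that 0 means "nothing recorded"; `struct toy_skb` and the `toy_*` names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the one sk_buff field that matters here. */
struct toy_skb {
	uint16_t queue_mapping;  /* 0 means "no rx queue recorded" */
};

/* Store the queue index biased by one. */
static void toy_record_rx_queue(struct toy_skb *skb, uint16_t rx_queue)
{
	skb->queue_mapping = (uint16_t)(rx_queue + 1);
}

/* Undo the bias; wraps around to 65535 when nothing was recorded. */
static uint16_t toy_get_rx_queue(const struct toy_skb *skb)
{
	return (uint16_t)(skb->queue_mapping - 1);
}
```

Reading the queue from a zero-initialized `toy_skb` gives 65535; after `toy_record_rx_queue(&skb, 3)` the same read gives 3.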

Re: [Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-15 Thread Herbert Xu

On Thu, Nov 15, 2018 at 08:52:23PM -0800, Eric Dumazet wrote:
>
> It is very possible NIC provides an incorrect CHECKSUM_COMPLETE, in the
> case non zero trailer bytes were added by a buggy switch (or host)

We should probably change netdev_rx_csum_fault to print out at
least one complete packet plus the hardware-generated checksum.

That would make debugging these rare hardware faults much easier.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-15 Thread Eric Dumazet




On 11/15/2018 06:23 PM, Cong Wang wrote:
> On Thu, Nov 15, 2018 at 5:52 PM Herbert Xu wrote:
>>
>> On Thu, Nov 15, 2018 at 03:16:02PM -0800, Cong Wang wrote:
>>> The following evidence indicates this check is likely wrong:
>>>
>>> 1. In the assignment "skb->csum_valid = !sum", sum==0 indicates a valid checksum.
>>>
>>> 2. __skb_checksum_complete() always returns sum, and TCP packets are dropped
>>>    only when it returns non-zero. So non-zero indicates a failure.
>>>
>>> 3. In __skb_checksum_validate_complete(), we have a nearly identical check, where
>>>    zero is considered as success.
>>>
>>> 4. csum_fold() already does the one’s complement, which indicates 0 should
>>>    be considered as a successful validation.
>>>
>>> 5. We have triggered this fault many times, but the InCsumErrors field in
>>>    /proc/net/snmp remains 0.
>>>
>>> Based on the above, non-zero should be used as a checksum mismatch.
>>>
>>> I tested this with mlx5 driver, no warning or InCsumErrors after 1 hour.
>>>
>>> Fixes: fb286bb2990a ("[NET]: Detect hardware rx checksum faults correctly")
>>> Cc: Herbert Xu 
>>> Cc: Tom Herbert 
>>> Cc: Eric Dumazet 
>>> Signed-off-by: Cong Wang 
>>> ---
>>>  net/core/datagram.c | 4 ++--
>>>  net/core/dev.c  | 2 +-
>>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/core/datagram.c b/net/core/datagram.c
>>> index 57f3a6fcfc1e..e542a9a212a7 100644
>>> --- a/net/core/datagram.c
>>> +++ b/net/core/datagram.c
>>> @@ -733,7 +733,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
>>>  	__sum16 sum;
>>>
>>>  	sum = csum_fold(skb_checksum(skb, 0, len, skb->csum));
>>> -	if (likely(!sum)) {
>>> +	if (unlikely(sum)) {
>>>  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>>>  		    !skb->csum_complete_sw)
>>>  			netdev_rx_csum_fault(skb->dev);
>>
>> Normally if the hardware's partial checksum is valid then we just
>> trust it and send the packet along.  However, if the partial
>> checksum is invalid we don't trust it and we will compute the
>> whole checksum manually which is what ends up in sum.
> 
> Not sure if I understand partial checksum here, but it is the
> CHECKSUM_COMPLETE case which I am trying to fix, not
> CHECKSUM_PARTIAL.
> 
> Or you mean the checksum returned by skb_checksum(), that is,
> checksum from skb->data to skb->data+skb->len.
> 
> If neither, I am confused.
> 
>>
>> netdev_rx_csum_fault is meant to warn about the situation where
>> a packet with a valid checksum (i.e., sum == 0) was given to us
>> by the hardware with a partial checksum that was invalid.
>>
>> So changing it to sum here is wrong.
>>
> 
> So, in other words, a checksum *match* is what is intended to detect
> this HW RX checksum fault?
> 
> What has been changed in between skb_checksum_init() and
> tcp_checksum_complete() so that the logic is inverted?
> 
> Looks like I am missing something too obvious to understand the logic. :-/
> 
> 
> 
>> Can you give more information as to how you got the warnings with
>> mlx5? It sounds like there may be a real bug there because if you
>> are getting the warning then it means that a packet with an invalid
>> hardware-computed partial checksum passed the manual check and
>> was actually valid.  This implies that either the hardware or the
>> driver is broken.
> 
> Sure, my case is nearly the same as Pawel's, except I have no vlan:
> https://marc.info/?l=linux-netdev&m=154086647601721&w=2
> 
> None of us has RXFCS, if you are curious whether Eric's fix works
> for us.
> 
> There are also a few other reports with conntrack involved:
> https://marc.info/?l=linux-netdev&m=154134983130200&w=2
> https://marc.info/?l=linux-netdev&m=154070099731902&w=2


It is very possible that the NIC provides an incorrect CHECKSUM_COMPLETE in the
case where non-zero trailer bytes were added by a buggy switch (or host).

Saeed can comment/confirm, but the theory is that the NIC does header analysis and
computes a checksum only on the bytes of the IP frame, not including the tail bytes
that were added by a switch.
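
The arithmetic behind this theory can be illustrated with a userspace ones' complement checksum. This is a minimal sketch, not the kernel or NIC implementation: the `toy_*` helpers are hypothetical, the buffer is summed as big-endian 16-bit words, and chaining two calls through `sum` assumes the first buffer has an even length.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of a byte buffer, added onto an existing accumulator. */
static uint32_t toy_csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)  /* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[i] << 8;
	return sum;
}

/* Fold a 32-bit accumulator down to 16 bits and complement it. */
static uint16_t toy_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Summing a frame and then zero padding leaves the folded result unchanged, whereas summing non-zero trailer bytes changes it. A checksum computed only over the IP frame therefore disagrees with one computed over the whole received buffer exactly when the trailer contributes a non-zero sum.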

You could use trafgen to cook such a frame and confirm the theory.

Something like :

{
  0x00, 0x1a, 0x11, 0xc3, 0x0d, 0x45,  # MAC Destination
  0x00, 0x12, 0xc0, 0x02, 0xac, 0x5a,  # MAC Source
  const16(0x0800),

  /* IPv4 Version, IHL, TOS */
  0b01000101, 0,
  /* IPv4 Total Len */
  const16(40),
  /* IPv4 Ident */
  //drnd(2),
  const16(2),

  /* IPv4 Flags, Frag Off */
  0b01000000, 0,
  /* IPv4 TTL */
  64,
  /* Proto TCP */
  0x06,
  /* IPv4 Checksum (IP header from, to) */
  csumip(14, 33),

  7, drnd(3), # Source IP
  10,246,7,152,   # Dest IP

  /* TCP Source Port */
  drnd(2),
  /* TCP Dest Port */
  const16(80),
  /* TCP Sequence Number */
  drnd(4),
  /* TCP Ackn. Number */
  c32(0),

  /* TCP Header length + Flags */
  const16((0x5 << 12) | 2),  /* TCP SYN Flag */

  /* Window Size */
  const16(16),
  /* TCP Checksum (offset IP, offset TCP) */
  csumtcp(14, 34),

Re: [Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-15 Thread Herbert Xu

On Thu, Nov 15, 2018 at 06:23:38PM -0800, Cong Wang wrote:
>
> > Normally if the hardware's partial checksum is valid then we just
> > trust it and send the packet along.  However, if the partial
> > checksum is invalid we don't trust it and we will compute the
> > whole checksum manually which is what ends up in sum.
> 
> Not sure if I understand partial checksum here, but it is the
> CHECKSUM_COMPLETE case which I am trying to fix, not
> CHECKSUM_PARTIAL.

What I meant by partial checksum is the checksum produced by the
hardware on RX.  In the kernel we call that CHECKSUM_COMPLETE.
CHECKSUM_PARTIAL is the absence of the substantial part of the
checksum which is something we use in the kernel primarily for TX.

Yes the names are confusing :)

> So, in other words, a checksum *match* is what is intended to detect
> this HW RX checksum fault?

Correct.  Or, more likely, it's a bug either in the driver or, if
there is overlaying code such as VLAN, in that code.

Basically if the RX checksum is buggy, it's much more likely to
cause a valid packet to be rejected than to cause an invalid packet
to be accepted, because we still verify that checksum against the
pseudoheader.  So we only attempt to catch buggy hardware/drivers
by doing a second manual verification for the case where the packet
is flagged as invalid.
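
The condition described above can be modelled in a few lines of userspace C. This is a sketch only: the enum, the function name and its parameters are hypothetical and merely mirror the shape of the check quoted in the patch (folded software sum, `ip_summed`, `csum_complete_sw`).

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_ip_summed { TOY_CHECKSUM_NONE, TOY_CHECKSUM_COMPLETE };

/*
 * Returns true when the rx checksum fault warning would fire in this model:
 * the manually computed folded sum says the packet is valid (0), although the
 * packet only reached the manual check because the hardware-provided checksum
 * did not validate.
 */
static bool toy_rx_csum_fault(uint16_t sw_folded_sum,
			      enum toy_ip_summed ip_summed,
			      bool csum_complete_sw)
{
	return sw_folded_sum == 0 &&
	       ip_summed == TOY_CHECKSUM_COMPLETE &&
	       !csum_complete_sw;
}
```

In this model a non-zero software sum never raises the warning; it is simply a bad packet. The warning is reserved for the case where software says "valid" but the hardware value disagreed.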

> Sure, my case is nearly the same as Pawel's, except I have no vlan:
> https://marc.info/?l=linux-netdev&m=154086647601721&w=2

Can you please provide your backtrace?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH] [PATCH net-next] tun: fix multiqueue rx

2018-11-15 Thread Matthew Cover

When writing packets to a descriptor associated with a combined queue, the
packets should end up on that queue.

Before this change all packets written to any descriptor associated with a
tap interface end up on rx-0, even when the descriptor is associated with a
different queue.

The rx traffic can be generated by either of the following.
  1. a simple tap program which spins up multiple queues and writes packets
     to each of the file descriptors
  2. tx from a qemu vm with a tap multiqueue netdev

The queue for rx traffic can be observed by either of the following (done
on the hypervisor in the qemu case).
  1. a simple netmap program which opens and reads from per-queue
     descriptors
  2. configuring RPS and doing per-cpu captures with rxtxcpu

Alternatively, if you printk() the return value of skb_get_rx_queue() just
before each instance of netif_receive_skb() in tun.c, you will get 65535
for every skb.

Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
the association between descriptor and rx queue.

Signed-off-by: Matthew Cover 
---
 drivers/net/tun.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a65779c6d72f..4e306ff3501c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 
 	if (!rx_batched || (!more && skb_queue_empty(queue))) {
 		local_bh_disable();
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 		return;
@@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 		struct sk_buff *nskb;
 
 		local_bh_disable();
-		while ((nskb = __skb_dequeue(&process_queue)))
+		while ((nskb = __skb_dequeue(&process_queue))) {
+			skb_record_rx_queue(nskb, tfile->queue_index);
 			netif_receive_skb(nskb);
+		}
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 	}
-- 
2.15.2 (Apple Git-101.1)

[PATCH net-next] cxgb4: Remove SGE_HOST_PAGE_SIZE dependency on page size

2018-11-15 Thread Arjun Vynipadath

The SGE Host Page Size has nothing to do with the actual
Host Page Size. It's the SGE's BAR2 Doorbell/GTS Page Size
for interpreting the SGE Ingress/Egress Queue per Page values.
Firmware reads all of these things and makes all the
subsequent changes necessary. The Host Driver uses the SGE
Host Page Size in order to properly calculate BAR2 Offsets.

Signed-off-by: Casey Leedom 
Signed-off-by: Arjun Vynipadath 
Signed-off-by: Ganesh Goudar 
---
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index cb52394..fc6a087 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -7141,21 +7141,10 @@ int t4_fixup_host_params(struct adapter *adap, unsigned int page_size,
 			 unsigned int cache_line_size)
 {
 	unsigned int page_shift = fls(page_size) - 1;
-	unsigned int sge_hps = page_shift - 10;
 	unsigned int stat_len = cache_line_size > 64 ? 128 : 64;
 	unsigned int fl_align = cache_line_size < 32 ? 32 : cache_line_size;
 	unsigned int fl_align_log = fls(fl_align) - 1;
 
-	t4_write_reg(adap, SGE_HOST_PAGE_SIZE_A,
-		     HOSTPAGESIZEPF0_V(sge_hps) |
-		     HOSTPAGESIZEPF1_V(sge_hps) |
-		     HOSTPAGESIZEPF2_V(sge_hps) |
-		     HOSTPAGESIZEPF3_V(sge_hps) |
-		     HOSTPAGESIZEPF4_V(sge_hps) |
-		     HOSTPAGESIZEPF5_V(sge_hps) |
-		     HOSTPAGESIZEPF6_V(sge_hps) |
-		     HOSTPAGESIZEPF7_V(sge_hps));
-
 	if (is_t4(adap->params.chip)) {
 		t4_set_reg_field(adap, SGE_CONTROL_A,
 				 INGPADBOUNDARY_V(INGPADBOUNDARY_M) |
-- 
2.9.5
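
For reference, the per-PF value that the removed lines derived from the host page size can be reproduced in userspace. This is a sketch under stated assumptions: `toy_fls()` is a portable stand-in for the kernel's `fls()` (1-based index of the highest set bit, 0 for an input of 0), `toy_sge_hps()` is a made-up name, and the page size is assumed to be a power of two of at least 1024 so the subtraction does not underflow.

```c
/* Portable stand-in for fls(): 1-based index of the highest set bit, 0 for 0. */
static unsigned int toy_fls(unsigned int x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/* The value the removed lines computed for every PF: log2(page_size) - 10. */
static unsigned int toy_sge_hps(unsigned int page_size)
{
	unsigned int page_shift = toy_fls(page_size) - 1;

	return page_shift - 10;
}
```

A 4 KiB page gives 2 and a 64 KiB page gives 6, which shows how the removed code tied the register contents to the host page size.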

Re: [Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-15 Thread Cong Wang

On Thu, Nov 15, 2018 at 5:52 PM Herbert Xu  wrote:
>
> On Thu, Nov 15, 2018 at 03:16:02PM -0800, Cong Wang wrote:
> > The following evidence indicates this check is likely wrong:
> >
> > 1. In the assignment "skb->csum_valid = !sum", sum==0 indicates a valid checksum.
> >
> > 2. __skb_checksum_complete() always returns sum, and TCP packets are dropped
> >    only when it returns non-zero. So non-zero indicates a failure.
> >
> > 3. In __skb_checksum_validate_complete(), we have a nearly identical check, where
> >    zero is considered as success.
> >
> > 4. csum_fold() already does the one’s complement, which indicates 0 should
> >    be considered as a successful validation.
> >
> > 5. We have triggered this fault many times, but the InCsumErrors field in
> >    /proc/net/snmp remains 0.
> >
> > Based on the above, non-zero should be used as a checksum mismatch.
> >
> > I tested this with mlx5 driver, no warning or InCsumErrors after 1 hour.
> >
> > Fixes: fb286bb2990a ("[NET]: Detect hardware rx checksum faults correctly")
> > Cc: Herbert Xu 
> > Cc: Tom Herbert 
> > Cc: Eric Dumazet 
> > Signed-off-by: Cong Wang 
> > ---
> >  net/core/datagram.c | 4 ++--
> >  net/core/dev.c  | 2 +-
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/core/datagram.c b/net/core/datagram.c
> > index 57f3a6fcfc1e..e542a9a212a7 100644
> > --- a/net/core/datagram.c
> > +++ b/net/core/datagram.c
> > @@ -733,7 +733,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
> >  	__sum16 sum;
> >
> >  	sum = csum_fold(skb_checksum(skb, 0, len, skb->csum));
> > -	if (likely(!sum)) {
> > +	if (unlikely(sum)) {
> >  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
> >  		    !skb->csum_complete_sw)
> >  			netdev_rx_csum_fault(skb->dev);
>
> Normally if the hardware's partial checksum is valid then we just
> trust it and send the packet along.  However, if the partial
> checksum is invalid we don't trust it and we will compute the
> whole checksum manually which is what ends up in sum.

Not sure if I understand partial checksum here, but it is the
CHECKSUM_COMPLETE case which I am trying to fix, not
CHECKSUM_PARTIAL.

Or you mean the checksum returned by skb_checksum(), that is,
checksum from skb->data to skb->data+skb->len.

If neither, I am confused.

>
> netdev_rx_csum_fault is meant to warn about the situation where
> a packet with a valid checksum (i.e., sum == 0) was given to us
> by the hardware with a partial checksum that was invalid.
>
> So changing it to sum here is wrong.
>

So, in other words, a checksum *match* is what is intended to detect
this HW RX checksum fault?

What has been changed in between skb_checksum_init() and
tcp_checksum_complete() so that the logic is inverted?

Looks like I am missing something too obvious to understand the logic. :-/



> Can you give more information as to how you got the warnings with
> mlx5? It sounds like there may be a real bug there because if you
> are getting the warning then it means that a packet with an invalid
> hardware-computed partial checksum passed the manual check and
> was actually valid.  This implies that either the hardware or the
> driver is broken.

Sure, my case is nearly the same as Pawel's, except I have no vlan:
https://marc.info/?l=linux-netdev&m=154086647601721&w=2

None of us has RXFCS, if you are curious whether Eric's fix works
for us.

There are also a few other reports with conntrack involved:
https://marc.info/?l=linux-netdev&m=154134983130200&w=2
https://marc.info/?l=linux-netdev&m=154070099731902&w=2

Thanks.

Re: [PATCH net-next 6/8] net: eth: altera: tse: add support for ptp and timestamping

2018-11-15 Thread Richard Cochran

On Thu, Nov 15, 2018 at 06:55:29AM -0800, Dalon Westergreen wrote:
> Sure, I would like to keep the debugfs entries for disabling freq correction, and
> reading the current scaled_ppm value.  I intend to use these to tune an external
> vcxo.  If there is a better way to do this, please let me know.

Yes, there is.  The external VCXO should be a proper PHC.  Then, with
a minor change to the linuxptp stack (already in the pipe), you can
just use that.

You should not disable frequency correction in the driver.  Leave that
decision to the user space PTP stack.
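
For context on the scaled_ppm value mentioned in the quote: it is a frequency offset in parts per million carrying a 16-bit binary fraction. A minimal userspace sketch of converting it to parts per billion follows; `toy_scaled_ppm_to_ppb()` is a hypothetical name, and it truncates toward zero, so a kernel helper doing the same job may round slightly differently.

```c
#include <stdint.h>

/* Convert scaled_ppm (ppm with a 16-bit binary fraction) to parts per billion. */
static int64_t toy_scaled_ppm_to_ppb(int64_t scaled_ppm)
{
	return scaled_ppm * 1000 / 65536;
}
```

A scaled_ppm of 65536 corresponds to exactly 1 ppm, that is 1000 ppb.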

> I would prefer to keep altera just to be consistent with the altera_tse stuff,
> and I intend to reuse this code for a 10GbE driver, so perhaps altera_tod to
> reference the fpga ip name?

So the IP core is called "tod"?  Really?

Thanks,
Richard

Re: [Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-15 Thread Herbert Xu

On Thu, Nov 15, 2018 at 03:16:02PM -0800, Cong Wang wrote:
> The following evidence indicates this check is likely wrong:
> 
> 1. In the assignment "skb->csum_valid = !sum", sum==0 indicates a valid checksum.
> 
> 2. __skb_checksum_complete() always returns sum, and TCP packets are dropped
>    only when it returns non-zero. So non-zero indicates a failure.
> 
> 3. In __skb_checksum_validate_complete(), we have a nearly identical check, where
>    zero is considered as success.
> 
> 4. csum_fold() already does the one’s complement, which indicates 0 should
>    be considered as a successful validation.
> 
> 5. We have triggered this fault many times, but the InCsumErrors field in
>    /proc/net/snmp remains 0.
> 
> Based on the above, non-zero should be used as a checksum mismatch.
> 
> I tested this with mlx5 driver, no warning or InCsumErrors after 1 hour.
> 
> Fixes: fb286bb2990a ("[NET]: Detect hardware rx checksum faults correctly")
> Cc: Herbert Xu 
> Cc: Tom Herbert 
> Cc: Eric Dumazet 
> Signed-off-by: Cong Wang 
> ---
>  net/core/datagram.c | 4 ++--
>  net/core/dev.c  | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/datagram.c b/net/core/datagram.c
> index 57f3a6fcfc1e..e542a9a212a7 100644
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -733,7 +733,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
>  	__sum16 sum;
>  
>  	sum = csum_fold(skb_checksum(skb, 0, len, skb->csum));
> -	if (likely(!sum)) {
> +	if (unlikely(sum)) {
>  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>  		    !skb->csum_complete_sw)
>  			netdev_rx_csum_fault(skb->dev);

Normally if the hardware's partial checksum is valid then we just
trust it and send the packet along.  However, if the partial
checksum is invalid we don't trust it and we will compute the
whole checksum manually which is what ends up in sum.

netdev_rx_csum_fault is meant to warn about the situation where
a packet with a valid checksum (i.e., sum == 0) was given to us
by the hardware with a partial checksum that was invalid.

So changing it to sum here is wrong.

Can you give more information as to how you got the warnings with
mlx5? It sounds like there may be a real bug there because if you
are getting the warning then it means that a packet with an invalid
hardware-computed partial checksum passed the manual check and
was actually valid.  This implies that either the hardware or the
driver is broken.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

[PATCH 09/10] flow_dissector: add basic ethtool_rx_flow_spec to flow_rule structure translator

2018-11-15 Thread Pablo Neira Ayuso

This patch adds a function to translate the ethtool_rx_flow_spec
structure to the flow_rule representation.

This allows us to reuse code from the driver side given that both flower
and ethtool_rx_flow interfaces use the same representation.

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/flow_dissector.h |   5 ++
 net/core/flow_dissector.c| 190 +++
 2 files changed, 195 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 7a4683646d5a..ec9036232538 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -485,4 +485,9 @@ static inline bool flow_rule_match_key(const struct flow_rule *rule,
 	return dissector_uses_key(rule->match.dissector, key);
 }
 
+struct ethtool_rx_flow_spec;
+
+struct flow_rule *ethtool_rx_flow_rule(const struct ethtool_rx_flow_spec *fs);
+void ethtool_rx_flow_rule_free(struct flow_rule *rule);
+
 #endif
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index b9368349f0f7..ef5bdb62620c 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -276,6 +277,195 @@ void flow_action_free(struct flow_action *flow_action)
 }
 EXPORT_SYMBOL(flow_action_free);
 
+struct ethtool_rx_flow_key {
+   struct flow_dissector_key_basic basic;
+   union {
+   struct flow_dissector_key_ipv4_addrs ipv4;
+   struct flow_dissector_key_ipv6_addrs ipv6;
+   };
+   struct flow_dissector_key_ports tp;
+   struct flow_dissector_key_ip ip;
+} __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */
+
+struct ethtool_rx_flow_match {
+   struct flow_dissector   dissector;
+   struct ethtool_rx_flow_key  key;
+   struct ethtool_rx_flow_key  mask;
+};
+
+struct flow_rule *ethtool_rx_flow_rule(const struct ethtool_rx_flow_spec *fs)
+{
+   static struct in6_addr zero_addr = {};
+   struct ethtool_rx_flow_match *match;
+   struct flow_action_key *act;
+   struct flow_rule *rule;
+
+   rule = kmalloc(sizeof(struct flow_rule), GFP_KERNEL);
+   if (!rule)
+   return NULL;
+
+   match = kzalloc(sizeof(struct ethtool_rx_flow_match), GFP_KERNEL);
+   if (!match)
+   goto err_match;
+
+   rule->match.dissector   = &match->dissector;
+   rule->match.mask= &match->mask;
+   rule->match.key = &match->key;
+
+   match->mask.basic.n_proto = 0xffff;
+
+   switch (fs->flow_type & ~FLOW_EXT) {
+   case TCP_V4_FLOW:
+   case UDP_V4_FLOW: {
+   const struct ethtool_tcpip4_spec *v4_spec, *v4_m_spec;
+
+   match->key.basic.n_proto = htons(ETH_P_IP);
+
+   v4_spec = &fs->h_u.tcp_ip4_spec;
+   v4_m_spec = &fs->m_u.tcp_ip4_spec;
+
+   if (v4_m_spec->ip4src) {
+   match->key.ipv4.src = v4_spec->ip4src;
+   match->mask.ipv4.src = v4_m_spec->ip4src;
+   }
+   if (v4_m_spec->ip4dst) {
+   match->key.ipv4.dst = v4_spec->ip4dst;
+   match->mask.ipv4.dst = v4_m_spec->ip4dst;
+   }
+   if (v4_m_spec->ip4src ||
+   v4_m_spec->ip4dst) {
+   match->dissector.used_keys |=
+   BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS);
+   match->dissector.offset[FLOW_DISSECTOR_KEY_IPV4_ADDRS] =
+   offsetof(struct ethtool_rx_flow_key, ipv4);
+   }
+   if (v4_m_spec->psrc) {
+   match->key.tp.src = v4_spec->psrc;
+   match->mask.tp.src = v4_m_spec->psrc;
+   }
+   if (v4_m_spec->pdst) {
+   match->key.tp.dst = v4_spec->pdst;
+   match->mask.tp.dst = v4_m_spec->pdst;
+   }
+   if (v4_m_spec->psrc ||
+   v4_m_spec->pdst) {
+   match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_PORTS);
+   match->dissector.offset[FLOW_DISSECTOR_KEY_PORTS] =
+   offsetof(struct ethtool_rx_flow_key, tp);
+   }
+   if (v4_m_spec->tos) {
+   match->key.ip.tos = v4_spec->tos;
+   match->mask.ip.tos = v4_m_spec->tos;
+   match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_IP);
+   match->dissector.offset[FLOW_DISSECTOR_KEY_IP] =
+   offsetof(struct ethtool_rx_flow_key, ip);
+   }
+   }
+   break;
+   case TCP_V6_FLOW:
+   case UDP_V6_FLOW: {
+   const struct ethtool_tcpip6_spec *v6_spec, *v6_m_spec;
+
+

[PATCH 04/10] cls_api: add translator to flow_action representation

2018-11-15 Thread Pablo Neira Ayuso

This patch implements a new function to translate from native TC action
to the new flow_action representation. Moreover, this patch also updates
cls_flower to use this new function.
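
The translation can be pictured as two passes over the action list: one
to count how many flow_action keys are needed (a pedit action expands to
one key per edit), and one to fill them in. The following is a minimal
userspace sketch of that shape only. Every sketch_* name is hypothetical,
and the action kinds are reduced to two for brevity.

```c
#include <stdlib.h>

enum sketch_tc_kind { SKETCH_TC_DROP, SKETCH_TC_PEDIT };

struct sketch_tc_action {
	enum sketch_tc_kind kind;
	int nkeys;              /* number of edits, only meaningful for pedit */
};

enum sketch_key_id { SKETCH_KEY_DROP, SKETCH_KEY_MANGLE };

struct sketch_key {
	enum sketch_key_id id;
	int edit;               /* which pedit sub-key this entry came from */
};

/*
 * Returns the number of keys written to *out (to be released with free()),
 * 0 if there is nothing to translate, or -1 on allocation failure.
 */
static int sketch_translate(const struct sketch_tc_action *acts, int n,
			    struct sketch_key **out)
{
	struct sketch_key *keys;
	int num = 0, i, j = 0, k;

	*out = NULL;

	/* Pass 1: count, expanding each pedit into one key per edit. */
	for (i = 0; i < n; i++)
		num += acts[i].kind == SKETCH_TC_PEDIT ? acts[i].nkeys : 1;
	if (!num)
		return 0;

	keys = calloc(num, sizeof(*keys));
	if (!keys)
		return -1;

	/* Pass 2: fill the keys in the same order as the source actions. */
	for (i = 0; i < n; i++) {
		if (acts[i].kind == SKETCH_TC_PEDIT) {
			for (k = 0; k < acts[i].nkeys; k++, j++) {
				keys[j].id = SKETCH_KEY_MANGLE;
				keys[j].edit = k;
			}
		} else {
			keys[j++].id = SKETCH_KEY_DROP;
		}
	}

	*out = keys;
	return num;
}
```

Counting first keeps the allocation to a single call, at the cost of
walking the action list twice.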

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/pkt_cls.h  |   3 ++
 net/sched/cls_api.c| 113 +
 net/sched/cls_flower.c |  15 ++-
 3 files changed, 130 insertions(+), 1 deletion(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index ab36ac9e5967..667549050f50 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -619,6 +619,9 @@ tcf_match_indev(struct sk_buff *skb, int ifindex)
 }
 #endif /* CONFIG_NET_CLS_IND */
 
+int tc_setup_flow_action(struct flow_action *flow_action,
+const struct tcf_exts *exts);
+
 int tc_setup_cb_call(struct tcf_block *block, struct tcf_exts *exts,
 enum tc_setup_type type, void *type_data, bool err_stop);
 
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index d92f44ac4c39..6ab44e650f43 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -31,6 +31,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 
 extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
 
@@ -2567,6 +2575,111 @@ int tc_setup_cb_call(struct tcf_block *block, struct 
tcf_exts *exts,
 }
 EXPORT_SYMBOL(tc_setup_cb_call);
 
+int tc_setup_flow_action(struct flow_action *flow_action,
+const struct tcf_exts *exts)
+{
+   const struct tc_action *act;
+   int num_acts = 0, i, j, k;
+
+   if (!exts)
+   return 0;
+
+   tcf_exts_for_each_action(i, act, exts) {
+   if (is_tcf_pedit(act))
+   num_acts += tcf_pedit_nkeys(act);
+   else
+   num_acts++;
+   }
+   if (!num_acts)
+   return 0;
+
+   if (flow_action_init(flow_action, num_acts) < 0)
+   return -ENOMEM;
+
+   j = 0;
+   tcf_exts_for_each_action(i, act, exts) {
+   struct flow_action_key *key;
+
+   key = &flow_action->keys[j];
+   if (is_tcf_gact_ok(act)) {
+   key->id = FLOW_ACTION_KEY_ACCEPT;
+   } else if (is_tcf_gact_shot(act)) {
+   key->id = FLOW_ACTION_KEY_DROP;
+   } else if (is_tcf_gact_trap(act)) {
+   key->id = FLOW_ACTION_KEY_TRAP;
+   } else if (is_tcf_gact_goto_chain(act)) {
+   key->id = FLOW_ACTION_KEY_GOTO;
+   key->chain_index = tcf_gact_goto_chain_index(act);
+   } else if (is_tcf_mirred_egress_redirect(act)) {
+   key->id = FLOW_ACTION_KEY_REDIRECT;
+   key->dev = tcf_mirred_dev(act);
+   } else if (is_tcf_mirred_egress_mirror(act)) {
+   key->id = FLOW_ACTION_KEY_MIRRED;
+   key->dev = tcf_mirred_dev(act);
+   } else if (is_tcf_vlan(act)) {
+   switch (tcf_vlan_action(act)) {
+   case TCA_VLAN_ACT_PUSH:
+   key->id = FLOW_ACTION_KEY_VLAN_PUSH;
+   key->vlan.vid = tcf_vlan_push_vid(act);
+   key->vlan.proto = tcf_vlan_push_proto(act);
+   key->vlan.prio = tcf_vlan_push_prio(act);
+   break;
+   case TCA_VLAN_ACT_POP:
+   key->id = FLOW_ACTION_KEY_VLAN_POP;
+   break;
+   case TCA_VLAN_ACT_MODIFY:
+   key->id = FLOW_ACTION_KEY_VLAN_MANGLE;
+   key->vlan.vid = tcf_vlan_push_vid(act);
+   key->vlan.proto = tcf_vlan_push_proto(act);
+   key->vlan.prio = tcf_vlan_push_prio(act);
+   break;
+   default:
+   goto err_out;
+   }
+   } else if (is_tcf_tunnel_set(act)) {
+   key->id = FLOW_ACTION_KEY_TUNNEL_ENCAP;
+   key->tunnel = tcf_tunnel_info(act);
+   } else if (is_tcf_tunnel_release(act)) {
+   key->id = FLOW_ACTION_KEY_TUNNEL_DECAP;
+   key->tunnel = tcf_tunnel_info(act);
+   } else if (is_tcf_pedit(act)) {
+   for (k = 0; k < tcf_pedit_nkeys(act); k++) {
+   switch (tcf_pedit_cmd(act, k)) {
+   case TCA_PEDIT_KEY_EX_CMD_SET:
+   key->id = FLOW_ACTION_KEY_MANGLE;
+   break;
+   case TCA_PEDIT_KEY_EX_CMD_ADD:
+

[PATCH 02/10] net/mlx5e: support for two independent packet edit actions

2018-11-15 Thread Pablo Neira Ayuso

This patch adds pedit_headers_action structure to store the result of
parsing tc pedit actions. Then, it calls alloc_tc_pedit_action() to
populate the mlx5e hardware intermediate representation once all actions
have been parsed.

This patch comes in preparation for the new flow_action infrastructure,
where each packet mangling comes in a separate action, i.e. not packed
as in tc pedit.
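
To illustrate the idea of accumulating edits per command and sizing the
hardware action list only once everything has been parsed, here is a small
userspace sketch. The sketch_* names, the fixed four-word header buffer
and the point at which the pedits counter is incremented are assumptions
made for this example; they are not taken from the mlx5e code.

```c
#include <stdint.h>

enum sketch_cmd { SKETCH_CMD_SET, SKETCH_CMD_ADD, SKETCH_CMD_MAX };

#define SKETCH_HDR_WORDS 4

struct sketch_hdrs_action {
	uint32_t vals[SKETCH_HDR_WORDS];
	uint32_t masks[SKETCH_HDR_WORDS];
	uint32_t pedits;        /* edits accumulated for this command */
};

/* Record one edit; refuse bits that an earlier edit already claimed. */
static int sketch_set_val(struct sketch_hdrs_action *hdrs, unsigned int word,
			  uint32_t mask, uint32_t val)
{
	if (word >= SKETCH_HDR_WORDS || (hdrs->masks[word] & mask))
		return -1;

	hdrs->masks[word] |= mask;
	hdrs->vals[word] |= val & mask;
	hdrs->pedits++;
	return 0;
}

/* After all actions are parsed: total number of edits to allocate for. */
static uint32_t sketch_total_edits(const struct sketch_hdrs_action *hdrs)
{
	return hdrs[SKETCH_CMD_SET].pedits + hdrs[SKETCH_CMD_ADD].pedits;
}
```

Because the state lives in a caller-owned array indexed by command,
several independent mangle actions can feed the same accumulator before
anything is allocated.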

Signed-off-by: Pablo Neira Ayuso 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 81 ++---
 1 file changed, 59 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index a93ec9214bea..1b59982ed450 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1748,6 +1748,12 @@ struct pedit_headers {
struct udphdr  udp;
 };
 
+struct pedit_headers_action {
+   struct pedit_headers vals;
+   struct pedit_headers masks;
+   u32 pedits;
+};
+
 static int pedit_header_offsets[] = {
[TCA_PEDIT_KEY_EX_HDR_TYPE_ETH] = offsetof(struct pedit_headers, eth),
[TCA_PEDIT_KEY_EX_HDR_TYPE_IP4] = offsetof(struct pedit_headers, ip4),
@@ -1759,16 +1765,15 @@ static int pedit_header_offsets[] = {
 #define pedit_header(_ph, _htype) ((void *)(_ph) + 
pedit_header_offsets[_htype])
 
 static int set_pedit_val(u8 hdr_type, u32 mask, u32 val, u32 offset,
-struct pedit_headers *masks,
-struct pedit_headers *vals)
+struct pedit_headers_action *hdrs)
 {
u32 *curr_pmask, *curr_pval;
 
if (hdr_type >= __PEDIT_HDR_TYPE_MAX)
goto out_err;
 
-   curr_pmask = (u32 *)(pedit_header(masks, hdr_type) + offset);
-   curr_pval  = (u32 *)(pedit_header(vals, hdr_type) + offset);
+   curr_pmask = (u32 *)(pedit_header(&hdrs->masks, hdr_type) + offset);
+   curr_pval  = (u32 *)(pedit_header(&hdrs->vals, hdr_type) + offset);
 
if (*curr_pmask & mask)  /* disallow acting twice on the same location 
*/
goto out_err;
@@ -1824,8 +1829,7 @@ static struct mlx5_fields fields[] = {
  * max from the SW pedit action. On success, it says how many HW actions were
  * actually parsed.
  */
-static int offload_pedit_fields(struct pedit_headers *masks,
-   struct pedit_headers *vals,
+static int offload_pedit_fields(struct pedit_headers_action *hdrs,
struct mlx5e_tc_flow_parse_attr *parse_attr,
struct netlink_ext_ack *extack)
 {
@@ -1840,10 +1844,10 @@ static int offload_pedit_fields(struct pedit_headers 
*masks,
__be16 mask_be16;
void *action;
 
-   set_masks = &masks[TCA_PEDIT_KEY_EX_CMD_SET];
-   add_masks = &masks[TCA_PEDIT_KEY_EX_CMD_ADD];
-   set_vals = &vals[TCA_PEDIT_KEY_EX_CMD_SET];
-   add_vals = &vals[TCA_PEDIT_KEY_EX_CMD_ADD];
+   set_masks = &hdrs[TCA_PEDIT_KEY_EX_CMD_SET].masks;
+   add_masks = &hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].masks;
+   set_vals = &hdrs[TCA_PEDIT_KEY_EX_CMD_SET].vals;
+   add_vals = &hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].vals;
 
action_size = MLX5_UN_SZ_BYTES(set_action_in_add_action_in_auto);
action = parse_attr->mod_hdr_actions;
@@ -1939,12 +1943,14 @@ static int offload_pedit_fields(struct pedit_headers 
*masks,
 }
 
 static int alloc_mod_hdr_actions(struct mlx5e_priv *priv,
-const struct tc_action *a, int namespace,
+struct pedit_headers_action *hdrs,
+int namespace,
 struct mlx5e_tc_flow_parse_attr *parse_attr)
 {
int nkeys, action_size, max_actions;
 
-   nkeys = tcf_pedit_nkeys(a);
+   nkeys = hdrs[TCA_PEDIT_KEY_EX_CMD_SET].pedits +
+   hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].pedits;
action_size = MLX5_UN_SZ_BYTES(set_action_in_add_action_in_auto);
 
if (namespace == MLX5_FLOW_NAMESPACE_FDB) /* FDB offloading */
@@ -1968,18 +1974,15 @@ static const struct pedit_headers zero_masks = {};
 static int parse_tc_pedit_action(struct mlx5e_priv *priv,
 const struct tc_action *a, int namespace,
 struct mlx5e_tc_flow_parse_attr *parse_attr,
+struct pedit_headers_action *hdrs,
 struct netlink_ext_ack *extack)
 {
-   struct pedit_headers masks[__PEDIT_CMD_MAX], vals[__PEDIT_CMD_MAX], 
*cmd_masks;
int nkeys, i, err = -EOPNOTSUPP;
u32 mask, val, offset;
u8 cmd, htype;
 
nkeys = tcf_pedit_nkeys(a);
 
-   memset(masks, 0, sizeof(struct pedit_headers) * __PEDIT_CMD_MAX);
-   memset(vals,  0, sizeof(struct pedit_headers) * __PEDIT_CMD_MAX);
-
for (i = 0; i < nkeys; i++) {
htype = tcf_pedit_htype(a, i);
cmd = tcf_pedit_cmd(a, i);

[PATCH 01/10] flow_dissector: add flow_rule and flow_match structures and use them

2018-11-15 Thread Pablo Neira Ayuso

This patch wraps the dissector key and mask - that flower uses to
represent the matching side - around the flow_match structure.

To avoid a follow up patch that would edit the same LoCs in the drivers,
this patch also wraps this new flow match structure around the flow rule
object. This new structure will also contain the flow actions in follow
up patches.

This introduces two new interfaces:

bool flow_rule_match_key(rule, dissector_id)

which returns true if the given matching key is set, and:

flow_rule_match_XYZ(rule, &match);

which fetches the matching side XYZ into the match container structure,
so that the key and the mask are retrieved with one single call.
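
A simplified userspace model of these two interfaces might look as
follows. The sketch_* types are hypothetical stand-ins for the kernel
structures, reduced to two key types. The point is only to show a bitmap
test for "is this key present" and an offset lookup that returns key and
mask together.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_key_id { SKETCH_KEY_BASIC, SKETCH_KEY_PORTS, SKETCH_KEY_MAX };

struct sketch_key_basic { uint16_t n_proto; uint8_t ip_proto; };
struct sketch_key_ports { uint16_t src, dst; };

struct sketch_dissector {
	unsigned int used_keys;         /* bit N set: key N is present */
	size_t offset[SKETCH_KEY_MAX];  /* byte offset of key N in the container */
};

struct sketch_match {
	const struct sketch_dissector *dissector;
	const void *mask;
	const void *key;
};

struct sketch_rule { struct sketch_match match; };

/* Container handed back to the caller: key and mask side by side. */
struct sketch_match_basic {
	const struct sketch_key_basic *key, *mask;
};

static bool sketch_rule_match_key(const struct sketch_rule *rule,
				  enum sketch_key_id id)
{
	return rule->match.dissector->used_keys & (1u << id);
}

static void sketch_rule_match_basic(const struct sketch_rule *rule,
				    struct sketch_match_basic *out)
{
	size_t off = rule->match.dissector->offset[SKETCH_KEY_BASIC];

	out->key = (const struct sketch_key_basic *)
		   ((const char *)rule->match.key + off);
	out->mask = (const struct sketch_key_basic *)
		    ((const char *)rule->match.mask + off);
}
```

A driver-side caller would first test sketch_rule_match_key() and only
then fetch the container, which mirrors the converted bnxt code in the
diff below.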

Signed-off-by: Pablo Neira Ayuso 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c   | 174 -
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 194 --
 drivers/net/ethernet/intel/i40e/i40e_main.c| 178 -
 drivers/net/ethernet/intel/iavf/iavf_main.c| 195 --
 drivers/net/ethernet/intel/igb/igb_main.c  |  64 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c| 420 +
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  | 202 +-
 drivers/net/ethernet/netronome/nfp/flower/action.c |  11 +-
 drivers/net/ethernet/netronome/nfp/flower/match.c  | 416 ++--
 .../net/ethernet/netronome/nfp/flower/offload.c| 145 +++
 drivers/net/ethernet/qlogic/qede/qede_filter.c |  85 ++---
 include/net/flow_dissector.h   | 107 ++
 include/net/pkt_cls.h  |   4 +-
 net/core/flow_dissector.c  | 133 +++
 net/sched/cls_flower.c |  18 +-
 15 files changed, 1144 insertions(+), 1202 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 749f63beddd8..9b947e03335a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -177,18 +177,12 @@ static int bnxt_tc_parse_actions(struct bnxt *bp,
return 0;
 }
 
-#define GET_KEY(flow_cmd, key_type)\
-   skb_flow_dissector_target((flow_cmd)->dissector, key_type,\
- (flow_cmd)->key)
-#define GET_MASK(flow_cmd, key_type)   \
-   skb_flow_dissector_target((flow_cmd)->dissector, key_type,\
- (flow_cmd)->mask)
-
 static int bnxt_tc_parse_flow(struct bnxt *bp,
  struct tc_cls_flower_offload *tc_flow_cmd,
  struct bnxt_tc_flow *flow)
 {
-   struct flow_dissector *dissector = tc_flow_cmd->dissector;
+   struct flow_rule *rule = &tc_flow_cmd->rule;
+   struct flow_dissector *dissector = rule->match.dissector;
 
/* KEY_CONTROL and KEY_BASIC are needed for forming a meaningful key */
if ((dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_CONTROL)) == 0 ||
@@ -198,140 +192,120 @@ static int bnxt_tc_parse_flow(struct bnxt *bp,
return -EOPNOTSUPP;
}
 
-   if (dissector_uses_key(dissector, FLOW_DISSECTOR_KEY_BASIC)) {
-   struct flow_dissector_key_basic *key =
-   GET_KEY(tc_flow_cmd, FLOW_DISSECTOR_KEY_BASIC);
-   struct flow_dissector_key_basic *mask =
-   GET_MASK(tc_flow_cmd, FLOW_DISSECTOR_KEY_BASIC);
+   if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
+   struct flow_match_basic match;
 
-   flow->l2_key.ether_type = key->n_proto;
-   flow->l2_mask.ether_type = mask->n_proto;
+   flow_rule_match_basic(rule, &match);
+   flow->l2_key.ether_type = match.key->n_proto;
+   flow->l2_mask.ether_type = match.mask->n_proto;
 
-   if (key->n_proto == htons(ETH_P_IP) ||
-   key->n_proto == htons(ETH_P_IPV6)) {
-   flow->l4_key.ip_proto = key->ip_proto;
-   flow->l4_mask.ip_proto = mask->ip_proto;
+   if (match.key->n_proto == htons(ETH_P_IP) ||
+   match.key->n_proto == htons(ETH_P_IPV6)) {
+   flow->l4_key.ip_proto = match.key->ip_proto;
+   flow->l4_mask.ip_proto = match.mask->ip_proto;
}
}
 
-   if (dissector_uses_key(dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
-   struct flow_dissector_key_eth_addrs *key =
-   GET_KEY(tc_flow_cmd, FLOW_DISSECTOR_KEY_ETH_ADDRS);
-   struct flow_dissector_key_eth_addrs *mask =
-   GET_MASK(tc_flow_cmd, FLOW_DISSECTOR_KEY_ETH_ADDRS);
+   if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+   struct flow_match_eth_addrs match;
 
+   flow_rule_match_eth_addrs(rule, &match);
flow->flags

[PATCH 05/10] cls_flower: add statistics retrieval infrastructure and use it

2018-11-15 Thread Pablo Neira Ayuso

This patch provides a tc_cls_flower_stats structure that acts as a
container in tc_cls_flower_offload. Drivers store the statistics there,
and cls_flower then uses them to update the statistics on the existing
TC actions. Hence, tcf_exts_stats_update() is no longer used from drivers.
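
The split between "driver fills a container" and "frontend syncs the
container into its own counters" can be sketched in userspace as below.
The sketch_* names are hypothetical. The argument order of the update
helper follows the tc_cls_flower_stats_update() calls in the diff, but its
body and the sync step are assumptions about how such a helper could
behave, not the kernel implementation.

```c
#include <stdint.h>

struct sketch_stats {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t lastused;
};

/* Stand-in for the offload request that is passed down to the driver. */
struct sketch_offload {
	struct sketch_stats stats;
};

/* Stand-in for the counters the frontend keeps on its own actions. */
struct sketch_action_counters {
	uint64_t pkts, bytes, lastused;
};

/* Driver side: report counters into the container, no TC types involved. */
static void sketch_stats_update(struct sketch_offload *cls, uint64_t bytes,
				uint64_t pkts, uint64_t lastused)
{
	cls->stats.pkts = pkts;
	cls->stats.bytes = bytes;
	cls->stats.lastused = lastused;
}

/* Frontend side: fold what the driver reported into the action counters. */
static void sketch_stats_sync(struct sketch_action_counters *c,
			      const struct sketch_offload *cls)
{
	c->pkts += cls->stats.pkts;
	c->bytes += cls->stats.bytes;
	if (cls->stats.lastused > c->lastused)
		c->lastused = cls->stats.lastused;
}
```

With this split the driver never needs to see the frontend's action
layout, which is what allows the exts pointer to be dropped later in the
series.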

Signed-off-by: Pablo Neira Ayuso 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  |  4 ++--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c  |  6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c   |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c |  2 +-
 drivers/net/ethernet/netronome/nfp/flower/offload.c   |  6 +++---
 include/net/pkt_cls.h | 15 +++
 net/sched/cls_flower.c|  4 
 7 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 9b947e03335a..684fddd98ca0 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1366,8 +1366,8 @@ static int bnxt_tc_get_flow_stats(struct bnxt *bp,
lastused = flow->lastused;
spin_unlock(&flow->stats_lock);
 
-   tcf_exts_stats_update(tc_flow_cmd->exts, stats.bytes, stats.packets,
- lastused);
+   tc_cls_flower_stats_update(tc_flow_cmd, stats.bytes, stats.packets,
+  lastused);
return 0;
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index cff9d854bf51..74fe2ee4636e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -807,9 +807,9 @@ int cxgb4_tc_flower_stats(struct net_device *dev,
if (ofld_stats->packet_count != packets) {
if (ofld_stats->prev_packet_count != packets)
ofld_stats->last_used = jiffies;
-   tcf_exts_stats_update(cls->exts, bytes - ofld_stats->byte_count,
- packets - ofld_stats->packet_count,
- ofld_stats->last_used);
+   tc_cls_flower_stats_update(cls, bytes - ofld_stats->byte_count,
+  packets - ofld_stats->packet_count,
+  ofld_stats->last_used);
 
ofld_stats->packet_count = packets;
ofld_stats->byte_count = bytes;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 1b59982ed450..d477c5c77df9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3224,7 +3224,7 @@ int mlx5e_stats_flower(struct mlx5e_priv *priv,
 
mlx5_fc_query_cached(counter, &bytes, &packets, &lastuse);
 
-   tcf_exts_stats_update(f->exts, bytes, packets, lastuse);
+   tc_cls_flower_stats_update(f, bytes, packets, lastuse);
 
return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index f936ca7bbfa0..bb3dbab1452d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -460,7 +460,7 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
if (err)
goto err_rule_get_stats;
 
-   tcf_exts_stats_update(f->exts, bytes, packets, lastuse);
+   tc_cls_flower_stats_update(f, bytes, packets, lastuse);
 
mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
return 0;
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c 
b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 6c029b4ccca5..26c23a9e36d9 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -532,9 +532,9 @@ nfp_flower_get_stats(struct nfp_app *app, struct net_device 
*netdev,
ctx_id = be32_to_cpu(nfp_flow->meta.host_ctx_id);
 
spin_lock_bh(&priv->stats_lock);
-   tcf_exts_stats_update(flow->exts, priv->stats[ctx_id].bytes,
- priv->stats[ctx_id].pkts,
- priv->stats[ctx_id].used);
+   tc_cls_flower_stats_update(flow, priv->stats[ctx_id].bytes,
+  priv->stats[ctx_id].pkts,
+  priv->stats[ctx_id].used);
 
priv->stats[ctx_id].pkts = 0;
priv->stats[ctx_id].bytes = 0;
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 667549050f50..a3e2285aeefe 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -758,6 +758,12 @@ enum tc_fl_command {
TC_CLSFLOWER_TMPLT_DESTROY,
 };
 
+struct tc_cls_flower_stats {
+   u64 pkts;
+   u64 bytes;
+   u64 lastused;
+};
+
 struct tc_cls_flower_offload {
struct tc_cls_common_offload common;

[PATCH 07/10] cls_flower: don't expose TC actions to drivers anymore

2018-11-15 Thread Pablo Neira Ayuso

Now that drivers have been converted to use the flow action
infrastructure, remove this field from the tc_cls_flower_offload
structure.

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/pkt_cls.h  | 1 -
 net/sched/cls_flower.c | 5 -
 2 files changed, 6 deletions(-)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index a3e2285aeefe..251583096011 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -769,7 +769,6 @@ struct tc_cls_flower_offload {
enum tc_fl_command command;
unsigned long cookie;
struct flow_rule rule;
-   struct tcf_exts *exts;
u32 classid;
struct tc_cls_flower_stats stats;
 };
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 1e26e8a0ae47..49e91d5ee271 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -310,7 +310,6 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
cls_flower.rule.match.dissector = &f->mask->dissector;
cls_flower.rule.match.mask = &f->mask->key;
cls_flower.rule.match.key = &f->mkey;
-   cls_flower.exts = &f->exts;
cls_flower.classid = f->res.classid;
 
if (tc_setup_flow_action(&f->action, &f->exts) < 0)
@@ -346,7 +345,6 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct 
cls_fl_filter *f)
tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, NULL);
cls_flower.command = TC_CLSFLOWER_STATS;
cls_flower.cookie = (unsigned long) f;
-   cls_flower.exts = &f->exts;
cls_flower.classid = f->res.classid;
 
tc_setup_cb_call(block, &f->exts, TC_SETUP_CLSFLOWER,
@@ -1367,7 +1365,6 @@ static int fl_reoffload(struct tcf_proto *tp, bool add, 
tc_setup_cb_t *cb,
cls_flower.rule.match.dissector = >dissector;
cls_flower.rule.match.mask = >key;
cls_flower.rule.match.key = >mkey;
-   cls_flower.exts = >exts;
cls_flower.rule.action.num_keys = f->action.num_keys;
cls_flower.rule.action.keys = f->action.keys;
cls_flower.classid = f->res.classid;
@@ -1392,7 +1389,6 @@ static void fl_hw_create_tmplt(struct tcf_chain *chain,
 {
struct tc_cls_flower_offload cls_flower = {};
struct tcf_block *block = chain->block;
-   struct tcf_exts dummy_exts = { 0, };
 
cls_flower.common.chain_index = chain->index;
cls_flower.command = TC_CLSFLOWER_TMPLT_CREATE;
@@ -1400,7 +1396,6 @@ static void fl_hw_create_tmplt(struct tcf_chain *chain,
cls_flower.rule.match.dissector = >dissector;
cls_flower.rule.match.mask = >mask;
cls_flower.rule.match.key = >dummy_key;
-   cls_flower.exts = _exts;
 
/* We don't care if driver (any of them) fails to handle this
 * call. It serves just as a hint for it.
-- 
2.11.0

[PATCH 00/10] add flow_rule infrastructure

2018-11-15 Thread Pablo Neira Ayuso

This patchset introduces a kernel intermediate representation (IR) to
express ACL hardware offloads. It is heavily based on the existing flow
dissector infrastructure and the TC actions. Different frontend ACL
interfaces, such as ethtool_rxnfc and tc, can use this IR to represent
ACL hardware offloads. The main goal is to simplify the development of
ACL hardware offloads for the existing frontend interfaces: driver
developers do not need to add one specific parser for each ACL frontend.
Instead, each frontend generates this flow_rule IR and passes it to the
drivers, which use it to populate the hardware IR.

                .   ethtool_rxnfc         tc
               |       (ioctl)         (netlink)
               |          |                |     translate native
      Frontend |          |                |  interface representation
               |          |                |      to flow_rule IR
               |          |                |
                .        \/               \/
                .            flow_rule IR
               |                  |
       Drivers |                  | parsing of flow_rule IR
               |                  |  to populate hardware IR
               |                 \/
                .      hardware IR (driver)

For design and implementation details, please have a look at:

https://lwn.net/Articles/766695/

As an example, with this patchset, it should be possible to simplify the
existing net/qede driver which already has two parsers to populate the
hardware IR, one for ethtool_rxnfc interface and another for tc.
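
To make the "two frontends, one driver parser" point concrete, here is a
deliberately tiny userspace sketch. Everything in it is invented for
illustration: the sketch_* rule formats, the single-field IR and the
64-bit "hardware entry" layout bear no relation to qede or to any real
device.

```c
#include <stdint.h>

enum sketch_verdict { SKETCH_DROP, SKETCH_QUEUE };

/* Hypothetical IR: one match field plus one action. */
struct sketch_ir {
	uint16_t dport, dport_mask;
	enum sketch_verdict verdict;
	uint32_t queue;
};

/* Two made-up frontend formats describing the same kind of rule. */
#define SKETCH_RING_DISCARD UINT64_MAX

struct sketch_ethtool_rule { uint16_t pdst, pdst_mask; uint64_t ring; };
struct sketch_tc_rule { uint16_t dst_port; int drop; uint32_t rx_queue; };

static struct sketch_ir sketch_from_ethtool(const struct sketch_ethtool_rule *r)
{
	struct sketch_ir ir = { r->pdst, r->pdst_mask, SKETCH_QUEUE, 0 };

	if (r->ring == SKETCH_RING_DISCARD)
		ir.verdict = SKETCH_DROP;
	else
		ir.queue = (uint32_t)r->ring;
	return ir;
}

static struct sketch_ir sketch_from_tc(const struct sketch_tc_rule *r)
{
	struct sketch_ir ir = { r->dst_port, 0xffff,
				r->drop ? SKETCH_DROP : SKETCH_QUEUE,
				r->drop ? 0 : r->rx_queue };
	return ir;
}

/* The only parser the driver needs: IR to a made-up hardware entry. */
static uint64_t sketch_driver_program(const struct sketch_ir *ir)
{
	uint64_t hw = ((uint64_t)ir->dport << 48) |
		      ((uint64_t)ir->dport_mask << 32);

	if (ir->verdict == SKETCH_DROP)
		hw |= 1ull << 31;
	else
		hw |= ir->queue & 0x7fffffffu;
	return hw;
}
```

Equivalent rules coming from either frontend end up as the same hardware
entry, and the driver carries one parser instead of two.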

This batch is composed of 10 patches:

Patch #1 adds the flow_match structure. This includes the
 flow_rule_match_key() interface to check which selectors
 are in use in the rule, and the flow_rule_match_*()
 functions to fetch the selector value and the mask. This
 also introduces the initial flow_rule structure skeleton to
 avoid a follow up patch that would update the same LoCs.

Patch #2 makes changes to packet edit parser of mlx5e driver, to prepare
 introduction of the new flow_action to mangle packets.

Patch #3 Introduce flow_action infrastructure. This infrastructure is
 based on the TC actions. Patch #8 extends it so it also
 supports two new actions that are only available through the
 ethtool_rxnfc interface.

Patch #4 Add function to translate TC action to flow_action from
 cls_flower.

Patch #5 Add infrastructure to fetch statistics into container structure
 and synchronize them to TC actions from cls_flower. Another
 preparation patch before patch #7, so we can stop exposing the
 TC action native layout to the drivers.

Patch #6 Use flow_action infrastructure from drivers.

Patch #7 Do not expose TC actions to drivers anymore, now that drivers
 have been converted to use the flow_action infrastructure after
 patch #6.

Patch #8 Add support for the wake-up-on-lan and queue actions to the
 flow_action infrastructure, two actions supported by NICs. This
 is used by the ethtool_rx_flow interface.

Patch #9 Add a function to translate from ethtool_rx_flow_spec structure
 to the flow_rule structure. This is simple enough for its
 first client: the ethtool_rxnfc interface in the bcm_sf2 driver.

Patch #10 Update bcm_sf2 to use this new translator function and
  update codebase to configure hardware IR using the
  flow_action representation. This will allow later development
  of cls_flower using the same codebase from the driver.

This patchset has passed functional tests here of the codepath that
generates the flow_rule structure and of the functions that implement
the parsers that populate the hardware IR.

Thanks.

Pablo Neira Ayuso (10):
  flow_dissector: add flow_rule and flow_match structures and use them
  net/mlx5e: support for two independent packet edit actions
  flow_dissector: add flow action infrastructure
  cls_api: add translator to flow_action representation
  cls_flower: add statistics retrieval infrastructure and use it
  drivers: net: use flow action infrastructure
  cls_flower: don't expose TC actions to drivers anymore
  flow_dissector: add wake-up-on-lan and queue to flow_action
  flow_dissector: add basic ethtool_rx_flow_spec to flow_rule structure translator
  dsa: bcm_sf2: use flow_rule infrastructure

 drivers/net/dsa/bcm_sf2_cfp.c  | 103 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c   | 252 +++
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 450 ++---
 drivers/net/ethernet/intel/i40e/i40e_main.c| 178 ++---
 drivers/net/ethernet/intel/iavf/iavf_main.c| 195 +++---
 drivers/net/ethernet/intel/igb/igb_main.c  |  64 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c| 743 ++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |   2 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |

[PATCH 08/10] flow_dissector: add wake-up-on-lan and queue to flow_action

2018-11-15 Thread Pablo Neira Ayuso

These actions need to be added to support bcm sf2 features available
through the ethtool_rx_flow interface.

Reviewed-by: Florian Fainelli 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/flow_dissector.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 925c208816f1..7a4683646d5a 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -418,6 +418,8 @@ enum flow_action_key_id {
FLOW_ACTION_KEY_ADD,
FLOW_ACTION_KEY_CSUM,
FLOW_ACTION_KEY_MARK,
+   FLOW_ACTION_KEY_WAKE,
+   FLOW_ACTION_KEY_QUEUE,
 };
 
 /* This is mirroring enum pedit_header_type definition for easy mapping between
@@ -452,6 +454,7 @@ struct flow_action_key {
const struct ip_tunnel_info *tunnel; /* FLOW_ACTION_KEY_TUNNEL_ENCAP */
u32 csum_flags; /* FLOW_ACTION_KEY_CSUM */
u32 mark;   /* FLOW_ACTION_KEY_MARK */
+   u32 queue_index; /* FLOW_ACTION_KEY_QUEUE */
};
 };
 
-- 
2.11.0

[PATCH 06/10] drivers: net: use flow action infrastructure

2018-11-15 Thread Pablo Neira Ayuso

This patch updates drivers to use the new flow action infrastructure.

Signed-off-by: Pablo Neira Ayuso 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c   |  74 +++---
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 250 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c| 266 ++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |   2 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  |  55 +++--
 drivers/net/ethernet/netronome/nfp/flower/action.c | 185 +++---
 drivers/net/ethernet/qlogic/qede/qede_filter.c |  12 +-
 7 files changed, 417 insertions(+), 427 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 684fddd98ca0..15dc45b7dd13 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -61,9 +61,9 @@ static u16 bnxt_flow_get_dst_fid(struct bnxt *pf_bp, struct 
net_device *dev)
 
 static int bnxt_tc_parse_redir(struct bnxt *bp,
   struct bnxt_tc_actions *actions,
-  const struct tc_action *tc_act)
+  const struct flow_action_key *act)
 {
-   struct net_device *dev = tcf_mirred_dev(tc_act);
+   struct net_device *dev = act->dev;
 
if (!dev) {
netdev_info(bp->dev, "no dev in mirred action");
@@ -77,16 +77,16 @@ static int bnxt_tc_parse_redir(struct bnxt *bp,
 
 static int bnxt_tc_parse_vlan(struct bnxt *bp,
  struct bnxt_tc_actions *actions,
- const struct tc_action *tc_act)
+ const struct flow_action_key *act)
 {
-   switch (tcf_vlan_action(tc_act)) {
-   case TCA_VLAN_ACT_POP:
+   switch (act->id) {
+   case FLOW_ACTION_KEY_VLAN_POP:
actions->flags |= BNXT_TC_ACTION_FLAG_POP_VLAN;
break;
-   case TCA_VLAN_ACT_PUSH:
+   case FLOW_ACTION_KEY_VLAN_PUSH:
actions->flags |= BNXT_TC_ACTION_FLAG_PUSH_VLAN;
-   actions->push_vlan_tci = htons(tcf_vlan_push_vid(tc_act));
-   actions->push_vlan_tpid = tcf_vlan_push_proto(tc_act);
+   actions->push_vlan_tci = htons(act->vlan.vid);
+   actions->push_vlan_tpid = act->vlan.proto;
break;
default:
return -EOPNOTSUPP;
@@ -96,10 +96,10 @@ static int bnxt_tc_parse_vlan(struct bnxt *bp,
 
 static int bnxt_tc_parse_tunnel_set(struct bnxt *bp,
struct bnxt_tc_actions *actions,
-   const struct tc_action *tc_act)
+   const struct flow_action_key *act)
 {
-   struct ip_tunnel_info *tun_info = tcf_tunnel_info(tc_act);
-   struct ip_tunnel_key *tun_key = &tun_info->key;
+   const struct ip_tunnel_info *tun_info = act->tunnel;
+   const struct ip_tunnel_key *tun_key = &tun_info->key;
 
if (ip_tunnel_info_af(tun_info) != AF_INET) {
netdev_info(bp->dev, "only IPv4 tunnel-encap is supported");
@@ -113,51 +113,43 @@ static int bnxt_tc_parse_tunnel_set(struct bnxt *bp,
 
 static int bnxt_tc_parse_actions(struct bnxt *bp,
 struct bnxt_tc_actions *actions,
-struct tcf_exts *tc_exts)
+struct flow_action *flow_action)
 {
-   const struct tc_action *tc_act;
+   struct flow_action_key *act;
int i, rc;
 
-   if (!tcf_exts_has_actions(tc_exts)) {
+   if (!flow_action_has_keys(flow_action)) {
netdev_info(bp->dev, "no actions");
return -EINVAL;
}
 
-   tcf_exts_for_each_action(i, tc_act, tc_exts) {
-   /* Drop action */
-   if (is_tcf_gact_shot(tc_act)) {
+   flow_action_for_each(i, act, flow_action) {
+   switch (act->id) {
+   case FLOW_ACTION_KEY_DROP:
actions->flags |= BNXT_TC_ACTION_FLAG_DROP;
return 0; /* don't bother with other actions */
-   }
-
-   /* Redirect action */
-   if (is_tcf_mirred_egress_redirect(tc_act)) {
-   rc = bnxt_tc_parse_redir(bp, actions, tc_act);
+   case FLOW_ACTION_KEY_REDIRECT:
+   rc = bnxt_tc_parse_redir(bp, actions, act);
if (rc)
return rc;
-   continue;
-   }
-
-   /* Push/pop VLAN */
-   if (is_tcf_vlan(tc_act)) {
-   rc = bnxt_tc_parse_vlan(bp, actions, tc_act);
+   break;
+   case FLOW_ACTION_KEY_VLAN_POP:
+   case FLOW_ACTION_KEY_VLAN_PUSH:
+   case FLOW_ACTION_KEY_VLAN_MANGLE:
+   rc = bnxt_tc_parse_vlan(bp, actions, act);

[PATCH 03/10] flow_dissector: add flow action infrastructure

2018-11-15 Thread Pablo Neira Ayuso

This new infrastructure defines the NIC actions that can be performed
from existing network drivers. It allows us to avoid a direct
dependency on the native software TC action representation.
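
For readers who want to see the shape of such an action list outside the
kernel, the following userspace sketch models an array of tagged keys
with init, free and iteration helpers. The sketch_* names are
hypothetical and the action set is reduced to three entries. The
iteration macro is written so that it never forms a pointer past the end
of the array, which is a choice made for this example rather than a
description of the macro in the patch.

```c
#include <stdlib.h>

enum sketch_action_id { SKETCH_ACT_ACCEPT, SKETCH_ACT_DROP, SKETCH_ACT_REDIRECT };

struct sketch_action_key {
	enum sketch_action_id id;
	union {                 /* payload, selected by id */
		int ifindex;    /* SKETCH_ACT_REDIRECT */
		unsigned int mark;
	};
};

struct sketch_action {
	int num_keys;
	struct sketch_action_key *keys;
};

static int sketch_action_init(struct sketch_action *a, int num)
{
	a->num_keys = 0;
	a->keys = num > 0 ? calloc(num, sizeof(*a->keys)) : NULL;
	if (!a->keys)
		return -1;
	a->num_keys = num;
	return 0;
}

static void sketch_action_free(struct sketch_action *a)
{
	free(a->keys);
	a->keys = NULL;
	a->num_keys = 0;
}

/* Assigns act only while i is in range, then evaluates to true. */
#define sketch_action_for_each(i, act, a)			\
	for ((i) = 0; (i) < (a)->num_keys &&			\
	     ((act) = &(a)->keys[(i)], 1); (i)++)

/* Example consumer: a driver-style walk over the keys. */
static int sketch_count_drops(const struct sketch_action *a)
{
	const struct sketch_action_key *act;
	int i, drops = 0;

	sketch_action_for_each(i, act, a) {
		if (act->id == SKETCH_ACT_DROP)
			drops++;
	}
	return drops;
}
```

A consumer switches on id and reads only the union member that belongs
to that id, as the converted drivers later in the series do.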

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/flow_dissector.h | 70 
 net/core/flow_dissector.c| 18 
 2 files changed, 88 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 965a82b8d881..925c208816f1 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -402,8 +402,78 @@ void flow_rule_match_enc_keyid(const struct flow_rule 
*rule,
 void flow_rule_match_enc_opts(const struct flow_rule *rule,
  struct flow_match_enc_opts *out);
 
+enum flow_action_key_id {
+   FLOW_ACTION_KEY_ACCEPT  = 0,
+   FLOW_ACTION_KEY_DROP,
+   FLOW_ACTION_KEY_TRAP,
+   FLOW_ACTION_KEY_GOTO,
+   FLOW_ACTION_KEY_REDIRECT,
+   FLOW_ACTION_KEY_MIRRED,
+   FLOW_ACTION_KEY_VLAN_PUSH,
+   FLOW_ACTION_KEY_VLAN_POP,
+   FLOW_ACTION_KEY_VLAN_MANGLE,
+   FLOW_ACTION_KEY_TUNNEL_ENCAP,
+   FLOW_ACTION_KEY_TUNNEL_DECAP,
+   FLOW_ACTION_KEY_MANGLE,
+   FLOW_ACTION_KEY_ADD,
+   FLOW_ACTION_KEY_CSUM,
+   FLOW_ACTION_KEY_MARK,
+};
+
+/* This is mirroring enum pedit_header_type definition for easy mapping between
+ * tc pedit action. Legacy TCA_PEDIT_KEY_EX_HDR_TYPE_NETWORK is mapped to
+ * FLOW_ACT_MANGLE_UNSPEC, which is supported by no driver.
+ */
+enum flow_act_mangle_base {
+   FLOW_ACT_MANGLE_UNSPEC  = 0,
+   FLOW_ACT_MANGLE_HDR_TYPE_ETH,
+   FLOW_ACT_MANGLE_HDR_TYPE_IP4,
+   FLOW_ACT_MANGLE_HDR_TYPE_IP6,
+   FLOW_ACT_MANGLE_HDR_TYPE_TCP,
+   FLOW_ACT_MANGLE_HDR_TYPE_UDP,
+};
+
+struct flow_action_key {
+   enum flow_action_key_id id;
+   union {
+   u32 chain_index;/* FLOW_ACTION_KEY_GOTO */
+   struct net_device   *dev;   /* FLOW_ACTION_KEY_REDIRECT */
+   struct {/* FLOW_ACTION_KEY_VLAN */
+   u16 vid;
+   __be16  proto;
+   u8  prio;
+   } vlan;
+   struct {/* FLOW_ACTION_KEY_PACKET_EDIT */
+   enum flow_act_mangle_base htype;
+   u32 offset;
+   u32 mask;
+   u32 val;
+   } mangle;
+   const struct ip_tunnel_info *tunnel;/* FLOW_ACTION_KEY_TUNNEL_ENCAP */
+   u32 csum_flags; /* FLOW_ACTION_KEY_CSUM */
+   u32 mark;   /* FLOW_ACTION_KEY_MARK */
+   };
+};
+
+struct flow_action {
+   int num_keys;
+   struct flow_action_key  *keys;
+};
+
+int flow_action_init(struct flow_action *flow_action, int num_acts);
+void flow_action_free(struct flow_action *flow_action);
+
+static inline bool flow_action_has_keys(const struct flow_action *action)
+{
+   return action->num_keys;
+}
+
+#define flow_action_for_each(__i, __act, __actions)\
+for (__i = 0, __act = &(__actions)->keys[0]; __i < (__actions)->num_keys; __act = &(__actions)->keys[++__i])
+
 struct flow_rule {
struct flow_match   match;
+   struct flow_action  action;
 };
 
 static inline bool flow_rule_match_key(const struct flow_rule *rule,
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 186089b8d852..b9368349f0f7 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -258,6 +258,24 @@ void flow_rule_match_enc_opts(const struct flow_rule *rule,
 }
 EXPORT_SYMBOL(flow_rule_match_enc_opts);
 
+int flow_action_init(struct flow_action *flow_action, int num_acts)
+{
+   flow_action->keys = kmalloc(sizeof(struct flow_action_key) * num_acts,
+   GFP_KERNEL);
+   if (!flow_action->keys)
+   return -ENOMEM;
+
+   flow_action->num_keys = num_acts;
+   return 0;
+}
+EXPORT_SYMBOL(flow_action_init);
+
+void flow_action_free(struct flow_action *flow_action)
+{
+   kfree(flow_action->keys);
+}
+EXPORT_SYMBOL(flow_action_free);
+
 /**
  * __skb_flow_get_ports - extract the upper layer ports and return them
  * @skb: sk_buff to extract the ports from
-- 
2.11.0
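
For readers who want to see how a driver might consume the new structures, here is a minimal userspace sketch. It is not the kernel code: the types are trimmed copies of the ones in the patch, calloc() stands in for kmalloc(), the `ifindex` member replaces the `struct net_device *` pointer, and `demo_parse_actions()` is a hypothetical driver-side helper invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Trimmed userspace stand-ins for the kernel types in the patch. */
enum flow_action_key_id {
	FLOW_ACTION_KEY_ACCEPT = 0,
	FLOW_ACTION_KEY_DROP,
	FLOW_ACTION_KEY_REDIRECT,
	FLOW_ACTION_KEY_VLAN_PUSH,
};

struct flow_action_key {
	enum flow_action_key_id id;
	union {
		int ifindex;	/* assumption: replaces struct net_device *dev */
		struct {
			unsigned short vid;
			unsigned char prio;
		} vlan;
	};
};

struct flow_action {
	int num_keys;
	struct flow_action_key *keys;
};

/* Same iteration shape as the macro in the patch. */
#define flow_action_for_each(__i, __act, __actions) \
	for (__i = 0, __act = &(__actions)->keys[0]; \
	     __i < (__actions)->num_keys; \
	     __act = &(__actions)->keys[++__i])

/* calloc() is used so that unset keys read as FLOW_ACTION_KEY_ACCEPT (0). */
static int flow_action_init(struct flow_action *fa, int num_acts)
{
	assert(fa != NULL && num_acts > 0);
	fa->keys = calloc((size_t)num_acts, sizeof(*fa->keys));
	if (!fa->keys)
		return -1;
	fa->num_keys = num_acts;
	return 0;
}

static void flow_action_free(struct flow_action *fa)
{
	free(fa->keys);
	fa->keys = NULL;
	fa->num_keys = 0;
}

/*
 * Hypothetical driver-side parser: returns true if the rule drops the
 * packet; otherwise stores the last redirect target in *out_ifindex
 * (-1 if there is none).
 */
static bool demo_parse_actions(const struct flow_action *fa, int *out_ifindex)
{
	const struct flow_action_key *act;
	int i;

	*out_ifindex = -1;
	flow_action_for_each(i, act, fa) {
		switch (act->id) {
		case FLOW_ACTION_KEY_DROP:
			return true;	/* drop wins, ignore the rest */
		case FLOW_ACTION_KEY_REDIRECT:
			*out_ifindex = act->ifindex;
			break;
		default:
			break;
		}
	}
	return false;
}
```

The switch on `act->id` follows the same pattern the bnxt conversion in this series uses, which is the point of the infrastructure: the driver never touches `struct tc_action`.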

[PATCH 10/10] dsa: bcm_sf2: use flow_rule infrastructure

2018-11-15 Thread Pablo Neira Ayuso

Update this driver to use the flow_rule infrastructure, hence we can use
the same code to populate hardware IR from ethtool_rx_flow and the
cls_flower interfaces.

Signed-off-by: Pablo Neira Ayuso 
---
 drivers/net/dsa/bcm_sf2_cfp.c | 103 --
 1 file changed, 70 insertions(+), 33 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c
index e14663ab6dbc..26e1b41f424e 100644
--- a/drivers/net/dsa/bcm_sf2_cfp.c
+++ b/drivers/net/dsa/bcm_sf2_cfp.c
@@ -257,7 +257,8 @@ static int bcm_sf2_cfp_act_pol_set(struct bcm_sf2_priv 
*priv,
 }
 
 static void bcm_sf2_cfp_slice_ipv4(struct bcm_sf2_priv *priv,
-  struct ethtool_tcpip4_spec *v4_spec,
+  struct flow_dissector_key_ipv4_addrs *addrs,
+  struct flow_dissector_key_ports *ports,
   unsigned int slice_num,
   bool mask)
 {
@@ -278,7 +279,7 @@ static void bcm_sf2_cfp_slice_ipv4(struct bcm_sf2_priv 
*priv,
 * UDF_n_A6 [23:8]
 * UDF_n_A5 [7:0]
 */
-   reg = be16_to_cpu(v4_spec->pdst) >> 8;
+   reg = be16_to_cpu(ports->dst) >> 8;
if (mask)
offset = CORE_CFP_MASK_PORT(3);
else
@@ -289,9 +290,9 @@ static void bcm_sf2_cfp_slice_ipv4(struct bcm_sf2_priv 
*priv,
 * UDF_n_A4 [23:8]
 * UDF_n_A3 [7:0]
 */
-   reg = (be16_to_cpu(v4_spec->pdst) & 0xff) << 24 |
- (u32)be16_to_cpu(v4_spec->psrc) << 8 |
- (be32_to_cpu(v4_spec->ip4dst) & 0xff00) >> 8;
+   reg = (be16_to_cpu(ports->dst) & 0xff) << 24 |
+ (u32)be16_to_cpu(ports->src) << 8 |
+ (be32_to_cpu(addrs->dst) & 0xff00) >> 8;
if (mask)
offset = CORE_CFP_MASK_PORT(2);
else
@@ -302,9 +303,9 @@ static void bcm_sf2_cfp_slice_ipv4(struct bcm_sf2_priv 
*priv,
 * UDF_n_A2 [23:8]
 * UDF_n_A1 [7:0]
 */
-   reg = (u32)(be32_to_cpu(v4_spec->ip4dst) & 0xff) << 24 |
- (u32)(be32_to_cpu(v4_spec->ip4dst) >> 16) << 8 |
- (be32_to_cpu(v4_spec->ip4src) & 0xff00) >> 8;
+   reg = (u32)(be32_to_cpu(addrs->dst) & 0xff) << 24 |
+ (u32)(be32_to_cpu(addrs->dst) >> 16) << 8 |
+ (be32_to_cpu(addrs->src) & 0xff00) >> 8;
if (mask)
offset = CORE_CFP_MASK_PORT(1);
else
@@ -317,8 +318,8 @@ static void bcm_sf2_cfp_slice_ipv4(struct bcm_sf2_priv 
*priv,
 * Slice ID [3:2]
 * Slice valid  [1:0]
 */
-   reg = (u32)(be32_to_cpu(v4_spec->ip4src) & 0xff) << 24 |
- (u32)(be32_to_cpu(v4_spec->ip4src) >> 16) << 8 |
+   reg = (u32)(be32_to_cpu(addrs->src) & 0xff) << 24 |
+ (u32)(be32_to_cpu(addrs->src) >> 16) << 8 |
  SLICE_NUM(slice_num) | SLICE_VALID;
if (mask)
offset = CORE_CFP_MASK_PORT(0);
@@ -335,6 +336,11 @@ static int bcm_sf2_cfp_ipv4_rule_set(struct bcm_sf2_priv 
*priv, int port,
struct ethtool_tcpip4_spec *v4_spec, *v4_m_spec;
const struct cfp_udf_layout *layout;
unsigned int slice_num, rule_index;
+   struct flow_match_ipv4_addrs ipv4;
+   struct flow_match_ports ports;
+   struct flow_match_basic basic;
+   struct flow_rule *flow_rule;
+   struct flow_match_ip ip;
u8 ip_proto, ip_frag;
u8 num_udf;
u32 reg;
@@ -367,11 +373,22 @@ static int bcm_sf2_cfp_ipv4_rule_set(struct bcm_sf2_priv 
*priv, int port,
if (rule_index > bcm_sf2_cfp_rule_size(priv))
return -ENOSPC;
 
+   flow_rule = ethtool_rx_flow_rule(fs);
+   if (!flow_rule)
+   return -ENOMEM;
+
+   flow_rule_match_ipv4_addrs(flow_rule, &ipv4);
+   flow_rule_match_ports(flow_rule, &ports);
+   flow_rule_match_basic(flow_rule, &basic);
+   flow_rule_match_ip(flow_rule, &ip);
+
layout = &udf_tcpip4_layout;
/* We only use one UDF slice for now */
slice_num = bcm_sf2_get_slice_number(layout, 0);
-   if (slice_num == UDF_NUM_SLICES)
-   return -EINVAL;
+   if (slice_num == UDF_NUM_SLICES) {
+   ret = -EINVAL;
+   goto out_err_flow_rule;
+   }
 
num_udf = bcm_sf2_get_num_udf_slices(layout->udfs[slice_num].slices);
 
@@ -398,9 +415,10 @@ static int bcm_sf2_cfp_ipv4_rule_set(struct bcm_sf2_priv 
*priv, int port,
 * Reserved [1]
 * UDF_Valid[8] [0]
 */
-   core_writel(priv, v4_spec->tos << IPTOS_SHIFT |
-   ip_proto << IPPROTO_SHIFT | ip_frag << IP_FRAG_SHIFT |
-   udf_upper_bits(num_udf),
+   core_writel(priv, ip.key->tos << IPTOS_SHIFT |
+ basic.key->n_proto << IPPROTO_SHIFT |
+ ip_frag <<

[iproute2-next PATCH v3 2/2] man: tc-flower: Add explanation for range option

2018-11-15 Thread Amritha Nambiar

Add details explaining filtering based on port ranges.

Signed-off-by: Amritha Nambiar 
---
 man/man8/tc-flower.8 |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/man/man8/tc-flower.8 b/man/man8/tc-flower.8
index 8be8882..768bfa1 100644
--- a/man/man8/tc-flower.8
+++ b/man/man8/tc-flower.8
@@ -56,8 +56,10 @@ flower \- flow based traffic control filter
 .IR MASKED_IP_TTL " | { "
 .BR dst_ip " | " src_ip " } "
 .IR PREFIX " | { "
-.BR dst_port " | " src_port " } "
-.IR port_number " } | "
+.BR dst_port " | " src_port " } { "
+.IR port_number " | "
+.B range
+.IR min_port_number-max_port_number " } | "
 .B tcp_flags
 .IR MASKED_TCP_FLAGS " | "
 .B type
@@ -227,6 +229,12 @@ Match on layer 4 protocol source or destination port 
number. Only available for
 .BR ip_proto " values " udp ", " tcp  " and " sctp
 which have to be specified in beforehand.
 .TP
+.BI range " MIN_VALUE-MAX_VALUE"
+Match on a range of layer 4 protocol source or destination port number. Only
+available for
+.BR ip_proto " values " udp ", " tcp  " and " sctp
+which have to be specified in beforehand.
+.TP
 .BI tcp_flags " MASKED_TCP_FLAGS"
 Match on TCP flags represented as 12bit bitfield in in hexadecimal format.
 A mask may be optionally provided to limit the bits which are matched. A mask

[iproute2-next PATCH v3 1/2] tc: flower: Classify packets based port ranges

2018-11-15 Thread Amritha Nambiar

Added support for filtering based on port ranges.
UAPI changes have been accepted into net-next.

Example:
1. Match on a port range:
-
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
  action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  dst_port range 20-30
  skip_hw
  not_in_hw
action order 1: gact action drop
 random type none pass val 0
 index 1 ref 1 bind 1 installed 85 sec used 3 sec
Action statistics:
Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

2. Match on IP address and port range:
--
$ tc filter add dev enp4s0 protocol ip parent ffff:\
  prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
  skip_hw action drop

$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto tcp
  dst_ip 192.168.1.1
  dst_port range 100-200
  skip_hw
  not_in_hw
action order 1: gact action drop
 random type none pass val 0
 index 2 ref 1 bind 1 installed 58 sec used 2 sec
Action statistics:
Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
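
The "MIN-MAX" syntax used in the examples above can be sketched as a small standalone parser. This is an illustration only, not the code in tc/f_flower.c: it uses strtoul() instead of the iproute2 get_be16() helper, keeps the ports in host byte order rather than big-endian, and the names `parse_port()` and `parse_port_range()` are made up for this sketch. As in the patch, a range is accepted only when max is strictly greater than min.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse one decimal port; *end is set to the first unparsed character. */
static int parse_port(const char *s, char **end, uint16_t *out)
{
	unsigned long v;

	if (*s < '0' || *s > '9')	/* reject empty input, signs, spaces */
		return -1;
	errno = 0;
	v = strtoul(s, end, 10);
	if (errno || v > 65535)
		return -1;
	*out = (uint16_t)v;
	return 0;
}

/* Parse "MIN-MAX"; returns 0 on success, -1 on any format or order error. */
static int parse_port_range(const char *str, uint16_t *min, uint16_t *max)
{
	char *end;

	assert(str && min && max);
	if (parse_port(str, &end, min) || *end != '-')
		return -1;
	if (parse_port(end + 1, &end, max) || *end != '\0')
		return -1;
	if (*max <= *min)	/* max must be strictly greater than min */
		return -1;
	return 0;
}
```

Unlike get_range() in the patch, this version does not modify the input string, which makes it usable on string literals.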

v3:
Modified flower_port_range_attr_type calls.

v2:
Addressed Jiri's comment to sync output format with input

Signed-off-by: Amritha Nambiar 
---
 include/uapi/linux/pkt_cls.h |7 ++
 tc/f_flower.c|  143 +++---
 2 files changed, 140 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index 401d0c1..95d0db2 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -485,6 +485,11 @@ enum {
 
TCA_FLOWER_IN_HW_COUNT,
 
+   TCA_FLOWER_KEY_PORT_SRC_MIN,/* be16 */
+   TCA_FLOWER_KEY_PORT_SRC_MAX,/* be16 */
+   TCA_FLOWER_KEY_PORT_DST_MIN,/* be16 */
+   TCA_FLOWER_KEY_PORT_DST_MAX,/* be16 */
+
__TCA_FLOWER_MAX,
 };
 
@@ -518,6 +523,8 @@ enum {
TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1),
 };
 
+#define TCA_FLOWER_MASK_FLAGS_RANGE(1 << 0) /* Range-based match */
+
 /* Match-all classifier */
 
 enum {
diff --git a/tc/f_flower.c b/tc/f_flower.c
index 65fca04..9bddf7b 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -494,6 +494,68 @@ static int flower_parse_port(char *str, __u8 ip_proto,
return 0;
 }
 
+static int flower_port_range_attr_type(__u8 ip_proto, enum flower_endpoint type,
+  __be16 *min_port_type,
+  __be16 *max_port_type)
+{
+   if (ip_proto == IPPROTO_TCP || ip_proto == IPPROTO_UDP ||
+   ip_proto == IPPROTO_SCTP) {
+   if (type == FLOWER_ENDPOINT_SRC) {
+   *min_port_type = TCA_FLOWER_KEY_PORT_SRC_MIN;
+   *max_port_type = TCA_FLOWER_KEY_PORT_SRC_MAX;
+   } else {
+   *min_port_type = TCA_FLOWER_KEY_PORT_DST_MIN;
+   *max_port_type = TCA_FLOWER_KEY_PORT_DST_MAX;
+   }
+   } else {
+   return -1;
+   }
+
+   return 0;
+}
+
+static int flower_parse_port_range(__be16 *min, __be16 *max, __u8 ip_proto,
+  enum flower_endpoint endpoint,
+  struct nlmsghdr *n)
+{
+   __be16 min_port_type, max_port_type;
+
+   if (flower_port_range_attr_type(ip_proto, endpoint, &min_port_type,
+   &max_port_type))
+   return -1;
+
+   addattr16(n, MAX_MSG, min_port_type, *min);
+   addattr16(n, MAX_MSG, max_port_type, *max);
+
+   return 0;
+}
+
+static int get_range(__be16 *min, __be16 *max, char *argv)
+{
+   char *r;
+
+   r = strchr(argv, '-');
+   if (r) {
+   *r = '\0';
+   if (get_be16(min, argv, 10)) {
+   fprintf(stderr, "invalid min range\n");
+   return -1;
+   }
+   if (get_be16(max, r + 1, 10)) {
+   fprintf(stderr, "invalid max range\n");
+   return -1;
+   }
+   if (htons(*max) <= htons(*min)) {
+   fprintf(stderr, "max value should be greater than min value\n");
+   return -1;
+   }
+   } else {
+   fprintf(stderr, "Illegal range format\n");
+   return -1;
+   }
+   return 0;
+}
+
 #define TCP_FLAGS_MAX_MASK 0xfff
 
 static int flower_parse_tcp_flags(char *str, int flags_type, int mask_type,
@@ -1061,20 +1123,54 @@ static int flower_parse_opt(struct filter_util *qu, 
char *handle,

[PATCH net-next] tcp: add SRTT to SCM_TIMESTAMPING_OPT_STATS

2018-11-15 Thread Yousuk Seung

Add TCP_NLA_SRTT to SCM_TIMESTAMPING_OPT_STATS that reports the smoothed
round trip time in microseconds (tcp_sock.srtt_us >> 3).

Signed-off-by: Yousuk Seung 
Signed-off-by: Eric Dumazet 
Acked-by: Soheil Hassas Yeganeh 
Acked-by: Neal Cardwell 
Acked-by: Yuchung Cheng 
---
 include/uapi/linux/tcp.h | 1 +
 net/ipv4/tcp.c   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index e02d31986ff91..8bb6cc5f32356 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -266,6 +266,7 @@ enum {
TCP_NLA_BYTES_RETRANS,  /* Data bytes retransmitted */
TCP_NLA_DSACK_DUPS, /* DSACK blocks received */
TCP_NLA_REORD_SEEN, /* reordering events seen */
+   TCP_NLA_SRTT,   /* smoothed RTT in usecs */
 };
 
 /* for TCP_MD5SIG socket option */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9e6bc4d6daa75..0363a0ebee57d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3246,6 +3246,7 @@ static size_t tcp_opt_stats_get_size(void)
nla_total_size_64bit(sizeof(u64)) + /* TCP_NLA_BYTES_RETRANS */
nla_total_size(sizeof(u32)) + /* TCP_NLA_DSACK_DUPS */
nla_total_size(sizeof(u32)) + /* TCP_NLA_REORD_SEEN */
+   nla_total_size(sizeof(u32)) + /* TCP_NLA_SRTT */
0;
 }
 
@@ -3299,6 +3300,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const 
struct sock *sk)
  TCP_NLA_PAD);
nla_put_u32(stats, TCP_NLA_DSACK_DUPS, tp->dsack_dups);
nla_put_u32(stats, TCP_NLA_REORD_SEEN, tp->reord_seen);
+   nla_put_u32(stats, TCP_NLA_SRTT, tp->srtt_us >> 3);
 
return stats;
 }
-- 
2.19.1.1215.g8438c0b245-goog
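
The ">> 3" in the patch reflects that srtt_us is kept as eight times the smoothed RTT. A minimal sketch of that fixed-point convention follows; it is a simplified Jacobson-style estimator written for illustration, not the kernel's RTT estimator, and the function names are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* srtt_x8 holds 8 * SRTT in microseconds, the scaling the patch undoes. */
static uint32_t srtt_x8_update(uint32_t srtt_x8, uint32_t sample_us)
{
	int64_t err;

	assert(sample_us <= (UINT32_MAX >> 3));	/* keep the shift from overflowing */
	if (srtt_x8 == 0)			/* first sample seeds the estimate */
		return sample_us << 3;
	err = (int64_t)sample_us - (int64_t)(srtt_x8 >> 3);
	/* In unscaled terms: srtt += (sample - srtt) / 8 */
	return (uint32_t)((int64_t)srtt_x8 + err);
}

/* What a TCP_NLA_SRTT-style report would carry: the unscaled value. */
static uint32_t srtt_report_us(uint32_t srtt_x8)
{
	return srtt_x8 >> 3;
}
```

Storing the value scaled by 8 lets the 1/8 gain be applied with integer arithmetic and no loss of the low-order bits between updates.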

Re: [PATCH bpf-next v2] bpftool: make libbfd optional

2018-11-15 Thread Stanislav Fomichev

On 11/13, Quentin Monnet wrote:
> 2018-11-12 14:02 UTC-0800 ~ Jakub Kicinski 
> > On Mon, 12 Nov 2018 13:44:10 -0800, Stanislav Fomichev wrote:
> >> Make it possible to build bpftool without libbfd. libbfd and libopcodes are
> >> typically provided in dev/dbg packages (binutils-dev in debian) which we
> >> usually don't have installed on the fleet machines and we'd like a way to 
> >> have
> >> bpftool version that works without installing any additional packages.
> >> This excludes support for disassembling jit-ted code and prints an error if
> >> the user tries to use these features.
> >>
> >> Tested by:
> >> cat > FEATURES_DUMP.bpftool <<EOF
> >> feature-libbfd=0
> >> feature-disassembler-four-args=1
> >> feature-reallocarray=0
> >> feature-libelf=1
> >> feature-libelf-mmap=1
> >> feature-bpf=1
> >> EOF
> >> FEATURES_DUMP=$PWD/FEATURES_DUMP.bpftool make
> >> ldd bpftool | grep libbfd
> >>
> >> Signed-off-by: Stanislav Fomichev 
> > 
> > Seems reasonable, thanks!
> > 
> > Acked-by: Jakub Kicinski 
> > 
> 
> Thanks Stanislav!
> 
> There is a problem with this patch on some distributions, Ubuntu at least.
> 
> Feature detection for libbfd has been used for perf before being also
> used with bpftool. Since commit 280e7c48c3b8 the feature needs libz and
> libiberty to be present on the system, otherwise the feature would not
> compile (and be detected) on OpenSuse.
> 
> On Ubuntu, libiberty is not needed (libbfd might be statically linked
> against it, if I remember correctly?), which means that we are able to
> build bpftool as long as binutils-dev has been installed, even if
> libiberty-dev has not been installed. The BFD feature, in that case,
> will appear as “undetected”. It is a bug. But since the Makefile does
> not stop compilation in that case (another bug), in the end we're good.
> 
> With your patch, the problem is that libbfd detection will fail on
> Ubuntu if libiberty-dev is not present, even though all the necessary
> libraries for using the JIT disassembler are available. And in that case
> it _will_ make a difference, since the Makefile will no more compile the
> libbfd-related bits.
> 
> So I'm not against the idea, but we have to fix libbfd detection first.
Sent out https://lkml.org/lkml/2018/11/16/243, let's see how it goes :-)

> Thanks,
> Quentin
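
As a rough illustration of what "optional libbfd" means at the source level, the sketch below compiles a stub when a build-time macro is absent. The macro name HAVE_LIBBFD_SUPPORT, the function names, and the error text are all assumptions made for this example; they are not taken from the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * HAVE_LIBBFD_SUPPORT is a hypothetical macro that the build system would
 * define when feature detection finds libbfd. It is left undefined here.
 */
#ifdef HAVE_LIBBFD_SUPPORT
/* Real disassembler, implemented in a file that links against libbfd. */
int demo_disasm(const unsigned char *image, size_t len);
#else
/* Stub used when libbfd is missing: report the limitation and fail. */
static int demo_disasm(const unsigned char *image, size_t len)
{
	(void)image;
	(void)len;
	fprintf(stderr, "JIT disassembly not available (built without libbfd)\n");
	return -1;
}
#endif

/* Callers stay identical in both configurations. */
static int demo_dump_jited(const unsigned char *image, size_t len)
{
	assert(image != NULL || len == 0);
	return demo_disasm(image, len);
}
```

This only works as intended if detection is accurate, which is why the false "undetected" result described above matters once the build starts acting on it.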

Re: [patch 1/1] drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo

2018-11-15 Thread David Miller

From: a...@linux-foundation.org
Date: Thu, 15 Nov 2018 16:15:20 -0800

> From: Andrew Morton 
> Subject: drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo
> 
> Add missing semicolon.
> 
> Fixes: 291d57f67d244973 ("qed: Fix rdma_info structure allocation")
> Cc: Michal Kalderon 
> Cc: Denis Bolotin 
> Cc: David S. Miller 
> Signed-off-by: Andrew Morton 

Applied.

[patch 1/1] drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo

2018-11-15 Thread akpm

From: Andrew Morton 
Subject: drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo

Add missing semicolon.

Fixes: 291d57f67d244973 ("qed: Fix rdma_info structure allocation")
Cc: Michal Kalderon 
Cc: Denis Bolotin 
Cc: David S. Miller 
Signed-off-by: Andrew Morton 
---

 drivers/net/ethernet/qlogic/qed/qed_rdma.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.h~drivers-net-ethernet-qlogic-qed-qed_rdmah-fix-typo
+++ a/drivers/net/ethernet/qlogic/qed/qed_rdma.h
@@ -183,7 +183,7 @@ void qed_rdma_info_free(struct qed_hwfn
 static inline void qed_rdma_dpm_conf(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt) {}
 static inline void qed_rdma_dpm_bar(struct qed_hwfn *p_hwfn,
struct qed_ptt *p_ptt) {}
-static inline int qed_rdma_info_alloc(struct qed_hwfn *p_hwfn) {return -EINVAL}
+static inline int qed_rdma_info_alloc(struct qed_hwfn *p_hwfn) {return -EINVAL;}
 static inline void qed_rdma_info_free(struct qed_hwfn *p_hwfn) {}
 #endif
 
_

Re: [PATCH net 0/3] mlx4 fixes for 4.20-rc

2018-11-15 Thread David Miller

From: Tariq Toukan 
Date: Thu, 15 Nov 2018 18:05:12 +0200

> This patchset includes small fixes for mlx4_core driver.

Series applied.

Re: [PATCH net-next 8/8] net: eth: altera: tse: update devicetree bindings documentation

2018-11-15 Thread Thor Thayer


+ Rob Herring, Mark Rutland and the Device Tree mailing list.

On 11/14/18 6:50 PM, Dalon Westergreen wrote:

From: Dalon Westergreen 

Update devicetree bindings documentation to include msgdma
prefetcher and ptp bindings.

Signed-off-by: Dalon Westergreen 
---
  .../devicetree/bindings/net/altera_tse.txt| 98 +++
  1 file changed, 79 insertions(+), 19 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/altera_tse.txt 
b/Documentation/devicetree/bindings/net/altera_tse.txt
index 0e21df94a53f..d35806942a8f 100644
--- a/Documentation/devicetree/bindings/net/altera_tse.txt
+++ b/Documentation/devicetree/bindings/net/altera_tse.txt
@@ -2,50 +2,79 @@
  
  Required properties:

  - compatible: Should be "altr,tse-1.0" for legacy SGDMA based TSE, and should
-   be "altr,tse-msgdma-1.0" for the preferred MSGDMA based TSE.
+   be "altr,tse-msgdma-1.0" for the preferred MSGDMA based TSE,
+   and "altr,tse-msgdma-2.0" for MSGDMA with prefetcher based
+   implementations.
ALTR is supported for legacy device trees, but is deprecated.
altr should be used for all new designs.
  - reg: Address and length of the register set for the device. It contains
the information of registers in the same order as described by reg-names
  - reg-names: Should contain the reg names
-  "control_port": MAC configuration space region
-  "tx_csr":   xDMA Tx dispatcher control and status space region
-  "tx_desc":  MSGDMA Tx dispatcher descriptor space region
-  "rx_csr" :  xDMA Rx dispatcher control and status space region
-  "rx_desc":  MSGDMA Rx dispatcher descriptor space region
-  "rx_resp":  MSGDMA Rx dispatcher response space region
-  "s1":SGDMA descriptor memory
  - interrupts: Should contain the TSE interrupts and it's mode.
  - interrupt-names: Should contain the interrupt names
-  "rx_irq":   xDMA Rx dispatcher interrupt
-  "tx_irq":   xDMA Tx dispatcher interrupt
+  "rx_irq":   DMA Rx dispatcher interrupt
+  "tx_irq":   DMA Tx dispatcher interrupt
  - rx-fifo-depth: MAC receive FIFO buffer depth in bytes
  - tx-fifo-depth: MAC transmit FIFO buffer depth in bytes
  - phy-mode: See ethernet.txt in the same directory.
  - phy-handle: See ethernet.txt in the same directory.
  - phy-addr: See ethernet.txt in the same directory. A configuration should
include phy-handle or phy-addr.
-- altr,has-supplementary-unicast:
-   If present, TSE supports additional unicast addresses.
-   Otherwise additional unicast addresses are not supported.
-- altr,has-hash-multicast-filter:
-   If present, TSE supports a hash based multicast filter.
-   Otherwise, hash-based multicast filtering is not supported.
-
  - mdio device tree subnode: When the TSE has a phy connected to its local
mdio, there must be device tree subnode with the following
required properties:
-
- compatible: Must be "altr,tse-mdio".
- #address-cells: Must be <1>.
- #size-cells: Must be <0>.
  
  	For each phy on the mdio bus, there must be a node with the following

fields:
-
- reg: phy id used to communicate to phy.
- device_type: Must be "ethernet-phy".
  
+- altr,has-supplementary-unicast:

+   If present, TSE supports additional unicast addresses.
+   Otherwise additional unicast addresses are not supported.
+- altr,has-hash-multicast-filter:
+   If present, TSE supports a hash based multicast filter.
+   Otherwise, hash-based multicast filtering is not supported.
+- altr,has-ptp:
+   If present, TSE supports 1588 timestamping.  Currently only
+   supported with the msgdma prefetcher.
+- altr,tx-poll-cnt:
+   Optional cycle count for Tx prefetcher to poll descriptor
+   list.  If not present, defaults to 128, which at 125MHz is
+   roughly 1usec. Only for "altr,tse-msgdma-2.0".
+- altr,rx-poll-cnt:
+   Optional cycle count for Rx prefetcher to poll descriptor
+   list.  If not present, defaults to 128, which at 125MHz is
+   roughly 1usec. Only for "altr,tse-msgdma-2.0".
+
+Required registers by compatibility string:
+ - "altr,tse-1.0"
+   "control_port": MAC configuration space region
+   "tx_csr":   DMA Tx dispatcher control and status space region
+   "rx_csr" :  DMA Rx dispatcher control and status space region
+   "s1": DMA descriptor memory
+
+ - "altr,tse-msgdma-1.0"
+   "control_port": MAC configuration space region
+   "tx_csr":   DMA Tx dispatcher control and status space region
+   "tx_desc":  DMA Tx dispatcher descriptor space region
+   "rx_csr" :  DMA Rx dispatcher control and status space region
+   "rx_desc":  DMA Rx dispatcher descriptor space

[PATCH net-next] uapi/ethtool: fix spelling errors

2018-11-15 Thread Stephen Hemminger

Trivial spelling errors found by codespell.

Signed-off-by: Stephen Hemminger 
---
 include/uapi/linux/ethtool.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index c8f8e2455bf3..17be76aeb468 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -882,7 +882,7 @@ struct ethtool_rx_flow_spec {
__u32   location;
 };
 
-/* How rings are layed out when accessing virtual functions or
+/* How rings are laid out when accessing virtual functions or
  * offloaded queues is device specific. To allow users to do flow
  * steering and specify these queues the ring cookie is partitioned
  * into a 32bit queue index with an 8 bit virtual function id.
@@ -891,7 +891,7 @@ struct ethtool_rx_flow_spec {
  * devices start supporting PCIe w/ARI. However at the moment I
  * do not know of any devices that support this so I do not reserve
  * space for this at this time. If a future patch consumes the next
- * byte it should be aware of this possiblity.
+ * byte it should be aware of this possibility.
  */
 #define ETHTOOL_RX_FLOW_SPEC_RING  0x00000000FFFFFFFFLL
 #define ETHTOOL_RX_FLOW_SPEC_RING_VF   0x000000FF00000000LL
-- 
2.17.1
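
The comment touched by this patch describes a ring cookie split into a 32-bit queue index and an 8-bit virtual function id. A small sketch of that layout is below. The helper names and the local mask macros are illustrative assumptions for this example; only the bit layout follows the comment.

```c
#include <assert.h>
#include <stdint.h>

#define RING_MASK	0x00000000FFFFFFFFULL	/* low 32 bits: queue index */
#define RING_VF_MASK	0x000000FF00000000ULL	/* next 8 bits: VF id */
#define RING_VF_SHIFT	32

/* Build a cookie from a queue index and a virtual function id. */
static uint64_t ring_cookie_pack(uint32_t queue, uint8_t vf)
{
	return ((uint64_t)vf << RING_VF_SHIFT) | queue;
}

static uint32_t ring_cookie_queue(uint64_t cookie)
{
	return (uint32_t)(cookie & RING_MASK);
}

static uint8_t ring_cookie_vf(uint64_t cookie)
{
	return (uint8_t)((cookie & RING_VF_MASK) >> RING_VF_SHIFT);
}

/* Usage: round-trip a queue/VF pair through the cookie. */
static void ring_cookie_demo(void)
{
	uint64_t cookie = ring_cookie_pack(7, 3);

	assert(ring_cookie_queue(cookie) == 7);
	assert(ring_cookie_vf(cookie) == 3);
}
```

Bits above the VF field are left untouched by the accessors, matching the comment's note that the next byte is not reserved yet.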

Re: [PATCH net] sctp: not allow to set asoc prsctp_enable by sockopt

2018-11-15 Thread Neil Horman

On Thu, Nov 15, 2018 at 08:25:36PM -0200, Marcelo Ricardo Leitner wrote:
> On Thu, Nov 15, 2018 at 04:43:10PM -0500, Neil Horman wrote:
> > On Thu, Nov 15, 2018 at 03:22:21PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Thu, Nov 15, 2018 at 07:14:28PM +0800, Xin Long wrote:
> > > > As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
> > > > 
> > > >This socket option allows the enabling or disabling of the
> > > >negotiation of PR-SCTP support for future associations.  For existing
> > > >associations, it allows one to query whether or not PR-SCTP support
> > > >was negotiated on a particular association.
> > > > 
> > > > It means only sctp sock's prsctp_enable can be set.
> > > > 
> > > > Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
> > > > add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
> > > > sctp in another patchset.
> > > > 
> > > > Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
> > > > Reported-by: Ying Xu 
> > > > Signed-off-by: Xin Long 
> > > > ---
> > > >  net/sctp/socket.c | 13 +++--
> > > >  1 file changed, 3 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > > index 739f3e5..e9b8232 100644
> > > > --- a/net/sctp/socket.c
> > > > +++ b/net/sctp/socket.c
> > > > @@ -3940,7 +3940,6 @@ static int sctp_setsockopt_pr_supported(struct 
> > > > sock *sk,
> > > > unsigned int optlen)
> > > >  {
> > > > struct sctp_assoc_value params;
> > > > -   struct sctp_association *asoc;
> > > > int retval = -EINVAL;
> > > >  
> > > > if (optlen != sizeof(params))
> > > > @@ -3951,16 +3950,10 @@ static int sctp_setsockopt_pr_supported(struct 
> > > > sock *sk,
> > > > goto out;
> > > > }
> > > >  
> > > > -   asoc = sctp_id2assoc(sk, params.assoc_id);
> > > > -   if (asoc) {
> > > > -   asoc->prsctp_enable = !!params.assoc_value;
> > > > -   } else if (!params.assoc_id) {
> > > > -   struct sctp_sock *sp = sctp_sk(sk);
> > > > -
> > > > -   sp->ep->prsctp_enable = !!params.assoc_value;
> > > > -   } else {
> > > > +   if (sctp_style(sk, UDP) && sctp_id2assoc(sk, params.assoc_id))
> > > 
> > > This would allow using a non-existent assoc id on UDP-style sockets to
> > > set it at the socket, which is not expected. It should be more like:
> > > 
> > > + if (sctp_style(sk, UDP) && params.assoc_id)
> > How do you see that to be the case? sctp_id2assoc will return NULL if an
> > association isn't found, so the use of sctp_id2assoc should work just fine.
> 
> Right, it will return NULL, and because of that it won't bail out as
> it should and will adjust the socket config instead.
> 

Oh, duh, you're absolutely right, NULL will evaluate to false there, and skip
the conditional goto out;

That said, it would make more sense to me to just change the sense of the second
condition to !sctp_id2assoc(sk, params.assoc_id), so that we goto out if no
association is found.  It still seems a bit dodgy to me to just check if
params.assoc_id is non-zero, as that will allow userspace to pass invalid assoc
ids in and have those trigger pr support updates.

Neil

[Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-15 Thread Cong Wang

The following evidence indicates this check is likely wrong:

1. In the assignment "skb->csum_valid = !sum", sum==0 indicates a valid
   checksum.

2. __skb_checksum_complete() always returns sum, and TCP packets are dropped
   only when it returns non-zero. So non-zero indicates a failure.

3. In __skb_checksum_validate_complete(), we have a nearly identical check,
   where zero is considered as success.

4. csum_fold() already does the one’s complement, which indicates 0 should
   be considered as a successful validation.

5. We have triggered this fault many times, but the InCsumErrors field in
   /proc/net/snmp remains 0.

Based on the above, non-zero should be treated as a checksum mismatch.

I tested this with mlx5 driver, no warning or InCsumErrors after 1 hour.

Fixes: fb286bb2990a ("[NET]: Detect hardware rx checksum faults correctly")
Cc: Herbert Xu 
Cc: Tom Herbert 
Cc: Eric Dumazet 
Signed-off-by: Cong Wang 
---
 net/core/datagram.c | 4 ++--
 net/core/dev.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index 57f3a6fcfc1e..e542a9a212a7 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -733,7 +733,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, 
int len)
__sum16 sum;
 
sum = csum_fold(skb_checksum(skb, 0, len, skb->csum));
-   if (likely(!sum)) {
+   if (unlikely(sum)) {
if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
!skb->csum_complete_sw)
netdev_rx_csum_fault(skb->dev);
@@ -753,7 +753,7 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
 
/* skb->csum holds pseudo checksum */
sum = csum_fold(csum_add(skb->csum, csum));
-   if (likely(!sum)) {
+   if (unlikely(sum)) {
if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
!skb->csum_complete_sw)
netdev_rx_csum_fault(skb->dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index 0ffcbdd55fa9..c76dee329844 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5776,7 +5776,7 @@ __sum16 __skb_gro_checksum_complete(struct sk_buff *skb)
 
/* NAPI_GRO_CB(skb)->csum holds pseudo checksum */
sum = csum_fold(csum_add(NAPI_GRO_CB(skb)->csum, wsum));
-   if (likely(!sum)) {
+   if (unlikely(sum)) {
if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
!skb->csum_complete_sw)
netdev_rx_csum_fault(skb->dev);
-- 
2.19.1
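
The argument that a folded sum of zero means "valid" can be checked with a few lines of userspace C. This is a sketch of the Internet checksum arithmetic only, using host-order 16-bit words; the helper names are made up here and are not the kernel's csum_partial()/csum_fold() implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum 16-bit words into a 32-bit accumulator. */
static uint32_t csum_words(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	assert(words != NULL);
	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return sum;
}

/* Fold the carries back in and return the ones' complement. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

With the checksum field zeroed, folding the sum yields the value to store. Once that value is in place, folding the sum over the same words yields 0, and flipping any bit makes the result non-zero, which is the direction of the check after this patch.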

Re: [Patch net-next] net: remove unused skb_send_sock()

2018-11-15 Thread David Miller

From: Cong Wang 
Date: Mon, 12 Nov 2018 18:05:24 -0800

> Signed-off-by: Cong Wang 

John, any plans to use this?  Looks like only skb_send_sock_lock()
currently has a user.

Re: [PATCH net-next 5/8] net: eth: altera: tse: Move common functions to altera_utils

2018-11-15 Thread Thor Thayer


On 11/14/18 6:50 PM, Dalon Westergreen wrote:

From: Dalon Westergreen 

Move request_and_map and other shared functions to altera_utils. This
is the first step to moving common code out of tse specific code so
that it can be shared with future altera ethernet ip.

Signed-off-by: Dalon Westergreen 
---
  drivers/net/ethernet/altera/altera_tse.h  | 45 --
  .../net/ethernet/altera/altera_tse_ethtool.c  |  1 +
  drivers/net/ethernet/altera/altera_tse_main.c | 32 +
  drivers/net/ethernet/altera/altera_utils.c| 30 
  drivers/net/ethernet/altera/altera_utils.h| 46 +++
  5 files changed, 78 insertions(+), 76 deletions(-)

diff --git a/drivers/net/ethernet/altera/altera_tse.h 
b/drivers/net/ethernet/altera/altera_tse.h
index 7f246040135d..f435fb0eca90 100644
--- a/drivers/net/ethernet/altera/altera_tse.h
+++ b/drivers/net/ethernet/altera/altera_tse.h
@@ -500,49 +500,4 @@ struct altera_tse_private {
   */
  void altera_tse_set_ethtool_ops(struct net_device *);
  
-static inline

-u32 csrrd32(void __iomem *mac, size_t offs)
-{
-   void __iomem *paddr = (void __iomem *)((uintptr_t)mac + offs);
-   return readl(paddr);
-}
-
-static inline
-u16 csrrd16(void __iomem *mac, size_t offs)
-{
-   void __iomem *paddr = (void __iomem *)((uintptr_t)mac + offs);
-   return readw(paddr);
-}
-
-static inline
-u8 csrrd8(void __iomem *mac, size_t offs)
-{
-   void __iomem *paddr = (void __iomem *)((uintptr_t)mac + offs);
-   return readb(paddr);
-}
-
-static inline
-void csrwr32(u32 val, void __iomem *mac, size_t offs)
-{
-   void __iomem *paddr = (void __iomem *)((uintptr_t)mac + offs);
-
-   writel(val, paddr);
-}
-
-static inline
-void csrwr16(u16 val, void __iomem *mac, size_t offs)
-{
-   void __iomem *paddr = (void __iomem *)((uintptr_t)mac + offs);
-
-   writew(val, paddr);
-}
-
-static inline
-void csrwr8(u8 val, void __iomem *mac, size_t offs)
-{
-   void __iomem *paddr = (void __iomem *)((uintptr_t)mac + offs);
-
-   writeb(val, paddr);
-}
-
  #endif /* __ALTERA_TSE_H__ */
diff --git a/drivers/net/ethernet/altera/altera_tse_ethtool.c 
b/drivers/net/ethernet/altera/altera_tse_ethtool.c
index 7c367713c3e6..2998655ab316 100644
--- a/drivers/net/ethernet/altera/altera_tse_ethtool.c
+++ b/drivers/net/ethernet/altera/altera_tse_ethtool.c
@@ -33,6 +33,7 @@
  #include 
  
  #include "altera_tse.h"

+#include "altera_utils.h"
  
  #define TSE_STATS_LEN	31

  #define TSE_NUM_REGS  128
diff --git a/drivers/net/ethernet/altera/altera_tse_main.c 
b/drivers/net/ethernet/altera/altera_tse_main.c
index f6b6a14b1ce9..b25d03506470 100644
--- a/drivers/net/ethernet/altera/altera_tse_main.c
+++ b/drivers/net/ethernet/altera/altera_tse_main.c
@@ -34,7 +34,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -44,7 +43,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  
@@ -1332,35 +1331,6 @@ static struct net_device_ops altera_tse_netdev_ops = {

.ndo_validate_addr  = eth_validate_addr,
  };
  
-static int request_and_map(struct platform_device *pdev, const char *name,
-  struct resource **res, void __iomem **ptr)
-{
-   struct resource *region;
-   struct device *device = &pdev->dev;
-
-   *res = platform_get_resource_byname(pdev, IORESOURCE_MEM, name);
-   if (*res == NULL) {
-   dev_err(device, "resource %s not defined\n", name);
-   return -ENODEV;
-   }
-
-   region = devm_request_mem_region(device, (*res)->start,
-resource_size(*res), dev_name(device));
-   if (region == NULL) {
-   dev_err(device, "unable to request %s\n", name);
-   return -EBUSY;
-   }
-
-   *ptr = devm_ioremap_nocache(device, region->start,
-   resource_size(region));
-   if (*ptr == NULL) {
-   dev_err(device, "ioremap_nocache of %s failed!", name);
-   return -ENOMEM;
-   }
-
-   return 0;
-}
-
  /* Probe Altera TSE MAC device
   */
  static int altera_tse_probe(struct platform_device *pdev)
diff --git a/drivers/net/ethernet/altera/altera_utils.c 
b/drivers/net/ethernet/altera/altera_utils.c
index d7eeb1713ad2..bc33b7f0b0c5 100644
--- a/drivers/net/ethernet/altera/altera_utils.c
+++ b/drivers/net/ethernet/altera/altera_utils.c
@@ -42,3 +42,33 @@ int tse_bit_is_clear(void __iomem *ioaddr, size_t offs, u32 
bit_mask)
u32 value = csrrd32(ioaddr, offs);
return (value & bit_mask) ? 0 : 1;
  }
+
+int request_and_map(struct platform_device *pdev, const char *name,
+   struct resource **res, void __iomem **ptr)
+{
+   struct resource *region;
+   struct device *device = &pdev->dev;
+
+   *res = platform_get_resource_byname(pdev, IORESOURCE_MEM, name);
+   if (!*res) {
+   dev_err(device, "resource %s not

Re: [PATCH][net-next] net: slightly optimize eth_type_trans

2018-11-15 Thread David Miller

From: Li RongQing 
Date: Tue, 13 Nov 2018 09:34:31 +0800

> netperf udp stream shows that eth_type_trans takes a certain amount of CPU,
> so adjust the order of the MAC address checks: first check whether the
> destination is the device address, and check whether it is a multicast
> address only if it is not the device address.
> 
> After this change:
> Unicast, and the skb dst MAC is the device MAC (the most common case):
> one comparison fewer
> Unicast, and the skb dst MAC is not the device MAC: no change
> Multicast: one comparison more
> 
> Before:
> 1.03%  [kernel]  [k] eth_type_trans
> 
> After:
> 0.78%  [kernel]  [k] eth_type_trans
> 
> Signed-off-by: Zhang Yu 
> Signed-off-by: Li RongQing 

Applied.
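
The reordering described in the commit message can be sketched in user space
as follows. This is a minimal sketch, not the kernel code: classify_dst,
is_multicast_mac and the pkt_class enum are hypothetical names chosen for
illustration, and the broadcast case is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical packet classes, used only for this illustration. */
enum pkt_class { PKT_HOST, PKT_MULTICAST, PKT_OTHERHOST };

/* A MAC address is multicast when the low bit of its first octet is set. */
static bool is_multicast_mac(const uint8_t *mac)
{
	return (mac[0] & 0x01) != 0;
}

/*
 * Check the common case first: a unicast frame addressed to this device
 * is classified after a single comparison. A multicast frame now pays
 * for the device-address comparison before the multicast test.
 */
static enum pkt_class classify_dst(const uint8_t *dst, const uint8_t *dev_addr)
{
	if (memcmp(dst, dev_addr, MAC_LEN) == 0)
		return PKT_HOST;
	if (is_multicast_mac(dst))
		return PKT_MULTICAST;
	return PKT_OTHERHOST;
}
```

The trade-off is the one listed in the commit message: the device-address
path gets shorter, the multicast path gets one comparison longer.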

Re: [PATCH net-next 4/8] net: eth: altera: tse: add optional function to start tx dma

2018-11-15 Thread Thor Thayer


On 11/14/18 6:50 PM, Dalon Westergreen wrote:

From: Dalon Westergreen 

Allow for optional start up of tx dma if the start_txdma
function is defined in altera_dmaops.

Signed-off-by: Dalon Westergreen 
---
  drivers/net/ethernet/altera/altera_tse.h  | 1 +
  drivers/net/ethernet/altera/altera_tse_main.c | 5 +
  2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/altera/altera_tse.h 
b/drivers/net/ethernet/altera/altera_tse.h
index d5b97e02e6d6..7f246040135d 100644
--- a/drivers/net/ethernet/altera/altera_tse.h
+++ b/drivers/net/ethernet/altera/altera_tse.h
@@ -412,6 +412,7 @@ struct altera_dmaops {
int (*init_dma)(struct altera_tse_private *priv);
void (*uninit_dma)(struct altera_tse_private *priv);
void (*start_rxdma)(struct altera_tse_private *priv);
+   void (*start_txdma)(struct altera_tse_private *priv);
  };
  
  /* This structure is private to each device.

diff --git a/drivers/net/ethernet/altera/altera_tse_main.c 
b/drivers/net/ethernet/altera/altera_tse_main.c
index 0c0e8f9bba9b..f6b6a14b1ce9 100644
--- a/drivers/net/ethernet/altera/altera_tse_main.c
+++ b/drivers/net/ethernet/altera/altera_tse_main.c
@@ -1256,6 +1256,9 @@ static int tse_open(struct net_device *dev)
  
  	priv->dmaops->start_rxdma(priv);
  
+	if (priv->dmaops->start_txdma)

+   priv->dmaops->start_txdma(priv);
+
/* Start MAC Rx/Tx */
spin_lock(&priv->mac_cfg_lock);
tse_set_mac(priv, true);
@@ -1658,6 +1661,7 @@ static const struct altera_dmaops altera_dtype_sgdma = {
.init_dma = sgdma_initialize,
.uninit_dma = sgdma_uninitialize,
.start_rxdma = sgdma_start_rxdma,
+   .start_txdma = NULL,
  };
  
  static const struct altera_dmaops altera_dtype_msgdma = {

@@ -1677,6 +1681,7 @@ static const struct altera_dmaops altera_dtype_msgdma = {
.init_dma = msgdma_initialize,
.uninit_dma = msgdma_uninitialize,
.start_rxdma = msgdma_start_rxdma,
+   .start_txdma = NULL,
  };
  
  static const struct of_device_id altera_tse_ids[] = {



Acked-by: Thor Thayer
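
The optional-callback pattern used by this patch can be sketched outside the
driver as follows. This is only an illustration: demo_priv, demo_dmaops and
the demo_* functions are hypothetical stand-ins, not the altera_tse types.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-device private structure. */
struct demo_priv {
	int rx_started;
	int tx_started;
};

/* Hypothetical ops table: start_rxdma is mandatory, start_txdma may be NULL. */
struct demo_dmaops {
	void (*start_rxdma)(struct demo_priv *priv);
	void (*start_txdma)(struct demo_priv *priv);
};

static void demo_start_rx(struct demo_priv *priv)
{
	priv->rx_started = 1;
}

static void demo_start_tx(struct demo_priv *priv)
{
	priv->tx_started = 1;
}

/* Always start rx; start tx only when the DMA type provides the hook. */
static void demo_open(struct demo_priv *priv, const struct demo_dmaops *ops)
{
	ops->start_rxdma(priv);
	if (ops->start_txdma)
		ops->start_txdma(priv);
}
```

Existing DMA types keep working with the hook set to NULL, which is what the
two table entries in the patch do.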

Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Eric Dumazet




On 11/15/2018 02:45 PM, Edward Cree wrote:
> On 15/11/18 22:01, Eric Dumazet wrote:
>> On 11/15/2018 01:45 PM, Edward Cree wrote:
>>> If napi->poll() is only handling one packet, surely GRO can't do anything
>>>  useful either?  (AIUI at the end of the poll the GRO lists get flushed.)
>> That is my point.
>>
>> Adding yet another layer that will add no gain but add more waste of cpu 
>> cycles.
>>
>> In fact I know many people disabling GRO in some cases because it adds ~5% 
>> penalty
>> for traffic that is not aggregated.
> Does there maybe need to be an (ethtool -K) option to disable batch receive,
>  then, for this kind of user?

I do not want to hold up your patches, only to remind us that we add a lot of
features and stuff that might help only in some cases.

Another example is the IP early demux for UDP packets, which is clearly
a waste of time when the receiving socket is not a connected socket.

Re: [PATCH][net-next][v2] net: remove BUG_ON from __pskb_pull_tail

2018-11-15 Thread David Miller

From: Li RongQing 
Date: Tue, 13 Nov 2018 09:16:52 +0800

> If list is a NULL pointer, the following access to list will
> trigger a panic, which has the same effect as the BUG_ON.
> 
> Signed-off-by: Li RongQing 

Applied.

Re: [PATCH net-next 3/8] net: eth: altera: tse: fix altera_dmaops declaration

2018-11-15 Thread Thor Thayer


On 11/14/18 6:50 PM, Dalon Westergreen wrote:

From: Dalon Westergreen 

The declaration of struct altera_dmaops does not have
identifier names.  Add identifier names to conform to the
required coding style.

Signed-off-by: Dalon Westergreen 
---
  drivers/net/ethernet/altera/altera_tse.h | 30 +---
  1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/altera/altera_tse.h 
b/drivers/net/ethernet/altera/altera_tse.h
index e2feee87180a..d5b97e02e6d6 100644
--- a/drivers/net/ethernet/altera/altera_tse.h
+++ b/drivers/net/ethernet/altera/altera_tse.h
@@ -396,20 +396,22 @@ struct altera_tse_private;
  struct altera_dmaops {
int altera_dtype;
int dmamask;
-   void (*reset_dma)(struct altera_tse_private *);
-   void (*enable_txirq)(struct altera_tse_private *);
-   void (*enable_rxirq)(struct altera_tse_private *);
-   void (*disable_txirq)(struct altera_tse_private *);
-   void (*disable_rxirq)(struct altera_tse_private *);
-   void (*clear_txirq)(struct altera_tse_private *);
-   void (*clear_rxirq)(struct altera_tse_private *);
-   int (*tx_buffer)(struct altera_tse_private *, struct tse_buffer *);
-   u32 (*tx_completions)(struct altera_tse_private *);
-   void (*add_rx_desc)(struct altera_tse_private *, struct tse_buffer *);
-   u32 (*get_rx_status)(struct altera_tse_private *);
-   int (*init_dma)(struct altera_tse_private *);
-   void (*uninit_dma)(struct altera_tse_private *);
-   void (*start_rxdma)(struct altera_tse_private *);
+   void (*reset_dma)(struct altera_tse_private *priv);
+   void (*enable_txirq)(struct altera_tse_private *priv);
+   void (*enable_rxirq)(struct altera_tse_private *priv);
+   void (*disable_txirq)(struct altera_tse_private *priv);
+   void (*disable_rxirq)(struct altera_tse_private *priv);
+   void (*clear_txirq)(struct altera_tse_private *priv);
+   void (*clear_rxirq)(struct altera_tse_private *priv);
+   int (*tx_buffer)(struct altera_tse_private *priv,
+struct tse_buffer *buffer);
+   u32 (*tx_completions)(struct altera_tse_private *priv);
+   void (*add_rx_desc)(struct altera_tse_private *priv,
+   struct tse_buffer *buffer);
+   u32 (*get_rx_status)(struct altera_tse_private *priv);
+   int (*init_dma)(struct altera_tse_private *priv);
+   void (*uninit_dma)(struct altera_tse_private *priv);
+   void (*start_rxdma)(struct altera_tse_private *priv);
  };
  
  /* This structure is private to each device.



Acked-by: Thor Thayer

Re: [net-next 00/14][pull request] 40GbE Intel Wired LAN Driver Updates 2018-11-14

2018-11-15 Thread David Miller

From: Jeff Kirsher 
Date: Wed, 14 Nov 2018 15:10:18 -0800

> This series contains updates to i40e and virtchnl.

Pulled, thanks Jeff.

Re: [PATCH net-next 2/8] net: eth: altera: set rx and tx ring size before init_dma call

2018-11-15 Thread Thor Thayer


On 11/14/18 6:50 PM, Dalon Westergreen wrote:

From: Dalon Westergreen 

It is more appropriate to set the rx and tx ring size before calling
the init function for the dma.

Signed-off-by: Dalon Westergreen 
---
  drivers/net/ethernet/altera/altera_tse_main.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/altera/altera_tse_main.c 
b/drivers/net/ethernet/altera/altera_tse_main.c
index dcb330129e23..0c0e8f9bba9b 100644
--- a/drivers/net/ethernet/altera/altera_tse_main.c
+++ b/drivers/net/ethernet/altera/altera_tse_main.c
@@ -1166,6 +1166,10 @@ static int tse_open(struct net_device *dev)
int i;
unsigned long int flags;
  
+	/* set tx and rx ring size */

+   priv->rx_ring_size = dma_rx_num;
+   priv->tx_ring_size = dma_tx_num;
+
/* Reset and configure TSE MAC and probe associated PHY */
ret = priv->dmaops->init_dma(priv);
if (ret != 0) {
@@ -1208,8 +1212,6 @@ static int tse_open(struct net_device *dev)
priv->dmaops->reset_dma(priv);
  
  	/* Create and initialize the TX/RX descriptors chains. */

-   priv->rx_ring_size = dma_rx_num;
-   priv->tx_ring_size = dma_tx_num;
ret = alloc_init_skbufs(priv);
if (ret) {
netdev_err(dev, "DMA descriptors initialization failed\n");


Acked-by: Thor Thayer
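
The commit message does not say why the earlier assignment is more
appropriate. One plausible reason, assumed here and not confirmed by the
patch, is that an init_dma implementation may want to read the ring sizes.
A minimal sketch of that ordering dependency, with hypothetical demo_* names:

```c
/* Hypothetical stand-in for the per-device private structure. */
struct demo_priv {
	unsigned int rx_ring_size;
	unsigned int tx_ring_size;
	unsigned int descs_reserved;
};

/* Hypothetical init routine that sizes its resources from the ring sizes. */
static int demo_init_dma(struct demo_priv *priv)
{
	if (priv->rx_ring_size == 0 || priv->tx_ring_size == 0)
		return -1;	/* ring sizes were not set before init */
	priv->descs_reserved = priv->rx_ring_size + priv->tx_ring_size;
	return 0;
}

/* Set the ring sizes first, then call the init routine, as in the patch. */
static int demo_open(struct demo_priv *priv, unsigned int rx_num,
		     unsigned int tx_num)
{
	priv->rx_ring_size = rx_num;
	priv->tx_ring_size = tx_num;
	return demo_init_dma(priv);
}
```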

Re: [PATCH net-next 1/8] net: eth: altera: tse_start_xmit ignores tx_buffer call response

2018-11-15 Thread Thor Thayer


On 11/14/18 6:50 PM, Dalon Westergreen wrote:

From: Dalon Westergreen 

The return value of the tx_buffer call in tse_start_xmit is
inappropriately ignored.  tse_buffer calls should return
0 for success or NETDEV_TX_BUSY.  tse_start_xmit should
not report a successful transmit when the tse_buffer
call returns an error condition.

In addition to the above, the msgdma and sgdma do not return
the same value on success or failure.  The sgdma_tx_buffer
returned 0 on failure and a positive number of transmitted
packets on success.  Given that it only ever sends 1 packet,
this made no sense.  The msgdma implementation msgdma_tx_buffer
returns 0 on success.

   -> Don't ignore the return from tse_buffer calls
   -> Fix sgdma tse_buffer call to return 0 on success
  and NETDEV_TX_BUSY on failure.

Signed-off-by: Dalon Westergreen 
---
  drivers/net/ethernet/altera/altera_sgdma.c| 14 --
  drivers/net/ethernet/altera/altera_tse_main.c |  4 +++-
  2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/altera/altera_sgdma.c 
b/drivers/net/ethernet/altera/altera_sgdma.c
index 88ef67a998b4..eb47b9b820bb 100644
--- a/drivers/net/ethernet/altera/altera_sgdma.c
+++ b/drivers/net/ethernet/altera/altera_sgdma.c
@@ -15,6 +15,7 @@
   */
  
  #include 

+#include 
  #include "altera_utils.h"
  #include "altera_tse.h"
  #include "altera_sgdmahw.h"
@@ -170,10 +171,11 @@ void sgdma_clear_txirq(struct altera_tse_private *priv)
SGDMA_CTRLREG_CLRINT);
  }
  
-/* transmits buffer through SGDMA. Returns number of buffers
- * transmitted, 0 if not possible.
- *
- * tx_lock is held by the caller
+/* transmits buffer through SGDMA.
+ *   original behavior returned the number of transmitted packets (always 1) &
+ *   returned 0 on error.  This differs from the msgdma.  the calling function
+ *   will now actually look at the code, so from now, 0 is good and return
+ *   NETDEV_TX_BUSY when busy.
   */
  int sgdma_tx_buffer(struct altera_tse_private *priv, struct tse_buffer 
*buffer)
  {
@@ -185,7 +187,7 @@ int sgdma_tx_buffer(struct altera_tse_private *priv, struct 
tse_buffer *buffer)
  
  	/* wait 'til the tx sgdma is ready for the next transmit request */

if (sgdma_txbusy(priv))
-   return 0;
+   return NETDEV_TX_BUSY;
  
  	sgdma_setup_descrip(cdesc,			/* current descriptor */

ndesc,  /* next descriptor */
@@ -202,7 +204,7 @@ int sgdma_tx_buffer(struct altera_tse_private *priv, struct 
tse_buffer *buffer)
/* enqueue the request to the pending transmit queue */
queue_tx(priv, buffer);
  
-	return 1;

+   return 0;
  }
  
  
diff --git a/drivers/net/ethernet/altera/altera_tse_main.c b/drivers/net/ethernet/altera/altera_tse_main.c

index baca8f704a45..dcb330129e23 100644
--- a/drivers/net/ethernet/altera/altera_tse_main.c
+++ b/drivers/net/ethernet/altera/altera_tse_main.c
@@ -606,7 +606,9 @@ static int tse_start_xmit(struct sk_buff *skb, struct 
net_device *dev)
buffer->dma_addr = dma_addr;
buffer->len = nopaged_len;
  
-	priv->dmaops->tx_buffer(priv, buffer);

+   ret = priv->dmaops->tx_buffer(priv, buffer);
+   if (ret)
+   goto out;
  
  	skb_tx_timestamp(skb);
  


Acked-by: Thor Thayer
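
The return convention this patch settles on (0 on success, a busy code
otherwise, with the caller acting on it) can be sketched as follows. The
demo_* names and the DEMO_TX_* values are hypothetical; the kernel provides
its own NETDEV_TX_BUSY definition.

```c
#include <stdbool.h>

/* Hypothetical return codes for this illustration only. */
#define DEMO_TX_OK	0
#define DEMO_TX_BUSY	0x10

/* Hypothetical DMA engine state. */
struct demo_dma {
	bool busy;
	int queued;
};

/* Returns 0 on success and DEMO_TX_BUSY when the engine cannot accept work. */
static int demo_tx_buffer(struct demo_dma *dma)
{
	if (dma->busy)
		return DEMO_TX_BUSY;
	dma->queued++;
	return DEMO_TX_OK;
}

/* The caller propagates the result instead of ignoring it. */
static int demo_start_xmit(struct demo_dma *dma, int *timestamped)
{
	int ret = demo_tx_buffer(dma);

	if (ret)
		return ret;	/* skip the post-transmit bookkeeping */
	*timestamped = 1;
	return DEMO_TX_OK;
}
```

With both DMA back ends returning 0 on success, the caller needs only one
test for either of them.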

Re: [PATCH net-next 00/11] mlxsw: spectrum: acl: Introduce ERP sharing by multiple masks

2018-11-15 Thread David Miller

From: Ido Schimmel 
Date: Wed, 14 Nov 2018 08:22:25 +

> Jiri says:
> 
> The Spectrum-2 hardware has a limited number of ERPs per region. In
> order to accommodate more masks than the number of ERPs, the hardware
> supports inserting rules with delta bits. That way, rules with masks
> that differ in up to 8 consecutive bits can share the same ERP.
> 
> Patches 1 and 2 fix a couple of issues that would appear in existing
> selftests after adding delta support.
> 
> Patch 3 introduces a generic object aggregation library. Now it is
> static, but it will get extended for recalculation of aggregations in
> the future in order to reach more optimal aggregation.
> 
> Patch 4 just simply converts existing ERP code to use the objagg library
> instead of a rhashtable.
> 
> Patches 5-9 do more or less small changes to prepare ground for the last
> patch.
> 
> Patch 10 fills-up delta callbacks of objagg library and utilizes the
> delta bits for rule insertion.
> 
> The last patch adds selftest to test the mlxsw Spectrum-2 delta flows.

Series applied, but I had to fix the following warning:


[PATCH] test_objagg: Fix warning.

lib/test_objagg.c: In function ‘test_delta_action_item’:
./include/linux/printk.h:308:2: warning: ‘errmsg’ may be used uninitialized in 
this function [-Wmaybe-uninitialized]

Signed-off-by: David S. Miller 
---
 lib/test_objagg.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/test_objagg.c b/lib/test_objagg.c
index aac5d8e8800c..ab57144bb0cd 100644
--- a/lib/test_objagg.c
+++ b/lib/test_objagg.c
@@ -769,6 +769,7 @@ static int test_delta_action_item(struct world *world,
if (err)
goto errout;
 
+   errmsg = NULL;
err = check_expect_stats(objagg, &action_item->expect_stats, &errmsg);
if (err) {
pr_err("Key %u: Stats: %s\n", action_item->key_id, errmsg);
-- 
2.19.1
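
The shape of code that tends to draw this warning, and the effect of the
one-line fix, can be sketched as follows. This is an illustration with
hypothetical names (demo_check, demo_run), not the test_objagg code: the
callee writes the out-parameter only on failure, so the compiler cannot
always prove it was set before the caller reads it.

```c
#include <stddef.h>

/* Hypothetical checker: writes *errmsg only when it reports a failure. */
static int demo_check(int got, int expected, const char **errmsg)
{
	if (got != expected) {
		*errmsg = "value mismatch";
		return -1;
	}
	return 0;
}

/* Initialising errmsg before the call gives it a defined value on all paths. */
static const char *demo_run(int got, int expected)
{
	const char *errmsg = NULL;

	if (demo_check(got, expected, &errmsg))
		return errmsg;
	return "ok";
}
```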

Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Edward Cree

On 15/11/18 22:01, Eric Dumazet wrote:
> On 11/15/2018 01:45 PM, Edward Cree wrote:
>> If napi->poll() is only handling one packet, surely GRO can't do anything
>>  useful either?  (AIUI at the end of the poll the GRO lists get flushed.)
> That is my point.
>
> Adding yet another layer that will add no gain but add more waste of cpu 
> cycles.
>
> In fact I know many people disabling GRO in some cases because it adds ~5% 
> penalty
> for traffic that is not aggregated.
Does there maybe need to be an (ethtool -K) option to disable batch receive,
 then, for this kind of user?

>>  Is it maybe a sign that you're just spreading over too many queues??
> Not really. You also want to be able to receive more traffic if the need 
> comes.
Oh I see, this is about using less CPU when not maxed out, rather than
 increasing the maximum performance.
I did see a 6% RXCPU usage increase in the "TCP RR, GRO on" test.  (Before=
 188.7%, after=200%, Welch p<0.001, Cohen's d=6.2.)  I'll try adding a "skip
 batching for short lists" and retest, see if that improves matters.

[PATCH iproute2 10/22] ipxfrm: make local functions static

2018-11-15 Thread Stephen Hemminger

Make functions only used in ipxfrm.c static.

Signed-off-by: Stephen Hemminger 
---
 ip/ipxfrm.c   | 11 ++-
 ip/xfrm.h |  9 -
 ip/xfrm_monitor.c |  2 +-
 3 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/ip/ipxfrm.c b/ip/ipxfrm.c
index 17ab4abef4be..2dea4e37f209 100644
--- a/ip/ipxfrm.c
+++ b/ip/ipxfrm.c
@@ -186,7 +186,7 @@ const char *strxf_algotype(int type)
return str;
 }
 
-const char *strxf_mask8(__u8 mask)
+static const char *strxf_mask8(__u8 mask)
 {
static char str[16];
const int sn = sizeof(mask) * 8 - 1;
@@ -209,7 +209,7 @@ const char *strxf_mask32(__u32 mask)
return str;
 }
 
-const char *strxf_share(__u8 share)
+static const char *strxf_share(__u8 share)
 {
static char str[32];
 
@@ -270,7 +270,7 @@ const char *strxf_ptype(__u8 ptype)
return str;
 }
 
-void xfrm_id_info_print(xfrm_address_t *saddr, struct xfrm_id *id,
+static void xfrm_id_info_print(xfrm_address_t *saddr, struct xfrm_id *id,
__u8 mode, __u32 reqid, __u16 family, int force_spi,
FILE *fp, const char *prefix, const char *title)
 {
@@ -337,7 +337,8 @@ static const char *strxf_limit(__u64 limit)
return str;
 }
 
-void xfrm_stats_print(struct xfrm_stats *s, FILE *fp, const char *prefix)
+static void xfrm_stats_print(struct xfrm_stats *s, FILE *fp,
+const char *prefix)
 {
if (prefix)
fputs(prefix, fp);
@@ -371,7 +372,7 @@ static const char *strxf_time(__u64 time)
return str;
 }
 
-void xfrm_lifetime_print(struct xfrm_lifetime_cfg *cfg,
+static void xfrm_lifetime_print(struct xfrm_lifetime_cfg *cfg,
 struct xfrm_lifetime_cur *cur,
 FILE *fp, const char *prefix)
 {
diff --git a/ip/xfrm.h b/ip/xfrm.h
index 3b158ad71c13..72390d79cfb5 100644
--- a/ip/xfrm.h
+++ b/ip/xfrm.h
@@ -118,18 +118,9 @@ int xfrm_algotype_getbyname(char *name);
 int xfrm_parse_mark(struct xfrm_mark *mark, int *argcp, char ***argvp);
 const char *strxf_xfrmproto(__u8 proto);
 const char *strxf_algotype(int type);
-const char *strxf_mask8(__u8 mask);
 const char *strxf_mask32(__u32 mask);
-const char *strxf_share(__u8 share);
 const char *strxf_proto(__u8 proto);
 const char *strxf_ptype(__u8 ptype);
-void xfrm_id_info_print(xfrm_address_t *saddr, struct xfrm_id *id,
-   __u8 mode, __u32 reqid, __u16 family, int force_spi,
-   FILE *fp, const char *prefix, const char *title);
-void xfrm_stats_print(struct xfrm_stats *s, FILE *fp, const char *prefix);
-void xfrm_lifetime_print(struct xfrm_lifetime_cfg *cfg,
-struct xfrm_lifetime_cur *cur,
-FILE *fp, const char *prefix);
 void xfrm_selector_print(struct xfrm_selector *sel, __u16 family,
 FILE *fp, const char *prefix);
 void xfrm_xfrma_print(struct rtattr *tb[], __u16 family,
diff --git a/ip/xfrm_monitor.c b/ip/xfrm_monitor.c
index eb07af17cadf..76905ed3f1e1 100644
--- a/ip/xfrm_monitor.c
+++ b/ip/xfrm_monitor.c
@@ -34,7 +34,7 @@
 #include "ip_common.h"
 
 static void usage(void) __attribute__((noreturn));
-int listen_all_nsid;
+static int listen_all_nsid;
 
 static void usage(void)
 {
-- 
2.17.1

Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Edward Cree

Some corrections as it looks like I didn't proofread this carefully enough
 before sending it...

On 14/11/18 18:07, Edward Cree wrote:
> Payload_size in all tests was 8000 bytes.
This was for TCP tests; the UDP test used 1-byte payloads.

> UDP Stream (GRO off):
> net-next: 7.808 Gb/s
> after #4: 7.848 Gb/s
These numbers were Mb/s, not Gb/s.
>   0.5% slower; p = 0.144
And of course the 'after' state was 0.5% _faster_.
> * UDP throughput might be slightly slowed (probably by patch #3) but it's
>   not statistically significant.
Ditto here, UDP has not been slowed.

-Ed
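
The sign of the corrected UDP result can be checked with a line of
arithmetic. A minimal sketch: pct_change is a hypothetical helper, and the
inputs are the two throughput figures quoted above (7.808 and 7.848 Mb/s).

```c
/* Relative change of `after` versus `before`, in percent. */
static double pct_change(double before, double after)
{
	return 100.0 * (after - before) / before;
}
```

pct_change(7.808, 7.848) is positive and a little over 0.5, so the 'after'
state has the higher throughput, as the correction says.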

[PATCH iproute2 22/22] rdma: make local functions static

2018-11-15 Thread Stephen Hemminger

Several functions only used inside utils.c

Signed-off-by: Stephen Hemminger 
---
 rdma/rdma.h  | 11 ---
 rdma/utils.c | 12 ++--
 2 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/rdma/rdma.h b/rdma/rdma.h
index c3b7530b6cc7..05c3c69b07fd 100644
--- a/rdma/rdma.h
+++ b/rdma/rdma.h
@@ -74,13 +74,6 @@ struct rd_cmd {
int (*func)(struct rd *rd);
 };
 
-/*
- * Parser interface
- */
-bool rd_no_arg(struct rd *rd);
-void rd_arg_inc(struct rd *rd);
-
-char *rd_argv(struct rd *rd);
 
 /*
  * Commands interface
@@ -95,8 +88,6 @@ void rd_free(struct rd *rd);
 int rd_set_arg_to_devname(struct rd *rd);
 int rd_argc(struct rd *rd);
 
-int strcmpx(const char *str1, const char *str2);
-
 /*
  * Device manipulation
  */
@@ -117,14 +108,12 @@ int rd_recv_msg(struct rd *rd, mnl_cb_t callback, void 
*data, uint32_t seq);
 void rd_prepare_msg(struct rd *rd, uint32_t cmd, uint32_t *seq, uint16_t 
flags);
 int rd_dev_init_cb(const struct nlmsghdr *nlh, void *data);
 int rd_attr_cb(const struct nlattr *attr, void *data);
-int rd_attr_check(const struct nlattr *attr, int *typep);
 
 /*
  * Print helpers
  */
 void print_driver_table(struct rd *rd, struct nlattr *tb);
 void newline(struct rd *rd);
-void newline_indent(struct rd *rd);
 #define MAX_LINE_LENGTH 80
 
 #endif /* _RDMA_TOOL_H_ */
diff --git a/rdma/utils.c b/rdma/utils.c
index 4840bf226d54..1a0cf56800d4 100644
--- a/rdma/utils.c
+++ b/rdma/utils.c
@@ -18,14 +18,14 @@ int rd_argc(struct rd *rd)
return rd->argc;
 }
 
-char *rd_argv(struct rd *rd)
+static char *rd_argv(struct rd *rd)
 {
if (!rd_argc(rd))
return NULL;
return *rd->argv;
 }
 
-int strcmpx(const char *str1, const char *str2)
+static int strcmpx(const char *str1, const char *str2)
 {
if (strlen(str1) > strlen(str2))
return -1;
@@ -39,7 +39,7 @@ static bool rd_argv_match(struct rd *rd, const char *pattern)
return strcmpx(rd_argv(rd), pattern) == 0;
 }
 
-void rd_arg_inc(struct rd *rd)
+static void rd_arg_inc(struct rd *rd)
 {
if (!rd_argc(rd))
return;
@@ -47,7 +47,7 @@ void rd_arg_inc(struct rd *rd)
rd->argv++;
 }
 
-bool rd_no_arg(struct rd *rd)
+static bool rd_no_arg(struct rd *rd)
 {
return rd_argc(rd) == 0;
 }
@@ -404,7 +404,7 @@ static const enum mnl_attr_data_type 
nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
[RDMA_NLDEV_ATTR_DRIVER_U64] = MNL_TYPE_U64,
 };
 
-int rd_attr_check(const struct nlattr *attr, int *typep)
+static int rd_attr_check(const struct nlattr *attr, int *typep)
 {
int type;
 
@@ -696,7 +696,7 @@ void newline(struct rd *rd)
pr_out("\n");
 }
 
-void newline_indent(struct rd *rd)
+static void newline_indent(struct rd *rd)
 {
newline(rd);
if (!rd->json_output)
-- 
2.17.1

[PATCH iproute2 15/22] ss: make local variables static

2018-11-15 Thread Stephen Hemminger

Several variables only used in this code.

Signed-off-by: Stephen Hemminger 
---
 misc/ss.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index 4d12fb5d19df..e4d6ae489e79 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -96,20 +96,20 @@ static int security_get_initial_context(char *name,  char 
**context)
 }
 #endif
 
-int resolve_services = 1;
+static int resolve_services = 1;
 int preferred_family = AF_UNSPEC;
-int show_options;
+static int show_options;
 int show_details;
-int show_users;
-int show_mem;
-int show_tcpinfo;
-int show_bpf;
-int show_proc_ctx;
-int show_sock_ctx;
-int show_header = 1;
-int follow_events;
-int sctp_ino;
-int show_tipcinfo;
+static int show_users;
+static int show_mem;
+static int show_tcpinfo;
+static int show_bpf;
+static int show_proc_ctx;
+static int show_sock_ctx;
+static int show_header = 1;
+static int follow_events;
+static int sctp_ino;
+static int show_tipcinfo;
 
 enum col_id {
COL_NETID,
@@ -494,7 +494,7 @@ struct user_ent {
 };
 
 #define USER_ENT_HASH_SIZE 256
-struct user_ent *user_ent_hash[USER_ENT_HASH_SIZE];
+static struct user_ent *user_ent_hash[USER_ENT_HASH_SIZE];
 
 static int user_ent_hashfn(unsigned int ino)
 {
@@ -1404,7 +1404,7 @@ struct scache {
const char *proto;
 };
 
-struct scache *rlist;
+static struct scache *rlist;
 
 static void init_service_resolver(void)
 {
-- 
2.17.1

[PATCH iproute2 06/22] genl: remove dead code

2018-11-15 Thread Stephen Hemminger

The function genl_ctrl_resolve_family is defined but never used
in current code.

Signed-off-by: Stephen Hemminger 
---
 genl/ctrl.c   | 71 ---
 genl/genl_utils.h |  2 --
 2 files changed, 73 deletions(-)

diff --git a/genl/ctrl.c b/genl/ctrl.c
index 616ab435..0fb464b01cfb 100644
--- a/genl/ctrl.c
+++ b/genl/ctrl.c
@@ -38,77 +38,6 @@ static int usage(void)
return -1;
 }
 
-int genl_ctrl_resolve_family(const char *family)
-{
-   struct rtnl_handle rth;
-   int ret = 0;
-   struct {
-   struct nlmsghdr n;
-   struct genlmsghdr   g;
-   charbuf[4096];
-   } req = {
-   .n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN),
-   .n.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
-   .n.nlmsg_type = GENL_ID_CTRL,
-   .g.cmd = CTRL_CMD_GETFAMILY,
-   };
-   struct nlmsghdr *nlh = &req.n;
-   struct genlmsghdr *ghdr = &req.g;
-   struct nlmsghdr *answer = NULL;
-
-   if (rtnl_open_byproto(&rth, 0, NETLINK_GENERIC) < 0) {
-   fprintf(stderr, "Cannot open generic netlink socket\n");
-   exit(1);
-   }
-
-   addattr_l(nlh, 128, CTRL_ATTR_FAMILY_NAME, family, strlen(family) + 1);
-
-   if (rtnl_talk(&rth, nlh, &answer) < 0) {
-   fprintf(stderr, "Error talking to the kernel\n");
-   goto errout;
-   }
-
-   {
-   struct rtattr *tb[CTRL_ATTR_MAX + 1];
-   int len = answer->nlmsg_len;
-   struct rtattr *attrs;
-
-   if (answer->nlmsg_type !=  GENL_ID_CTRL) {
-   fprintf(stderr, "Not a controller message, nlmsg_len=%d 
"
-   "nlmsg_type=0x%x\n", answer->nlmsg_len, 
answer->nlmsg_type);
-   goto errout;
-   }
-
-   if (ghdr->cmd != CTRL_CMD_NEWFAMILY) {
-   fprintf(stderr, "Unknown controller command %d\n", 
ghdr->cmd);
-   goto errout;
-   }
-
-   len -= NLMSG_LENGTH(GENL_HDRLEN);
-
-   if (len < 0) {
-   fprintf(stderr, "wrong controller message len %d\n", 
len);
-   free(answer);
-   return -1;
-   }
-
-   attrs = (struct rtattr *) ((char *) answer + 
NLMSG_LENGTH(GENL_HDRLEN));
-   parse_rtattr(tb, CTRL_ATTR_MAX, attrs, len);
-
-   if (tb[CTRL_ATTR_FAMILY_ID] == NULL) {
-   fprintf(stderr, "Missing family id TLV\n");
-   goto errout;
-   }
-
-   ret = rta_getattr_u16(tb[CTRL_ATTR_FAMILY_ID]);
-   }
-
-errout:
-   free(answer);
-   rtnl_close(&rth);
-   return ret;
-}
-
 static void print_ctrl_cmd_flags(FILE *fp, __u32 fl)
 {
fprintf(fp, "\n\t\tCapabilities (0x%x):\n ", fl);
diff --git a/genl/genl_utils.h b/genl/genl_utils.h
index cc1f3fb76596..a8d433a9574f 100644
--- a/genl/genl_utils.h
+++ b/genl/genl_utils.h
@@ -13,6 +13,4 @@ struct genl_util
int (*print_genlopt)(struct nlmsghdr *n, void *arg);
 };
 
-int genl_ctrl_resolve_family(const char *family);
-
 #endif
-- 
2.17.1

[PATCH iproute2 03/22] lib/color: make local functions static

2018-11-15 Thread Stephen Hemminger

enable_color etc. are only used here.

Signed-off-by: Stephen Hemminger 
---
 include/color.h | 2 --
 lib/color.c | 6 --
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/color.h b/include/color.h
index e30f28c51c84..17ec56f3d7b4 100644
--- a/include/color.h
+++ b/include/color.h
@@ -20,10 +20,8 @@ enum color_opt {
COLOR_OPT_ALWAYS = 2
 };
 
-void enable_color(void);
 bool check_enable_color(int color, int json);
 bool matches_color(const char *arg, int *val);
-void set_color_palette(void);
 int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...);
 enum color_attr ifa_family_color(__u8 ifa_family);
 enum color_attr oper_state_color(__u8 state);
diff --git a/lib/color.c b/lib/color.c
index e5406294dfc4..59976847295c 100644
--- a/lib/color.c
+++ b/lib/color.c
@@ -11,6 +11,8 @@
 #include "color.h"
 #include "utils.h"
 
+static void set_color_palette(void);
+
 enum color {
C_RED,
C_GREEN,
@@ -73,7 +75,7 @@ static enum color attr_colors_dark[] = {
 static int is_dark_bg;
 static int color_is_enabled;
 
-void enable_color(void)
+static void enable_color(void)
 {
color_is_enabled = 1;
set_color_palette();
@@ -117,7 +119,7 @@ bool matches_color(const char *arg, int *val)
return true;
 }
 
-void set_color_palette(void)
+static void set_color_palette(void)
 {
char *p = getenv("COLORFGBG");
 
-- 
2.17.1
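
The forward declaration added at the top of lib/color.c is needed because a
static function is called before its definition appears in the file. A
minimal sketch of the same arrangement, with hypothetical demo names and a
placeholder body instead of the real palette logic:

```c
/* Forward declaration: enable_color_demo() calls this before it is defined. */
static void set_palette_demo(void);

static int color_enabled;
static int palette_ready;

static void enable_color_demo(void)
{
	color_enabled = 1;
	set_palette_demo();
}

/* Placeholder body; the real function chooses a palette for the terminal. */
static void set_palette_demo(void)
{
	palette_ready = 1;
}
```

Once both functions are static, the prototypes in the public header are no
longer needed, which is why the patch drops them from include/color.h.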

[PATCH iproute2 04/22] lib/ll_map: make local function static

2018-11-15 Thread Stephen Hemminger

ll_idx_a2n is only used in ll_map.

Signed-off-by: Stephen Hemminger 
---
 include/ll_map.h | 1 -
 lib/ll_map.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/ll_map.h b/include/ll_map.h
index fb708191c22c..511fe00b8567 100644
--- a/include/ll_map.h
+++ b/include/ll_map.h
@@ -12,6 +12,5 @@ int ll_index_to_flags(unsigned idx);
 unsigned namehash(const char *str);
 
 const char *ll_idx_n2a(unsigned int idx);
-unsigned int ll_idx_a2n(const char *name);
 
 #endif /* __LL_MAP_H__ */
diff --git a/lib/ll_map.c b/lib/ll_map.c
index 1b4095a7d873..1ab8ef0758ac 100644
--- a/lib/ll_map.c
+++ b/lib/ll_map.c
@@ -143,7 +143,7 @@ const char *ll_idx_n2a(unsigned int idx)
return buf;
 }
 
-unsigned int ll_idx_a2n(const char *name)
+static unsigned int ll_idx_a2n(const char *name)
 {
unsigned int idx;
 
-- 
2.17.1

[PATCH iproute2 09/22] ipmonitor: make local variable static

2018-11-15 Thread Stephen Hemminger

prefix_banner only used in one file.

Signed-off-by: Stephen Hemminger 
---
 ip/ipmonitor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index 9d5ac2b5e4d2..743632cc5569 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -24,7 +24,7 @@
 #include "ip_common.h"
 
 static void usage(void) __attribute__((noreturn));
-int prefix_banner;
+static int prefix_banner;
 int listen_all_nsid;
 
 static void usage(void)
-- 
2.17.1

[PATCH iproute2 11/22] tc: drop unused name_to_id function

2018-11-15 Thread Stephen Hemminger

Not used in current code.

Signed-off-by: Stephen Hemminger 
---
 include/names.h |  1 -
 lib/names.c | 28 
 2 files changed, 29 deletions(-)

diff --git a/include/names.h b/include/names.h
index 3e5d3b146a23..2fcaacc398d4 100644
--- a/include/names.h
+++ b/include/names.h
@@ -22,6 +22,5 @@ int db_names_load(struct db_names *db, const char *path);
 void db_names_free(struct db_names *db);
 
 char *id_to_name(struct db_names *db, int id, char *name);
-int name_to_id(struct db_names *db, int *id, const char *name);
 
 #endif
diff --git a/lib/names.c b/lib/names.c
index fbd6503f22d4..b46ea7910946 100644
--- a/lib/names.c
+++ b/lib/names.c
@@ -150,31 +150,3 @@ char *id_to_name(struct db_names *db, int id, char *name)
snprintf(name, IDNAME_MAX, "%d", id);
return NULL;
 }
-
-int name_to_id(struct db_names *db, int *id, const char *name)
-{
-   struct db_entry *entry;
-   int i;
-
-   if (!db)
-   return -1;
-
-   if (db->cached && strcmp(db->cached->name, name) == 0) {
-   *id = db->cached->id;
-   return 0;
-   }
-
-   for (i = 0; i < db->size; i++) {
-   entry = db->hash[i];
-   while (entry && strcmp(entry->name, name))
-   entry = entry->next;
-
-   if (entry) {
-   db->cached = entry;
-   *id = entry->id;
-   return 0;
-   }
-   }
-
-   return -1;
-}
-- 
2.17.1

[PATCH iproute2 12/22] tipc: make cmd_find static

2018-11-15 Thread Stephen Hemminger

Function only used in one file.

Signed-off-by: Stephen Hemminger 
---
 tipc/cmdl.c | 2 +-
 tipc/cmdl.h | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/tipc/cmdl.c b/tipc/cmdl.c
index 4a2f4fd92f48..f2f259cc5320 100644
--- a/tipc/cmdl.c
+++ b/tipc/cmdl.c
@@ -17,7 +17,7 @@
 
 #include "cmdl.h"
 
-const struct cmd *find_cmd(const struct cmd *cmds, char *str)
+static const struct cmd *find_cmd(const struct cmd *cmds, char *str)
 {
const struct cmd *c;
const struct cmd *match = NULL;
diff --git a/tipc/cmdl.h b/tipc/cmdl.h
index d37239f85690..03db359956e6 100644
--- a/tipc/cmdl.h
+++ b/tipc/cmdl.h
@@ -54,6 +54,4 @@ char *shift_cmdl(struct cmdl *cmdl);
 int run_cmd(struct nlmsghdr *nlh, const struct cmd *caller,
const struct cmd *cmds, struct cmdl *cmdl, void *data);
 
-const struct cmd *find_cmd(const struct cmd *cmds, char *str);
-
 #endif
-- 
2.17.1

[PATCH iproute2 13/22] tc/class: make filter variables static

2018-11-15 Thread Stephen Hemminger

Only used in this file.

Signed-off-by: Stephen Hemminger 
---
 tc/tc_class.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tc/tc_class.c b/tc/tc_class.c
index 7e4e17fd7f39..7ac700d7ab31 100644
--- a/tc/tc_class.c
+++ b/tc/tc_class.c
@@ -153,9 +153,9 @@ static int tc_class_modify(int cmd, unsigned int flags, int argc, char **argv)
return 0;
 }
 
-int filter_ifindex;
-__u32 filter_qdisc;
-__u32 filter_classid;
+static int filter_ifindex;
+static __u32 filter_qdisc;
+static __u32 filter_classid;
 
 static void graph_node_add(__u32 parent_id, __u32 id, void *data,
int len)
-- 
2.17.1

[PATCH iproute2 20/22] tc/action: make variables static

2018-11-15 Thread Stephen Hemminger

Signed-off-by: Stephen Hemminger 
---
 tc/m_action.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tc/m_action.c b/tc/m_action.c
index e90867fc6c25..d5fd5affe703 100644
--- a/tc/m_action.c
+++ b/tc/m_action.c
@@ -30,9 +30,9 @@
 
 static struct action_util *action_list;
 #ifdef CONFIG_GACT
-int gact_ld; /* f*ckin backward compatibility */
+static int gact_ld; /* f*ckin backward compatibility */
 #endif
-int tab_flush;
+static int tab_flush;
 
 static void act_usage(void)
 {
-- 
2.17.1

[PATCH iproute2 21/22] tc/pedit: use structure initialization

2018-11-15 Thread Stephen Hemminger

The pedit callback structure tables should be initialized using
designated initializers so that changes to the structure layout do not
break them.

Signed-off-by: Stephen Hemminger 
---
 tc/p_eth.c  | 5 ++---
 tc/p_icmp.c | 5 ++---
 tc/p_ip.c   | 5 ++---
 tc/p_ip6.c  | 5 ++---
 tc/p_tcp.c  | 5 ++---
 tc/p_udp.c  | 5 ++---
 6 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/tc/p_eth.c b/tc/p_eth.c
index 53ce736a1d78..674f9c11202a 100644
--- a/tc/p_eth.c
+++ b/tc/p_eth.c
@@ -68,7 +68,6 @@ done:
 }
 
 struct m_pedit_util p_pedit_eth = {
-   NULL,
-   "eth",
-   parse_eth,
+   .id = "eth",
+   .parse_peopt = parse_eth,
 };
diff --git a/tc/p_icmp.c b/tc/p_icmp.c
index 2c1baf82f7ad..15ce32309e39 100644
--- a/tc/p_icmp.c
+++ b/tc/p_icmp.c
@@ -55,7 +55,6 @@ done:
 }
 
 struct m_pedit_util p_pedit_icmp = {
-   NULL,
-   "icmp",
-   parse_icmp,
+   .id = "icmp",
+   .parse_peopt = parse_icmp,
 };
diff --git a/tc/p_ip.c b/tc/p_ip.c
index e9fd6f834efc..c385ac6dbcaa 100644
--- a/tc/p_ip.c
+++ b/tc/p_ip.c
@@ -156,7 +156,6 @@ done:
 }
 
 struct m_pedit_util p_pedit_ip = {
-   NULL,
-   "ip",
-   parse_ip,
+   .id = "ip",
+   .parse_peopt = parse_ip,
 };
diff --git a/tc/p_ip6.c b/tc/p_ip6.c
index bc45ab70d319..dbfdca42cce7 100644
--- a/tc/p_ip6.c
+++ b/tc/p_ip6.c
@@ -84,7 +84,6 @@ done:
 }
 
 struct m_pedit_util p_pedit_ip6 = {
-   NULL,
-   "ipv6",
-   parse_ip6,
+   .id = "ipv6",
+   .parse_peopt = parse_ip6,
 };
diff --git a/tc/p_tcp.c b/tc/p_tcp.c
index eeb68fcf87b3..d2dbfd719526 100644
--- a/tc/p_tcp.c
+++ b/tc/p_tcp.c
@@ -67,7 +67,6 @@ done:
return res;
 }
 struct m_pedit_util p_pedit_tcp = {
-   NULL,
-   "tcp",
-   parse_tcp,
+   .id = "tcp",
+   .parse_peopt = parse_tcp,
 };
diff --git a/tc/p_udp.c b/tc/p_udp.c
index 68c688efd110..bab456de9831 100644
--- a/tc/p_udp.c
+++ b/tc/p_udp.c
@@ -61,7 +61,6 @@ done:
 }
 
 struct m_pedit_util p_pedit_udp = {
-   NULL,
-   "udp",
-   parse_udp,
+   .id = "udp",
+   .parse_peopt = parse_udp,
 };
-- 
2.17.1
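
The conversion in this patch relies on C99 designated initializers. The block below is a minimal sketch of the difference, not the tc code itself: struct demo_util is a hypothetical, simplified stand-in for struct m_pedit_util, and demo_parse has a made-up signature chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct m_pedit_util. */
struct demo_util {
	struct demo_util *next;              /* list linkage, NULL until registered */
	const char *id;                      /* protocol name, e.g. "eth" */
	int (*parse_peopt)(const char *arg); /* simplified callback signature */
};

static int demo_parse(const char *arg)
{
	return arg ? 0 : -1;
}

/* Positional form: goes wrong if a member is added or reordered. */
static struct demo_util positional = {
	NULL,
	"eth",
	demo_parse,
};

/* Designated form: members not named (here .next) are zero-initialized. */
static struct demo_util designated = {
	.id = "eth",
	.parse_peopt = demo_parse,
};
```

Both objects end up with the same contents; only the designated form keeps working unchanged if a new member is later inserted ahead of id.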

[PATCH iproute2 14/22] tc/police: make print_police static

2018-11-15 Thread Stephen Hemminger

print_police function only used by m_police.

Signed-off-by: Stephen Hemminger 
---
 tc/m_police.c | 10 +++---
 tc/tc_util.h  |  3 ---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tc/m_police.c b/tc/m_police.c
index f3b07f7b0439..d645999ba08b 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -25,6 +25,10 @@
 #include "utils.h"
 #include "tc_util.h"
 
+static int act_parse_police(struct action_util *a, int *argc_p,
+   char ***argv_p, int tca_id, struct nlmsghdr *n);
+static int print_police(struct action_util *a, FILE *f, struct rtattr *tb);
+
 struct action_util police_action_util = {
.id = "police",
.parse_aopt = act_parse_police,
@@ -50,8 +54,8 @@ static void explain1(char *arg)
fprintf(stderr, "Illegal \"%s\"\n", arg);
 }
 
-int act_parse_police(struct action_util *a, int *argc_p, char ***argv_p,
-int tca_id, struct nlmsghdr *n)
+static int act_parse_police(struct action_util *a, int *argc_p, char ***argv_p,
+   int tca_id, struct nlmsghdr *n)
 {
int argc = *argc_p;
char **argv = *argv_p;
@@ -256,7 +260,7 @@ int parse_police(int *argc_p, char ***argv_p, int tca_id, struct nlmsghdr *n)
return act_parse_police(NULL, argc_p, argv_p, tca_id, n);
 }
 
-int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
+static int print_police(struct action_util *a, FILE *f, struct rtattr *arg)
 {
SPRINT_BUF(b1);
SPRINT_BUF(b2);
diff --git a/tc/tc_util.h b/tc/tc_util.h
index 76fd986d6e4c..e22c6da25696 100644
--- a/tc/tc_util.h
+++ b/tc/tc_util.h
@@ -117,9 +117,6 @@ int parse_action_control_slash(int *argc_p, char ***argv_p,
   int *result1_p, int *result2_p, bool allow_num);
 void print_action_control(FILE *f, const char *prefix,
  int action, const char *suffix);
-int act_parse_police(struct action_util *a, int *argc_p,
-char ***argv_p, int tca_id, struct nlmsghdr *n);
-int print_police(struct action_util *a, FILE *f, struct rtattr *tb);
 int police_print_xstats(struct action_util *a, FILE *f, struct rtattr *tb);
 int tc_print_action(FILE *f, const struct rtattr *tb, unsigned short tot_acts);
 int tc_print_ipt(FILE *f, const struct rtattr *tb);
-- 
2.17.1
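
The prototypes added at the top of m_police.c are needed because police_action_util refers to the callbacks before they are defined in the file. A small sketch of that pattern follows; struct demo_action, demo_action_util and demo_print are hypothetical names, only loosely modelled on struct action_util.

```c
#include <stdio.h>

/* Hypothetical callback table, loosely modelled on struct action_util. */
struct demo_action {
	const char *id;
	int (*print)(FILE *f, int value);
};

/* Static prototype: the table below refers to the function before
 * its definition appears in the file. */
static int demo_print(FILE *f, int value);

/* Only the table has external linkage; the callback stays file-local. */
struct demo_action demo_action_util = {
	.id = "demo",
	.print = demo_print,
};

static int demo_print(FILE *f, int value)
{
	/* fprintf returns the number of characters written */
	return fprintf(f, "value %d\n", value);
}
```

Other files can still reach the callback through demo_action_util.print, which is why the prototype can be dropped from the shared header.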

[PATCH iproute2 16/22] tc/pedit: make functions static

2018-11-15 Thread Stephen Hemminger

The parse and pack functions are only used by the pedit routines.

Signed-off-by: Stephen Hemminger 
---
 tc/m_pedit.c | 33 ++---
 tc/m_pedit.h | 15 ---
 2 files changed, 14 insertions(+), 34 deletions(-)

diff --git a/tc/m_pedit.c b/tc/m_pedit.c
index 2aeb56d9615f..6f8d078b7d3c 100644
--- a/tc/m_pedit.c
+++ b/tc/m_pedit.c
@@ -118,7 +118,7 @@ noexist:
return p;
 }
 
-int pack_key(struct m_pedit_sel *_sel, struct m_pedit_key *tkey)
+static int pack_key(struct m_pedit_sel *_sel, struct m_pedit_key *tkey)
 {
struct tc_pedit_sel *sel = &_sel->sel;
struct m_pedit_key_ex *keys_ex = _sel->keys_ex;
@@ -155,8 +155,8 @@ int pack_key(struct m_pedit_sel *_sel, struct m_pedit_key *tkey)
return 0;
 }
 
-int pack_key32(__u32 retain, struct m_pedit_sel *sel,
-  struct m_pedit_key *tkey)
+static int pack_key32(__u32 retain, struct m_pedit_sel *sel,
+ struct m_pedit_key *tkey)
 {
if (tkey->off > (tkey->off & ~3)) {
fprintf(stderr,
@@ -169,8 +169,8 @@ int pack_key32(__u32 retain, struct m_pedit_sel *sel,
return pack_key(sel, tkey);
 }
 
-int pack_key16(__u32 retain, struct m_pedit_sel *sel,
-  struct m_pedit_key *tkey)
+static int pack_key16(__u32 retain, struct m_pedit_sel *sel,
+ struct m_pedit_key *tkey)
 {
int ind, stride;
__u32 m[4] = { 0x, 0xFFFF, 0x };
@@ -197,10 +197,10 @@ int pack_key16(__u32 retain, struct m_pedit_sel *sel,
printf("pack_key16: Final val %08x mask %08x\n",
   tkey->val, tkey->mask);
return pack_key(sel, tkey);
-
 }
 
-int pack_key8(__u32 retain, struct m_pedit_sel *sel, struct m_pedit_key *tkey)
+static int pack_key8(__u32 retain, struct m_pedit_sel *sel,
+struct m_pedit_key *tkey)
 {
int ind, stride;
__u32 m[4] = { 0x00FF, 0xFF00, 0x00FF, 0xFF00 };
@@ -283,7 +283,7 @@ static int pack_ipv6(struct m_pedit_sel *sel, struct m_pedit_key *tkey,
return 0;
 }
 
-int parse_val(int *argc_p, char ***argv_p, __u32 *val, int type)
+static int parse_val(int *argc_p, char ***argv_p, __u32 *val, int type)
 {
int argc = *argc_p;
char **argv = *argv_p;
@@ -433,8 +433,8 @@ done:
 
 }
 
-int parse_offset(int *argc_p, char ***argv_p, struct m_pedit_sel *sel,
-struct m_pedit_key *tkey)
+static int parse_offset(int *argc_p, char ***argv_p, struct m_pedit_sel *sel,
+   struct m_pedit_key *tkey)
 {
int off;
__u32 len, retain;
@@ -612,8 +612,8 @@ static int pedit_keys_ex_addattr(struct m_pedit_sel *sel, struct nlmsghdr *n)
return 0;
 }
 
-int parse_pedit(struct action_util *a, int *argc_p, char ***argv_p, int tca_id,
-   struct nlmsghdr *n)
+static int parse_pedit(struct action_util *a, int *argc_p, char ***argv_p,
+  int tca_id, struct nlmsghdr *n)
 {
struct m_pedit_sel sel = {};
 
@@ -705,7 +705,7 @@ int parse_pedit(struct action_util *a, int *argc_p, char ***argv_p, int tca_id,
return 0;
 }
 
-const char *pedit_htype_str[] = {
+static const char * const pedit_htype_str[] = {
[TCA_PEDIT_KEY_EX_HDR_TYPE_NETWORK] = "",
[TCA_PEDIT_KEY_EX_HDR_TYPE_ETH] = "eth",
[TCA_PEDIT_KEY_EX_HDR_TYPE_IP4] = "ipv4",
@@ -730,7 +730,7 @@ static void print_pedit_location(FILE *f,
fprintf(f, "%c%d", (int)off  >= 0 ? '+' : '-', abs((int)off));
 }
 
-int print_pedit(struct action_util *au, FILE *f, struct rtattr *arg)
+static int print_pedit(struct action_util *au, FILE *f, struct rtattr *arg)
 {
struct tc_pedit_sel *sel;
struct rtattr *tb[TCA_PEDIT_MAX + 1];
@@ -826,11 +826,6 @@ int print_pedit(struct action_util *au, FILE *f, struct rtattr *arg)
return 0;
 }
 
-int pedit_print_xstats(struct action_util *au, FILE *f, struct rtattr *xstats)
-{
-   return 0;
-}
-
 struct action_util pedit_action_util = {
.id = "pedit",
.parse_aopt = parse_pedit,
diff --git a/tc/m_pedit.h b/tc/m_pedit.h
index b6b274bd08c7..5d3628a70b99 100644
--- a/tc/m_pedit.h
+++ b/tc/m_pedit.h
@@ -71,22 +71,7 @@ struct m_pedit_util {
   struct m_pedit_key *tkey);
 };
 
-int pack_key(struct m_pedit_sel *sel, struct m_pedit_key *tkey);
-int pack_key32(__u32 retain, struct m_pedit_sel *sel,
-  struct m_pedit_key *tkey);
-int pack_key16(__u32 retain, struct m_pedit_sel *sel,
-  struct m_pedit_key *tkey);
-int pack_key8(__u32 retain, struct m_pedit_sel *sel,
-struct m_pedit_key *tkey);
-int parse_val(int *argc_p, char ***argv_p, __u32 *val, int type);
 int parse_cmd(int *argc_p, char ***argv_p, __u32 len, int type,
  __u32 retain,
  struct m_pedit_sel *sel, struct m_pedit_key *tkey);
-int parse_offset(int *argc_p, char ***argv_p,
-struct m_pedit_sel *sel, struct

[PATCH iproute2 19/22] tc/meta: make meta_table static and const

2018-11-15 Thread Stephen Hemminger

The mapping table is only used by em_meta.

Signed-off-by: Stephen Hemminger 
---
 tc/em_meta.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tc/em_meta.c b/tc/em_meta.c
index d94fe88d9b2f..2ddc65ed6cb6 100644
--- a/tc/em_meta.c
+++ b/tc/em_meta.c
@@ -38,7 +38,7 @@ static void meta_print_usage(FILE *fd)
"For a list of meta identifiers, use meta(list).\n");
 }
 
-struct meta_entry {
+static const struct meta_entry {
int id;
char *kind;
char *mask;
@@ -121,7 +121,7 @@ static inline int map_type(char k)
return INT_MAX;
 }
 
-static struct meta_entry *lookup_meta_entry(struct bstr *kind)
+static const struct meta_entry *lookup_meta_entry(struct bstr *kind)
 {
int i;
 
@@ -133,7 +133,7 @@ static struct meta_entry *lookup_meta_entry(struct bstr *kind)
return NULL;
 }
 
-static struct meta_entry *lookup_meta_entry_byid(int id)
+static const struct meta_entry *lookup_meta_entry_byid(int id)
 {
int i;
 
@@ -168,8 +168,8 @@ static inline void dump_value(struct nlmsghdr *n, int tlv, unsigned long val,
 static inline int is_compatible(struct tcf_meta_val *what,
struct tcf_meta_val *needed)
 {
+   const struct meta_entry *entry;
char *p;
-   struct meta_entry *entry;
 
entry = lookup_meta_entry_byid(TCF_META_ID(what->kind));
 
@@ -249,7 +249,7 @@ static inline struct bstr *
 parse_object(struct bstr *args, struct bstr *arg, struct tcf_meta_val *obj,
 unsigned long *dst, struct tcf_meta_val *left)
 {
-   struct meta_entry *entry;
+   const struct meta_entry *entry;
unsigned long num;
struct bstr *a;
 
@@ -461,7 +461,7 @@ static int print_object(FILE *fd, struct tcf_meta_val *obj, struct rtattr *rta)
 {
int id = TCF_META_ID(obj->kind);
int type = TCF_META_TYPE(obj->kind);
-   struct meta_entry *entry;
+   const struct meta_entry *entry;
 
if (id == TCF_META_ID_VALUE)
return print_value(fd, type, rta);
-- 
2.17.1

[PATCH iproute2 07/22] bridge: make local variables static

2018-11-15 Thread Stephen Hemminger

enable_color and set_color_palette only used here.

Signed-off-by: Stephen Hemminger 
---
 bridge/bridge.c  | 5 ++---
 bridge/monitor.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/bridge/bridge.c b/bridge/bridge.c
index ac4d6a14f510..389f1bd5382b 100644
--- a/bridge/bridge.c
+++ b/bridge/bridge.c
@@ -23,12 +23,11 @@ int preferred_family = AF_UNSPEC;
 int oneline;
 int show_stats;
 int show_details;
-int show_pretty;
-int color;
+static int color;
 int compress_vlans;
 int json;
 int timestamp;
-char *batch_file;
+static const char *batch_file;
 int force;
 
 static void usage(void) __attribute__((noreturn));
diff --git a/bridge/monitor.c b/bridge/monitor.c
index 82bc6b407a06..708a1bd2ccb0 100644
--- a/bridge/monitor.c
+++ b/bridge/monitor.c
@@ -27,7 +27,7 @@
 
 
 static void usage(void) __attribute__((noreturn));
-int prefix_banner;
+static int prefix_banner;
 
 static void usage(void)
 {
-- 
2.17.1

[PATCH iproute2 18/22] tc/util: make local functions static

2018-11-15 Thread Stephen Hemminger

The tc util library parse/print has functions only used locally
(and some dead code removed).

Signed-off-by: Stephen Hemminger 
---
 tc/tc_util.c | 17 +++--
 tc/tc_util.h |  5 -
 2 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/tc/tc_util.c b/tc/tc_util.c
index a082c73c9350..82856a85170b 100644
--- a/tc/tc_util.c
+++ b/tc/tc_util.c
@@ -190,7 +190,7 @@ static const struct rate_suffix {
{ NULL }
 };
 
-int parse_percent_rate(char *rate, const char *str, const char *dev)
+static int parse_percent_rate(char *rate, const char *str, const char *dev)
 {
long dev_mbit;
int ret;
@@ -409,7 +409,7 @@ void print_devname(enum output_type type, int ifindex)
   "dev", "%s ", ifname);
 }
 
-void print_size(char *buf, int len, __u32 sz)
+static void print_size(char *buf, int len, __u32 sz)
 {
double tmp = sz;
 
@@ -427,17 +427,6 @@ char *sprint_size(__u32 size, char *buf)
return buf;
 }
 
-void print_qdisc_handle(char *buf, int len, __u32 h)
-{
-   snprintf(buf, len, "%x:", TC_H_MAJ(h)>>16);
-}
-
-char *sprint_qdisc_handle(__u32 h, char *buf)
-{
-   print_qdisc_handle(buf, SPRINT_BSIZE-1, h);
-   return buf;
-}
-
 static const char *action_n2a(int action)
 {
static char buf[64];
@@ -709,7 +698,7 @@ int get_linklayer(unsigned int *val, const char *arg)
return 0;
 }
 
-void print_linklayer(char *buf, int len, unsigned int linklayer)
+static void print_linklayer(char *buf, int len, unsigned int linklayer)
 {
switch (linklayer) {
case LINKLAYER_UNSPEC:
diff --git a/tc/tc_util.h b/tc/tc_util.h
index e22c6da25696..825fea36a080 100644
--- a/tc/tc_util.h
+++ b/tc/tc_util.h
@@ -73,7 +73,6 @@ const char *get_tc_lib(void);
 struct qdisc_util *get_qdisc_kind(const char *str);
 struct filter_util *get_filter_kind(const char *str);
 
-int parse_percent_rate(char *rate, const char *str, const char *dev);
 int get_qdisc_handle(__u32 *h, const char *str);
 int get_rate(unsigned int *rate, const char *str);
 int get_percent_rate(unsigned int *rate, const char *str, const char *dev);
@@ -84,14 +83,10 @@ int get_size_and_cell(unsigned int *size, int *cell_log, char *str);
 int get_linklayer(unsigned int *val, const char *arg);
 
 void print_rate(char *buf, int len, __u64 rate);
-void print_size(char *buf, int len, __u32 size);
-void print_qdisc_handle(char *buf, int len, __u32 h);
-void print_linklayer(char *buf, int len, unsigned int linklayer);
 void print_devname(enum output_type type, int ifindex);
 
 char *sprint_rate(__u64 rate, char *buf);
 char *sprint_size(__u32 size, char *buf);
-char *sprint_qdisc_handle(__u32 h, char *buf);
 char *sprint_tc_classid(__u32 h, char *buf);
 char *sprint_ticks(__u32 ticks, char *buf);
 char *sprint_linklayer(unsigned int linklayer, char *buf);
-- 
2.17.1

[PATCH iproute2 17/22] tc/ematch: make local functions static

2018-11-15 Thread Stephen Hemminger

The print handling is only used in tc/m_ematch.c

Remove the unused function print_ematch_tree.

Signed-off-by: Stephen Hemminger 
---
 tc/m_ematch.c | 30 +++---
 tc/m_ematch.h |  1 -
 2 files changed, 3 insertions(+), 28 deletions(-)

diff --git a/tc/m_ematch.c b/tc/m_ematch.c
index a524b520b276..8840a0dc62a1 100644
--- a/tc/m_ematch.c
+++ b/tc/m_ematch.c
@@ -38,6 +38,8 @@ struct ematch *ematch_root;
 static int begin_argc;
 static char **begin_argv;
 
+static void bstr_print(FILE *fd, const struct bstr *b, int ascii);
+
 static inline void map_warning(int num, char *kind)
 {
fprintf(stderr,
@@ -548,7 +550,7 @@ unsigned long bstrtoul(const struct bstr *b)
return l;
 }
 
-void bstr_print(FILE *fd, const struct bstr *b, int ascii)
+static void bstr_print(FILE *fd, const struct bstr *b, int ascii)
 {
int i;
char *s = b->data;
@@ -565,29 +567,3 @@ void bstr_print(FILE *fd, const struct bstr *b, int ascii)
fprintf(fd, "\"");
}
 }
-
-void print_ematch_tree(const struct ematch *tree)
-{
-   const struct ematch *t;
-
-   for (t = tree; t; t = t->next) {
-   if (t->inverted)
-   printf("NOT ");
-
-   if (t->child) {
-   printf("(");
-   print_ematch_tree(t->child);
-   printf(")");
-   } else {
-   struct bstr *b;
-
-   for (b = t->args; b; b = b->next)
-   printf("%s%s", b->data, b->next ? " " : "");
-   }
-
-   if (t->relation == TCF_EM_REL_AND)
-   printf(" AND ");
-   else if (t->relation == TCF_EM_REL_OR)
-   printf(" OR ");
-   }
-}
diff --git a/tc/m_ematch.h b/tc/m_ematch.h
index 356f2eded7fc..c4443ee22942 100644
--- a/tc/m_ematch.h
+++ b/tc/m_ematch.h
@@ -51,7 +51,6 @@ static inline struct bstr *bstr_next(struct bstr *b)
 }
 
 unsigned long bstrtoul(const struct bstr *b);
-void bstr_print(FILE *fd, const struct bstr *b, int ascii);
 
 struct ematch {
struct bstr *args;
-- 
2.17.1

[PATCH iproute2 08/22] ip: make flag names const/static

2018-11-15 Thread Stephen Hemminger

The table of filter flags is only used in ipaddress.c.

Signed-off-by: Stephen Hemminger 
---
 ip/ipaddress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index cd8cc76a3473..2671c4e162e3 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -1149,7 +1149,7 @@ static unsigned int get_ifa_flags(struct ifaddrmsg *ifa,
 }
 
 /* Mapping from argument to address flag mask */
-struct {
+static const struct {
const char *name;
unsigned long value;
 } ifa_flag_names[] = {
-- 
2.17.1

[PATCH iproute2 00/22] misc cleanups

2018-11-15 Thread Stephen Hemminger

Code cleanup including:
   * make local functions static
   * drop dead code
   * whitespace code style cleanup

Stephen Hemminger (22):
  lib/ll_addr: whitespace and indent cleanup
  lib/utils: make local functions static
  lib/color: make local functions static
  lib/ll_map: make local function static
  libnetlnk: unused and local functions cleanup
  genl: remove dead code
  bridge: make local variables static
  ip: make flag names const/static
  ipmonitor: make local variable static
  ipxfrm: make local functions static
  tc: drop unused name_to_id function
  tipc: make cmd_find static
  tc/class: make filter variables static
  tc/police: make print_police static
  ss: make local variables static
  tc/pedit: make functions static
  tc/ematch: make local functions static
  tc/util: make local functions static
  tc/meta: make meta_table static and const
  tc/action: make variables static
  tc/pedit: use structure initialization
  rdma: make local functions static

 bridge/bridge.c  |  5 ++--
 bridge/monitor.c |  2 +-
 genl/ctrl.c  | 71 
 genl/genl_utils.h|  2 --
 include/color.h  |  2 --
 include/libnetlink.h |  7 -
 include/ll_map.h |  1 -
 include/names.h  |  1 -
 include/utils.h  |  5 
 ip/ipaddress.c   |  2 +-
 ip/ipmonitor.c   |  2 +-
 ip/ipxfrm.c  | 11 +++
 ip/xfrm.h|  9 --
 ip/xfrm_monitor.c|  2 +-
 lib/color.c  |  6 ++--
 lib/libnetlink.c | 28 ++---
 lib/ll_addr.c| 24 ---
 lib/ll_map.c |  2 +-
 lib/names.c  | 28 -
 lib/utils.c  | 48 +-
 misc/ss.c| 28 -
 rdma/rdma.h  | 11 ---
 rdma/utils.c | 12 
 tc/em_meta.c | 12 
 tc/m_action.c|  4 +--
 tc/m_ematch.c| 30 ++-
 tc/m_ematch.h|  1 -
 tc/m_pedit.c | 33 +---
 tc/m_pedit.h | 15 --
 tc/m_police.c| 10 +--
 tc/p_eth.c   |  5 ++--
 tc/p_icmp.c  |  5 ++--
 tc/p_ip.c|  5 ++--
 tc/p_ip6.c   |  5 ++--
 tc/p_tcp.c   |  5 ++--
 tc/p_udp.c   |  5 ++--
 tc/tc_class.c|  6 ++--
 tc/tc_util.c | 17 ++-
 tc/tc_util.h |  8 -
 tipc/cmdl.c  |  2 +-
 tipc/cmdl.h  |  2 --
 41 files changed, 110 insertions(+), 369 deletions(-)

-- 
2.17.1

[PATCH iproute2 02/22] lib/utils: make local functions static

2018-11-15 Thread Stephen Hemminger

Some of the print/parsing is only used internally.
Drop unused get_s8/get_s16.

Signed-off-by: Stephen Hemminger 
---
 include/utils.h |  5 -
 lib/utils.c | 48 +++-
 2 files changed, 7 insertions(+), 46 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index bf6dea23df66..1630dd0b2854 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -157,9 +157,7 @@ int get_u64(__u64 *val, const char *arg, int base);
 int get_u32(__u32 *val, const char *arg, int base);
 int get_s32(__s32 *val, const char *arg, int base);
 int get_u16(__u16 *val, const char *arg, int base);
-int get_s16(__s16 *val, const char *arg, int base);
 int get_u8(__u8 *val, const char *arg, int base);
-int get_s8(__s8 *val, const char *arg, int base);
 int get_be64(__be64 *val, const char *arg, int base);
 int get_be32(__be32 *val, const char *arg, int base);
 int get_be16(__be16 *val, const char *arg, int base);
@@ -172,7 +170,6 @@ __u8 *hexstring_a2n(const char *str, __u8 *buf, int blen, unsigned int *len);
 int addr64_n2a(__u64 addr, char *buff, size_t len);
 
 int af_bit_len(int af);
-int af_byte_len(int af);
 
 const char *format_host_r(int af, int len, const void *addr,
   char *buf, int buflen);
@@ -326,8 +323,6 @@ void drop_cap(void);
 
 int get_time(unsigned int *time, const char *str);
 int get_time64(__s64 *time, const char *str);
-void print_time(char *buf, int len, __u32 time);
-void print_time64(char *buf, int len, __s64 time);
 char *sprint_time(__u32 time, char *buf);
 char *sprint_time64(__s64 time, char *buf);
 
diff --git a/lib/utils.c b/lib/utils.c
index 345630d04929..4965a5750880 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -45,6 +45,10 @@ int timestamp_short;
 int pretty;
 const char *_SL_ = "\n";
 
+static int af_byte_len(int af);
+static void print_time(char *buf, int len, __u32 time);
+static void print_time64(char *buf, int len, __s64 time);
+
 int read_prop(const char *dev, char *prop, long *value)
 {
char fname[128], buf[80], *endp, *nl;
@@ -426,43 +430,6 @@ int get_s32(__s32 *val, const char *arg, int base)
return 0;
 }
 
-int get_s16(__s16 *val, const char *arg, int base)
-{
-   long res;
-   char *ptr;
-
-   if (!arg || !*arg)
-   return -1;
-   res = strtol(arg, &ptr, base);
-   if (!ptr || ptr == arg || *ptr)
-   return -1;
-   if ((res == LONG_MIN || res == LONG_MAX) && errno == ERANGE)
-   return -1;
-   if (res > 0x7FFF || res < -0x8000)
-   return -1;
-
-   *val = res;
-   return 0;
-}
-
-int get_s8(__s8 *val, const char *arg, int base)
-{
-   long res;
-   char *ptr;
-
-   if (!arg || !*arg)
-   return -1;
-   res = strtol(arg, &ptr, base);
-   if (!ptr || ptr == arg || *ptr)
-   return -1;
-   if ((res == LONG_MIN || res == LONG_MAX) && errno == ERANGE)
-   return -1;
-   if (res > 0x7F || res < -0x80)
-   return -1;
-   *val = res;
-   return 0;
-}
-
 int get_be64(__be64 *val, const char *arg, int base)
 {
__u64 v;
@@ -708,7 +675,7 @@ int af_bit_len(int af)
return 0;
 }
 
-int af_byte_len(int af)
+static int af_byte_len(int af)
 {
return af_bit_len(af) / 8;
 }
@@ -1710,8 +1677,7 @@ int get_time(unsigned int *time, const char *str)
return 0;
 }
 
-
-void print_time(char *buf, int len, __u32 time)
+static void print_time(char *buf, int len, __u32 time)
 {
double tmp = time;
 
@@ -1764,7 +1730,7 @@ int get_time64(__s64 *time, const char *str)
return 0;
 }
 
-void print_time64(char *buf, int len, __s64 time)
+static void print_time64(char *buf, int len, __s64 time)
 {
double nsec = time;
 
-- 
2.17.1
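
For reference, the removed get_s16/get_s8 follow the usual strtol validation pattern. The sketch below restates that pattern as a standalone function; demo_get_s16 is a hypothetical name, it uses int16_t instead of __s16, and it clears errno before the call, which the removed code did not do.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a signed 16-bit value, return 0 or -1. */
static int demo_get_s16(int16_t *val, const char *arg, int base)
{
	long res;
	char *ptr;

	if (!arg || !*arg)
		return -1;

	errno = 0;	/* not in the removed code; avoids a stale ERANGE */
	res = strtol(arg, &ptr, base);

	/* reject empty conversions and trailing garbage */
	if (ptr == arg || *ptr)
		return -1;
	if ((res == LONG_MIN || res == LONG_MAX) && errno == ERANGE)
		return -1;
	if (res > INT16_MAX || res < INT16_MIN)
		return -1;

	*val = (int16_t)res;
	return 0;
}
```

On success the parsed value is stored through val; on any failure val is left untouched and -1 is returned.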

[PATCH iproute2 01/22] lib/ll_addr: whitespace and indent cleanup

2018-11-15 Thread Stephen Hemminger

Run old ll_addr through kernel Lindent.

Signed-off-by: Stephen Hemminger 
---
 lib/ll_addr.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/lib/ll_addr.c b/lib/ll_addr.c
index 84de64e2e053..00b562aeda22 100644
--- a/lib/ll_addr.c
+++ b/lib/ll_addr.c
@@ -26,20 +26,20 @@
 #include "rt_names.h"
 #include "utils.h"
 
-
-const char *ll_addr_n2a(const unsigned char *addr, int alen, int type, char *buf, int blen)
+const char *ll_addr_n2a(const unsigned char *addr, int alen, int type,
+   char *buf, int blen)
 {
int i;
int l;
 
if (alen == 4 &&
-   (type == ARPHRD_TUNNEL || type == ARPHRD_SIT || type == ARPHRD_IPGRE)) {
+   (type == ARPHRD_TUNNEL || type == ARPHRD_SIT
+|| type == ARPHRD_IPGRE))
return inet_ntop(AF_INET, addr, buf, blen);
-   }
-   if (alen == 16 &&
-   (type == ARPHRD_TUNNEL6 || type == ARPHRD_IP6GRE)) {
+
+   if (alen == 16 && (type == ARPHRD_TUNNEL6 || type == ARPHRD_IP6GRE))
return inet_ntop(AF_INET6, addr, buf, blen);
-   }
+
snprintf(buf, blen, "%02x", addr[0]);
for (i = 1, l = 2; i < alen && l < blen; i++, l += 3)
snprintf(buf + l, blen - l, ":%02x", addr[i]);
@@ -62,7 +62,7 @@ int ll_addr_a2n(char *lladdr, int len, const char *arg)
} else {
int i;
 
-   for (i=0; i 255) {
-   fprintf(stderr, "\"%s\" is invalid lladdr.\n", arg);
+   fprintf(stderr, "\"%s\" is invalid lladdr.\n",
+   arg);
return -1;
}
lladdr[i] = temp;
@@ -82,6 +84,6 @@ int ll_addr_a2n(char *lladdr, int len, const char *arg)
break;
arg = cp;
}
-   return i+1;
+   return i + 1;
}
 }
-- 
2.17.1

[PATCH iproute2 05/22] libnetlnk: unused and local functions cleanup

2018-11-15 Thread Stephen Hemminger

rtnl_talk_extack and parse_rtattr_byindex are not used in current code.
rtnl_dump_filter_l is only used in this file.

Signed-off-by: Stephen Hemminger 
---
 include/libnetlink.h |  7 ---
 lib/libnetlink.c | 28 ++--
 2 files changed, 2 insertions(+), 33 deletions(-)

diff --git a/include/libnetlink.h b/include/libnetlink.h
index fa8de093d484..138840d5c892 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -102,8 +102,6 @@ struct rtnl_dump_filter_arg {
__u16 nc_flags;
 };
 
-int rtnl_dump_filter_l(struct rtnl_handle *rth,
- const struct rtnl_dump_filter_arg *arg);
 int rtnl_dump_filter_nc(struct rtnl_handle *rth,
rtnl_filter_t filter,
void *arg, __u16 nc_flags);
@@ -115,9 +113,6 @@ int rtnl_talk(struct rtnl_handle *rtnl, struct nlmsghdr *n,
 int rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *iovec, size_t iovlen,
  struct nlmsghdr **answer)
__attribute__((warn_unused_result));
-int rtnl_talk_extack(struct rtnl_handle *rtnl, struct nlmsghdr *n,
- struct nlmsghdr **answer, nl_ext_ack_fn_t errfn)
-   __attribute__((warn_unused_result));
 int rtnl_talk_suppress_rtnl_errmsg(struct rtnl_handle *rtnl, struct nlmsghdr *n,
   struct nlmsghdr **answer)
__attribute__((warn_unused_result));
@@ -152,8 +147,6 @@ int rta_addattr_l(struct rtattr *rta, int maxlen, int type,
 int parse_rtattr(struct rtattr *tb[], int max, struct rtattr *rta, int len);
 int parse_rtattr_flags(struct rtattr *tb[], int max, struct rtattr *rta,
  int len, unsigned short flags);
-int parse_rtattr_byindex(struct rtattr *tb[], int max,
-struct rtattr *rta, int len);
 struct rtattr *parse_rtattr_one(int type, struct rtattr *rta, int len);
 int __parse_rtattr_nested_compat(struct rtattr *tb[], int max, struct rtattr *rta, int len);
 
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index fe4a7a4b9c71..c0b80ed6fdfb 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -611,8 +611,8 @@ static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
return len;
 }
 
-int rtnl_dump_filter_l(struct rtnl_handle *rth,
-  const struct rtnl_dump_filter_arg *arg)
+static int rtnl_dump_filter_l(struct rtnl_handle *rth,
+ const struct rtnl_dump_filter_arg *arg)
 {
struct sockaddr_nl nladdr;
struct iovec iov;
@@ -877,13 +877,6 @@ int rtnl_talk_iov(struct rtnl_handle *rtnl, struct iovec *iovec, size_t iovlen,
return __rtnl_talk_iov(rtnl, iovec, iovlen, answer, true, NULL);
 }
 
-int rtnl_talk_extack(struct rtnl_handle *rtnl, struct nlmsghdr *n,
-struct nlmsghdr **answer,
-nl_ext_ack_fn_t errfn)
-{
-   return __rtnl_talk(rtnl, n, answer, true, errfn);
-}
-
 int rtnl_talk_suppress_rtnl_errmsg(struct rtnl_handle *rtnl, struct nlmsghdr *n,
   struct nlmsghdr **answer)
 {
@@ -1242,23 +1235,6 @@ int parse_rtattr_flags(struct rtattr *tb[], int max, struct rtattr *rta,
return 0;
 }
 
-int parse_rtattr_byindex(struct rtattr *tb[], int max,
-struct rtattr *rta, int len)
-{
-   int i = 0;
-
-   memset(tb, 0, sizeof(struct rtattr *) * max);
-   while (RTA_OK(rta, len)) {
-   if (rta->rta_type <= max && i < max)
-   tb[i++] = rta;
-   rta = RTA_NEXT(rta, len);
-   }
-   if (len)
-   fprintf(stderr, "!!!Deficit %d, rta_len=%d\n",
-   len, rta->rta_len);
-   return i;
-}
-
 struct rtattr *parse_rtattr_one(int type, struct rtattr *rta, int len)
 {
while (RTA_OK(rta, len)) {
-- 
2.17.1

Re: [PATCH net] sctp: not allow to set asoc prsctp_enable by sockopt

2018-11-15 Thread Marcelo Ricardo Leitner

On Thu, Nov 15, 2018 at 04:43:10PM -0500, Neil Horman wrote:
> On Thu, Nov 15, 2018 at 03:22:21PM -0200, Marcelo Ricardo Leitner wrote:
> > On Thu, Nov 15, 2018 at 07:14:28PM +0800, Xin Long wrote:
> > > As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
> > > 
> > >This socket option allows the enabling or disabling of the
> > >negotiation of PR-SCTP support for future associations.  For existing
> > >associations, it allows one to query whether or not PR-SCTP support
> > >was negotiated on a particular association.
> > > 
> > > It means only sctp sock's prsctp_enable can be set.
> > > 
> > > Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
> > > add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
> > > sctp in another patchset.
> > > 
> > > Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
> > > Reported-by: Ying Xu 
> > > Signed-off-by: Xin Long 
> > > ---
> > >  net/sctp/socket.c | 13 +++--
> > >  1 file changed, 3 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > index 739f3e5..e9b8232 100644
> > > --- a/net/sctp/socket.c
> > > +++ b/net/sctp/socket.c
> > > @@ -3940,7 +3940,6 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
> > >   unsigned int optlen)
> > >  {
> > >   struct sctp_assoc_value params;
> > > - struct sctp_association *asoc;
> > >   int retval = -EINVAL;
> > >  
> > >   if (optlen != sizeof(params))
> > > @@ -3951,16 +3950,10 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
> > >   goto out;
> > >   }
> > >  
> > > - asoc = sctp_id2assoc(sk, params.assoc_id);
> > > - if (asoc) {
> > > - asoc->prsctp_enable = !!params.assoc_value;
> > > - } else if (!params.assoc_id) {
> > > - struct sctp_sock *sp = sctp_sk(sk);
> > > -
> > > - sp->ep->prsctp_enable = !!params.assoc_value;
> > > - } else {
> > > + if (sctp_style(sk, UDP) && sctp_id2assoc(sk, params.assoc_id))
> > 
> > This would allow using a non-existent assoc id on UDP-style sockets to
> > set it at the socket, which is not expected. It should be more like:
> > 
> > +   if (sctp_style(sk, UDP) && params.assoc_id)
> How do you see that to be the case? sctp_id2assoc will return NULL if an
> association isn't found, so the use of sctp_id2assoc should work just fine.

Right, it will return NULL, and because of that it won't bail out as
it should and will adjust the socket config instead.

> Just checking params.assoc_id would instead fail the setting of any association
> id that isn't 0, which I don't think is what we want at all.

I think it is.

For existing associations, we can't set this anymore because it was
already negotiated during the handshake
(sctp_process_ext_param()/SCTP_CID_FWD_TSN) and there is no way back
after that.
For non-existing assocs, they will always inherit it from the socket
value.

The question then is which semantics we want when validating the
parameter here. There are cases, such as sctp_setsockopt_delayed_ack(),
that reject an invalid assoc id used as a way to mean the socket
itself on UDP-style sockets:

	asoc = sctp_id2assoc(sk, params.sack_assoc_id);
	if (!asoc && params.sack_assoc_id && sctp_style(sk, UDP))
		return -EINVAL;

As we are returning the same error for both situations (an invalid assoc
id and setting it on an existing asoc), we don't need the asoc pointer
itself and can avoid the sctp_id2assoc() call, leading to the if() I
suggested.
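
The difference between the two checks discussed above can be sketched in user-space C. This is only a toy model: `toy_sock`, `toy_id2assoc()` and the two `rejects_*()` helpers are hypothetical stand-ins, not kernel code, and the lookup only mimics the property that matters here (an unknown id, or id 0, yields "no association").

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel socket; not the real SCTP types. */
struct toy_sock {
	bool udp_style;          /* models sctp_style(sk, UDP) */
	int  known_assoc_ids[4]; /* ids of existing associations, 0-terminated */
};

/* Models sctp_id2assoc(): true only when the id names an existing association. */
static bool toy_id2assoc(const struct toy_sock *sk, int assoc_id)
{
	for (size_t i = 0; i < 4 && sk->known_assoc_ids[i]; i++)
		if (sk->known_assoc_ids[i] == assoc_id)
			return true;
	return false;
}

/* Check as posted in the patch: reject only ids that match an existing association. */
static bool rejects_as_posted(const struct toy_sock *sk, int assoc_id)
{
	return sk->udp_style && toy_id2assoc(sk, assoc_id);
}

/* Check suggested in the review: reject every non-zero id on a UDP-style socket. */
static bool rejects_as_suggested(const struct toy_sock *sk, int assoc_id)
{
	return sk->udp_style && assoc_id != 0;
}
```

With one existing association (say id 7), both checks accept id 0 and reject id 7. They differ only for an id that matches nothing: the posted check lets it through, so the socket-level value would be changed, while the suggested check rejects it.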

  Marcelo

> 
> Neil
> 
> > 
> > >   goto out;
> > > - }
> > > +
> > > + sctp_sk(sk)->ep->prsctp_enable = !!params.assoc_value;
> > >  
> > >   retval = 0;
> > >  
> > > -- 
> > > 2.1.0
> > > 
> >

Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Eric Dumazet

On 11/15/2018 01:45 PM, Edward Cree wrote:
> 
> If napi->poll() is only handling one packet, surely GRO can't do anything
>  useful either?  (AIUI at the end of the poll the GRO lists get flushed.)

That is my point.

Adding yet another layer that will add no gain but add more waste of cpu cycles.

In fact I know many people who disable GRO in some cases because it adds a ~5%
penalty for traffic that is not aggregated.

>  Is it maybe a sign that you're just spreading over too many queues??

Not really. You also want to be able to receive more traffic if the need comes.

Most NICs share the same IRQ for one TX/RX queue, and you might have an
imbalance between TX and RX load.

Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Edward Cree

On 15/11/18 20:08, Eric Dumazet wrote:
> On 11/15/2018 10:43 AM, Edward Cree wrote:
>
>> Most of the packet isn't touched and thus won't be brought into cache.
>> Only the headers of each packet (worst-case let's say 256 bytes) will
>>  be touched during batch processing, that's 16kB.
> You assume perfect use of the caches, but part of the cache has collisions.
I assume nothing, that's why I'm running lots of tests & benchmarks.
Remember that gains from batching are not only in I$; the D$ cache is
 also going to be used for things like route lookups and netfilter
 progs, and locality for those is improved by batching.
It might be possible to use PMCs to get hard numbers on how I$ and D$
 hit & eviction rates change, idk how useful that would be.

> I am alarmed by the complexity added, for example in GRO, considering
> that we also added GRO for UDP.
This series doesn't really add complexity _in_ GRO, it's more a piece
 on the outside that's calling GRO machinery slightly differently.
Drivers which just call the existing non-list-based entry points won't
 even see any of this code.

> I dunno, can you show us for example if a reassembly workload can benefit
> from all this stuff ?
Sure, I can try a UDP test with payload_size > MTU.  (I can't think of a
 way to force interleaving of fragments from different packets, though.)

> If you present numbers for traffic that GRO handles just fine, it does not
> really make sense, unless your plan maybe is to remove GRO completely ?
That's just the easiest thing to test.  It's that much harder to set up
 tests to use e.g. IP options that GRO will baulk at.  It's also not too
 easy to create traffic with the kind of flow interleaving that DDoS
 scenarios would present, as that requires something like a many-to-one
 rig with a switch and I don't have enough lab machines for such a test.
I'm not planning to remove GRO.  GRO is faster than batched receive.
Batched receive, however, works equally well for all traffic whether it's
 GRO-able or not.
Thus both are worth having.  This patch series is about using batched
 receive for packets that GRO looks at and says "no thanks".

> We have observed at Google a constant increase of cpu cycles spent for TCP_RR
> on latest kernels. The gap is now about 20% with kernels from two years ago,
> and I could not yet find a faulty commit. It seems we add little overhead after
> another, and every patch author is convinced he is doing the right thing.
>
> With multi queue NICS, vast majority of napi->poll() invocations handle only
> one packet.
> Unfortunately we can not really increase interrupt mitigations (ethtool -c)
> on NIC without sacrificing latencies.
At one point when I was working on the original batching patches, I tried
 making them skip batching if poll() hadn't used up the entire NAPI budget
 (as a signal that we're not BW-constrained), but it didn't seem to yield
 any benefit.  However I could try it again, or try checking the list
 length and handling packets singly if it's less than some threshold...?
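
The list-length idea floated above could look roughly like the following user-space sketch. Everything here is an assumption for illustration: `TOY_BATCH_THRESHOLD`, `toy_rx_stats` and `toy_deliver()` are hypothetical names, the threshold value is arbitrary, and the counters merely stand in for the per-packet and list-based receive paths.

```c
#include <stddef.h>

/* Hypothetical cut-off; the thread does not propose a specific value. */
#define TOY_BATCH_THRESHOLD 8

/* Counters standing in for the two receive paths. */
struct toy_rx_stats {
	unsigned int single_calls; /* per-packet deliveries */
	unsigned int batch_calls;  /* whole-list deliveries */
};

/*
 * Decide how to hand one poll's worth of packets up the stack:
 * short lists go packet by packet, longer lists go up as one batch.
 */
static void toy_deliver(size_t list_len, struct toy_rx_stats *st)
{
	if (list_len == 0)
		return;
	if (list_len < TOY_BATCH_THRESHOLD)
		st->single_calls += (unsigned int)list_len;
	else
		st->batch_calls += 1;
}
```

A poll that yields a single packet takes the per-packet path, so the common one-packet case would not pay for list handling, while a full 64-packet poll still goes up as one batch.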

If napi->poll() is only handling one packet, surely GRO can't do anything
 useful either?  (AIUI at the end of the poll the GRO lists get flushed.)
 Is it maybe a sign that you're just spreading over too many queues??

-Ed

Re: [PATCH net] sctp: not allow to set asoc prsctp_enable by sockopt

2018-11-15 Thread Neil Horman

On Thu, Nov 15, 2018 at 03:22:21PM -0200, Marcelo Ricardo Leitner wrote:
> On Thu, Nov 15, 2018 at 07:14:28PM +0800, Xin Long wrote:
> > As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
> > 
> >This socket option allows the enabling or disabling of the
> >negotiation of PR-SCTP support for future associations.  For existing
> >associations, it allows one to query whether or not PR-SCTP support
> >was negotiated on a particular association.
> > 
> > It means only sctp sock's prsctp_enable can be set.
> > 
> > Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
> > add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
> > sctp in another patchset.
> > 
> > Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
> > Reported-by: Ying Xu 
> > Signed-off-by: Xin Long 
> > ---
> >  net/sctp/socket.c | 13 +++--
> >  1 file changed, 3 insertions(+), 10 deletions(-)
> > 
> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > index 739f3e5..e9b8232 100644
> > --- a/net/sctp/socket.c
> > +++ b/net/sctp/socket.c
> > @@ -3940,7 +3940,6 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
> > unsigned int optlen)
> >  {
> > struct sctp_assoc_value params;
> > -   struct sctp_association *asoc;
> > int retval = -EINVAL;
> >  
> > if (optlen != sizeof(params))
> > @@ -3951,16 +3950,10 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
> > goto out;
> > }
> >  
> > -   asoc = sctp_id2assoc(sk, params.assoc_id);
> > -   if (asoc) {
> > -   asoc->prsctp_enable = !!params.assoc_value;
> > -   } else if (!params.assoc_id) {
> > -   struct sctp_sock *sp = sctp_sk(sk);
> > -
> > -   sp->ep->prsctp_enable = !!params.assoc_value;
> > -   } else {
> > +   if (sctp_style(sk, UDP) && sctp_id2assoc(sk, params.assoc_id))
> 
> This would allow using a non-existent assoc id on UDP-style sockets to
> set it at the socket, which is not expected. It should be more like:
> 
> + if (sctp_style(sk, UDP) && params.assoc_id)
How do you see that to be the case? sctp_id2assoc will return NULL if an
association isn't found, so the use of sctp_id2assoc should work just fine.
Just checking params.assoc_id would instead fail the setting of any association
id that isn't 0, which I don't think is what we want at all.

Neil

> 
> > goto out;
> > -   }
> > +
> > +   sctp_sk(sk)->ep->prsctp_enable = !!params.assoc_value;
> >  
> > retval = 0;
> >  
> > -- 
> > 2.1.0
> > 
>

Re: DSA support for Marvell 88e6065 switch

2018-11-15 Thread Andrew Lunn

On Thu, Nov 15, 2018 at 08:51:11PM +0100, Pavel Machek wrote:
> Hi!
> 
> I'm trying to create support for Marvell 88e6065 switch... and it
> seems like drivers/net/dsa supports everything, but this model.
> 
> Did someone work with this hardware before? Any idea if it would be
> more suitable to support by existing 88e6060 code, or if 88e6xxx code
> should serve as a base?

Hi Pavel

The 88e6xxx should be extended to support this. I think you will find
a lot of the building blocks are already in the driver. Compare the
various implementations of the functions in the mv88e6xxx_ops to what
the datasheet says for the registers, and pick those that match.

Andrew

Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Eric Dumazet

On 11/15/2018 10:43 AM, Edward Cree wrote:

> Most of the packet isn't touched and thus won't be brought into cache.
> Only the headers of each packet (worst-case let's say 256 bytes) will
>  be touched during batch processing, that's 16kB.

You assume perfect use of the caches, but part of the cache has collisions.

I am alarmed by the complexity added, for example in GRO, considering
that we also added GRO for UDP.

I dunno, can you show us for example if a reassembly workload can benefit
from all this stuff ?

Paolo Abeni will surely be interested in knowing whether we can get a 20%
increase for these IP defrag workloads.

If you present numbers for traffic that GRO handles just fine, it does not
really make sense, unless your plan maybe is to remove GRO completely ?

We have observed at Google a constant increase of cpu cycles spent for TCP_RR
on latest kernels. The gap is now about 20% with kernels from two years ago,
and I could not yet find a faulty commit. It seems we add little overhead after
another, and every patch author is convinced he is doing the right thing.

With multi queue NICS, vast majority of napi->poll() invocations handle only
one packet.
Unfortunately we can not really increase interrupt mitigations (ethtool -c)
on NIC without sacrificing latencies.

DSA support for Marvell 88e6065 switch

2018-11-15 Thread Pavel Machek

Hi!

I'm trying to create support for Marvell 88e6065 switch... and it
seems like drivers/net/dsa supports everything, but this model.

Did someone work with this hardware before? Any idea if it would be
more suitable to support by existing 88e6060 code, or if 88e6xxx code
should serve as a base?

Thanks,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-15 Thread David Ahern

On 11/13/18 8:48 AM, Xin Long wrote:
> There is no need to hold dst before calling rt6_remove_exception_rt().
> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
> which has been removed in Commit 93531c674315 ("net/ipv6: separate
> handling of FIB entries from dst based routes"). Otherwise, it will
> cause a dst leak.
> 
> This patch is to simply remove the dst_hold_safe() call before calling
> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
> It's safe, because the removal of the exception that holds its dst's
> refcnt is protected by rt6_exception_lock.
> 
> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
> Reported-by: Li Shuang 
> Signed-off-by: Xin Long 
> ---
>  net/ipv6/route.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 

OK, I see now. Commit ad65a2f05695 added the dst_hold_safe() call along with
ip6_del_rt(). ip6_del_rt() called ip6_rt_put() to release the reference taken
by the hold_safe. Those paths are gone now.

Reviewed-by: David Ahern
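
The reference pairing described in this review can be illustrated with a toy refcount model. This is a sketch under stated assumptions, not the kernel code: `toy_dst` and the `toy_*()` helpers are hypothetical, and they only model the point that a hold is harmless when the removal routine drops a reference and leaks when it does not.

```c
/* Hypothetical stand-in for a dst entry with a plain integer refcount. */
struct toy_dst {
	int refcnt;
};

static void toy_hold(struct toy_dst *d) { d->refcnt++; }
static void toy_put(struct toy_dst *d)  { d->refcnt--; }

/* Old-style removal: drops the reference the caller took beforehand. */
static void toy_del_consuming(struct toy_dst *d)
{
	toy_put(d);
}

/* New-style removal: leaves the caller's reference untouched. */
static void toy_remove_nonconsuming(struct toy_dst *d)
{
	(void)d;
}

/* Hold paired with a consuming removal: refcount ends where it started. */
static int toy_old_path(struct toy_dst *d)
{
	toy_hold(d);
	toy_del_consuming(d);
	return d->refcnt;
}

/* Hold kept, but removal no longer consumes it: one reference leaks. */
static int toy_leaky_path(struct toy_dst *d)
{
	toy_hold(d);
	toy_remove_nonconsuming(d);
	return d->refcnt;
}

/* Hold dropped along with the consuming removal: balanced again. */
static int toy_fixed_path(struct toy_dst *d)
{
	toy_remove_nonconsuming(d);
	return d->refcnt;
}
```

Starting from a refcount of 1, the old and fixed paths both end at 1, while the leaky path ends at 2. A count that never returns to its starting value is the kind of imbalance behind the "waiting for br0 to become free" warning reported in this thread.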

Re: [PATCH net] net_sched: sch_fq: ensure maxrate fq parameter applies to EDT flows

2018-11-15 Thread David Miller

From: Eric Dumazet 
Date: Mon, 12 Nov 2018 16:17:16 -0800

> When EDT conversion happened, fq lost the ability to enforce a maxrate
> for all flows. It kept it for non EDT flows.
> 
> This commit restores the functionality.
> 
> Tested:
> 
> tc qd replace dev eth0 root fq maxrate 500Mbit
> netperf -P0 -H host -- -O THROUGHPUT
> 489.75
> 
> Fixes: ab408b6dc744 ("tcp: switch tcp and sch_fq to new earliest departure time model")
> Signed-off-by: Eric Dumazet 

Applied, thanks Eric.

Re: [net-next PATCH v4] net: sched: cls_flower: Classify packets using port ranges

2018-11-15 Thread David Miller

From: Amritha Nambiar 
Date: Mon, 12 Nov 2018 16:15:55 -0800

> Added support in tc flower for filtering based on port ranges.
> 
> Example:
 ...
> v4:
> 1. Added condition before setting port key.
> 2. Organized setting and dumping port range keys into functions
>and added validation of input range.
> 
> v3:
> 1. Moved new fields in UAPI enum to the end of enum.
> 2. Removed couple of empty lines.
> 
> v2:
> Addressed Jiri's comments:
> 1. Added separate functions for dst and src comparisons.
> 2. Removed endpoint enum.
> 3. Added new bit TCA_FLOWER_FLAGS_RANGE to decide normal/range
>   lookup.
> 4. Cleaned up fl_lookup function.
> 
> Signed-off-by: Amritha Nambiar 

Applied.

Re: [Patch net-next v2] net: dump more useful information in netdev_rx_csum_fault()

2018-11-15 Thread David Miller

From: Cong Wang 
Date: Mon, 12 Nov 2018 14:47:18 -0800

> Currently netdev_rx_csum_fault() only shows a device name,
> we need more information about the skb for debugging csum
> failures.
> 
> Sample output:
> 
>  ens3: hw csum failure
>  dev features: 0x00014b89
>  skb len=84 data_len=0 pkt_type=0 gso_size=0 gso_type=0 nr_frags=0 ip_summed=0 csum=0 csum_complete_sw=0 csum_valid=0 csum_level=0
> 
> Note, I use pr_err() just to be consistent with the existing one.
> 
> Signed-off-by: Cong Wang 

Applied, thanks Cong.

Re: [PATCHv2 net-net] net: dsa: mv88e6xxx: Work around mv886e6161 SERDES missing MII_PHYSID2

2018-11-15 Thread David Miller

From: Andrew Lunn 
Date: Mon, 12 Nov 2018 18:51:01 +0100

> We already have a workaround for a couple of switches whose internal
> PHYs only have the Marvell OUI, but no model number. We detect such
> PHYs and give them the 6390 ID as the model number. However the
> mv88e6161 has two SERDES interfaces in the same address range as its
> internal PHYs. These suffer from the same problem, the Marvell OUI,
> but no model number. As a result, these SERDES interfaces were getting
> the same PHY ID as the mv88e6390, even though they are not PHYs, and
> the Marvell PHY driver was trying to drive them.
> 
> Add a special case to stop this from happening.
> 
> Reported-by: Chris Healy 
> Signed-off-by: Andrew Lunn 

Applied to net-next.

[PATCH iproute2] testsuite: drop old kernel configs

2018-11-15 Thread Stephen Hemminger

The testsuite directory had a directory of ancient kernel configs.

Signed-off-by: Stephen Hemminger 
---
 testsuite/configs/all-2.4|  848 -
 testsuite/configs/all-no-act | 1499 -
 testsuite/configs/all-police-act | 1504 --
 3 files changed, 3851 deletions(-)
 delete mode 100644 testsuite/configs/all-2.4
 delete mode 100644 testsuite/configs/all-no-act
 delete mode 100644 testsuite/configs/all-police-act

diff --git a/testsuite/configs/all-2.4 b/testsuite/configs/all-2.4
deleted file mode 100644
index cc4131c2313f..
--- a/testsuite/configs/all-2.4
+++ /dev/null
@@ -1,848 +0,0 @@
-#
-# Automatically generated by make menuconfig: don't edit
-#
-CONFIG_X86=y
-# CONFIG_SBUS is not set
-CONFIG_UID16=y
-
-#
-# Code maturity level options
-#
-CONFIG_EXPERIMENTAL=y
-
-#
-# Loadable module support
-#
-CONFIG_MODULES=y
-CONFIG_MODVERSIONS=y
-CONFIG_KMOD=y
-
-#
-# Processor type and features
-#
-# CONFIG_M386 is not set
-# CONFIG_M486 is not set
-# CONFIG_M586 is not set
-# CONFIG_M586TSC is not set
-# CONFIG_M586MMX is not set
-# CONFIG_M686 is not set
-# CONFIG_MPENTIUMIII is not set
-CONFIG_MPENTIUM4=y
-# CONFIG_MK6 is not set
-# CONFIG_MK7 is not set
-# CONFIG_MK8 is not set
-# CONFIG_MELAN is not set
-# CONFIG_MCRUSOE is not set
-# CONFIG_MWINCHIPC6 is not set
-# CONFIG_MWINCHIP2 is not set
-# CONFIG_MWINCHIP3D is not set
-# CONFIG_MCYRIXIII is not set
-# CONFIG_MVIAC3_2 is not set
-CONFIG_X86_WP_WORKS_OK=y
-CONFIG_X86_INVLPG=y
-CONFIG_X86_CMPXCHG=y
-CONFIG_X86_XADD=y
-CONFIG_X86_BSWAP=y
-CONFIG_X86_POPAD_OK=y
-# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
-CONFIG_RWSEM_XCHGADD_ALGORITHM=y
-CONFIG_X86_L1_CACHE_SHIFT=7
-CONFIG_X86_HAS_TSC=y
-CONFIG_X86_GOOD_APIC=y
-CONFIG_X86_PGE=y
-CONFIG_X86_USE_PPRO_CHECKSUM=y
-CONFIG_X86_F00F_WORKS_OK=y
-CONFIG_X86_MCE=y
-# CONFIG_TOSHIBA is not set
-# CONFIG_I8K is not set
-# CONFIG_MICROCODE is not set
-# CONFIG_X86_MSR is not set
-# CONFIG_X86_CPUID is not set
-# CONFIG_EDD is not set
-CONFIG_NOHIGHMEM=y
-# CONFIG_HIGHMEM4G is not set
-# CONFIG_HIGHMEM64G is not set
-# CONFIG_HIGHMEM is not set
-# CONFIG_MATH_EMULATION is not set
-# CONFIG_MTRR is not set
-CONFIG_SMP=y
-CONFIG_NR_CPUS=32
-# CONFIG_X86_NUMA is not set
-# CONFIG_X86_TSC_DISABLE is not set
-CONFIG_X86_TSC=y
-CONFIG_HAVE_DEC_LOCK=y
-
-#
-# General setup
-#
-CONFIG_NET=y
-CONFIG_X86_IO_APIC=y
-CONFIG_X86_LOCAL_APIC=y
-CONFIG_PCI=y
-# CONFIG_PCI_GOBIOS is not set
-# CONFIG_PCI_GODIRECT is not set
-CONFIG_PCI_GOANY=y
-CONFIG_PCI_BIOS=y
-CONFIG_PCI_DIRECT=y
-CONFIG_ISA=y
-CONFIG_PCI_NAMES=y
-# CONFIG_EISA is not set
-# CONFIG_MCA is not set
-CONFIG_HOTPLUG=y
-
-#
-# PCMCIA/CardBus support
-#
-CONFIG_PCMCIA=y
-CONFIG_CARDBUS=y
-# CONFIG_TCIC is not set
-# CONFIG_I82092 is not set
-# CONFIG_I82365 is not set
-
-#
-# PCI Hotplug Support
-#
-# CONFIG_HOTPLUG_PCI is not set
-# CONFIG_HOTPLUG_PCI_COMPAQ is not set
-# CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM is not set
-# CONFIG_HOTPLUG_PCI_IBM is not set
-# CONFIG_HOTPLUG_PCI_SHPC is not set
-# CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE is not set
-# CONFIG_HOTPLUG_PCI_SHPC_PHPRM_LEGACY is not set
-# CONFIG_HOTPLUG_PCI_PCIE is not set
-# CONFIG_HOTPLUG_PCI_PCIE_POLL_EVENT_MODE is not set
-CONFIG_SYSVIPC=y
-# CONFIG_BSD_PROCESS_ACCT is not set
-CONFIG_SYSCTL=y
-CONFIG_KCORE_ELF=y
-# CONFIG_KCORE_AOUT is not set
-CONFIG_BINFMT_AOUT=y
-CONFIG_BINFMT_ELF=y
-CONFIG_BINFMT_MISC=y
-# CONFIG_OOM_KILLER is not set
-CONFIG_PM=y
-# CONFIG_APM is not set
-
-#
-# ACPI Support
-#
-# CONFIG_ACPI is not set
-CONFIG_ACPI_BOOT=y
-
-#
-# Memory Technology Devices (MTD)
-#
-# CONFIG_MTD is not set
-
-#
-# Parallel port support
-#
-# CONFIG_PARPORT is not set
-
-#
-# Plug and Play configuration
-#
-CONFIG_PNP=y
-CONFIG_ISAPNP=y
-
-#
-# Block devices
-#
-CONFIG_BLK_DEV_FD=y
-# CONFIG_BLK_DEV_XD is not set
-# CONFIG_PARIDE is not set
-# CONFIG_BLK_CPQ_DA is not set
-# CONFIG_BLK_CPQ_CISS_DA is not set
-# CONFIG_CISS_SCSI_TAPE is not set
-# CONFIG_CISS_MONITOR_THREAD is not set
-# CONFIG_BLK_DEV_DAC960 is not set
-# CONFIG_BLK_DEV_UMEM is not set
-# CONFIG_BLK_DEV_SX8 is not set
-# CONFIG_BLK_DEV_LOOP is not set
-# CONFIG_BLK_DEV_NBD is not set
-# CONFIG_BLK_DEV_RAM is not set
-# CONFIG_BLK_DEV_INITRD is not set
-# CONFIG_BLK_STATS is not set
-
-#
-# Multi-device support (RAID and LVM)
-#
-# CONFIG_MD is not set
-# CONFIG_BLK_DEV_MD is not set
-# CONFIG_MD_LINEAR is not set
-# CONFIG_MD_RAID0 is not set
-# CONFIG_MD_RAID1 is not set
-# CONFIG_MD_RAID5 is not set
-# CONFIG_MD_MULTIPATH is not set
-# CONFIG_BLK_DEV_LVM is not set
-
-#
-# Networking options
-#
-CONFIG_PACKET=y
-# CONFIG_PACKET_MMAP is not set
-# CONFIG_NETLINK_DEV is not set
-# CONFIG_NETFILTER is not set
-# CONFIG_FILTER is not set
-CONFIG_UNIX=y
-CONFIG_INET=y
-CONFIG_IP_MULTICAST=y
-CONFIG_IP_ADVANCED_ROUTER=y
-CONFIG_IP_MULTIPLE_TABLES=y
-CONFIG_IP_ROUTE_NAT=y
-CONFIG_IP_ROUTE_MULTIPATH=y
-CONFIG_IP_ROUTE_TOS=y
-# CONFIG_IP_ROUTE_VERBOSE is not

[PATCH iproute2-next] drop support for IPX

2018-11-15 Thread Stephen Hemminger

IPX has been deprecated and then removed from upstream kernels.
Drop support from ip route as well.

Signed-off-by: Stephen Hemminger 
---
 Makefile   |  3 --
 include/utils.h| 10 -
 ip/ip.c|  4 +-
 ip/iproute.c   |  2 +-
 lib/ipx_ntop.c | 71 ---
 lib/ipx_pton.c | 97 --
 lib/utils.c|  2 -
 man/man8/ip-route.8.in |  2 +-
 man/man8/ip.8  |  9 +---
 9 files changed, 5 insertions(+), 195 deletions(-)
 delete mode 100644 lib/ipx_ntop.c
 delete mode 100644 lib/ipx_pton.c

diff --git a/Makefile b/Makefile
index b7488addc6f2..7d62468c6638 100644
--- a/Makefile
+++ b/Makefile
@@ -43,9 +43,6 @@ DEFINES+=-DCONFDIR=\"$(CONFDIR)\" \
 #options for decnet
 ADDLIB+=dnet_ntop.o dnet_pton.o
 
-#options for ipx
-ADDLIB+=ipx_ntop.o ipx_pton.o
-
 #options for mpls
 ADDLIB+=mpls_ntop.o mpls_pton.o
 
diff --git a/include/utils.h b/include/utils.h
index bf6dea23df66..12c003c874c4 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -116,13 +116,6 @@ struct dn_naddr
 unsigned char a_addr[DN_MAXADDL];
 };
 
-#define IPX_NODE_LEN 6
-
-struct ipx_addr {
-   u_int32_t ipx_net;
-   u_int8_t  ipx_node[IPX_NODE_LEN];
-};
-
 #ifndef AF_MPLS
 # define AF_MPLS 28
 #endif
@@ -207,9 +200,6 @@ int inet_addr_match_rta(const inet_prefix *m, const struct rtattr *rta);
 const char *dnet_ntop(int af, const void *addr, char *str, size_t len);
 int dnet_pton(int af, const char *src, void *addr);
 
-const char *ipx_ntop(int af, const void *addr, char *str, size_t len);
-int ipx_pton(int af, const char *src, void *addr);
-
 const char *mpls_ntop(int af, const void *addr, char *str, size_t len);
 int mpls_pton(int af, const char *src, void *addr, size_t alen);
 
diff --git a/ip/ip.c b/ip/ip.c
index c324120f9fc5..11dbed72842f 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -53,7 +53,7 @@ static void usage(void)
 "   vrf | sr }\n"
 "   OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "-h[uman-readable] | -iec | -j[son] | -p[retty] |\n"
-"-f[amily] { inet | inet6 | ipx | dnet | mpls | bridge | link } |\n"
+"-f[amily] { inet | inet6 | dnet | mpls | bridge | link } |\n"
 "-4 | -6 | -I | -D | -M | -B | -0 |\n"
 "-l[oops] { maximum-addr-flush-attempts } | -br[ief] |\n"
 "-o[neline] | -t[imestamp] | -ts[hort] | -b[atch] [filename] |\n"
@@ -225,8 +225,6 @@ int main(int argc, char **argv)
preferred_family = AF_INET6;
} else if (strcmp(opt, "-0") == 0) {
preferred_family = AF_PACKET;
-   } else if (strcmp(opt, "-I") == 0) {
-   preferred_family = AF_IPX;
} else if (strcmp(opt, "-D") == 0) {
preferred_family = AF_DECnet;
} else if (strcmp(opt, "-M") == 0) {
diff --git a/ip/iproute.c b/ip/iproute.c
index b039f35b0ccd..26f7cd8915bf 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -83,7 +83,7 @@ static void usage(void)
"INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n"
"NH := [ encap ENCAPTYPE ENCAPHDR ] [ via [ FAMILY ] ADDRESS ]\n"
"   [ dev STRING ] [ weight NUMBER ] NHFLAGS\n"
-   "FAMILY := [ inet | inet6 | ipx | dnet | mpls | bridge | link ]\n"
+   "FAMILY := [ inet | inet6 | dnet | mpls | bridge | link ]\n"
"OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ] [ as [ to ] ADDRESS ]\n"
"   [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n"
"   [ window NUMBER ] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n"
diff --git a/lib/ipx_ntop.c b/lib/ipx_ntop.c
deleted file mode 100644
index 80b8a34e1a70..
--- a/lib/ipx_ntop.c
+++ /dev/null
@@ -1,71 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include 
-#include 
-#include 
-#include 
-
-#include "utils.h"
-
-static __inline__ int do_digit(char *str, u_int32_t addr, u_int32_t scale, size_t *pos, size_t len)
-{
-   u_int32_t tmp = addr >> (scale * 4);
-
-   if (*pos == len)
-   return 1;
-
-   tmp &= 0x0f;
-   if (tmp > 9)
-   *str = tmp + 'A' - 10;
-   else
-   *str = tmp + '0';
-   (*pos)++;
-
-   return 0;
-}
-
-static const char *ipx_ntop1(const struct ipx_addr *addr, char *str, size_t len)
-{
-   int i;
-   size_t pos = 0;
-
-   if (len == 0)
-   return str;
-
-   for(i = 7; i >= 0; i--)
-   if (do_digit(str + pos, ntohl(addr->ipx_net), i, &pos, len))
-   return str;
-
-   if (pos == len)
-   return str;
-
-   *(str + pos) = '.';
-   pos++;
-
-   for(i = 0; i < 6; i++) {
-   if (do_digit(str + pos, addr->ipx_node[i], 1, &pos, len))
-

Re: [net-next 00/11][pull request] 100GbE Intel Wired LAN Driver Updates 2018-11-13

2018-11-15 Thread David Miller

From: Jeff Kirsher 
Date: Tue, 13 Nov 2018 10:32:28 -0800

> This series contains updates to the ice driver only.

Pulled, thanks Jeff.

Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-15 Thread Mika Penttilä


On 15.11.2018 20.17, David Ahern wrote:
> On 11/14/18 11:23 PM, Xin Long wrote:
>> On Thu, Nov 15, 2018 at 3:33 PM David Ahern  wrote:
>>> On 11/14/18 11:03 AM, David Ahern wrote:
>>>> On 11/13/18 8:48 AM, Xin Long wrote:
>>>>> There is no need to hold dst before calling rt6_remove_exception_rt().
>>>>> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
>>>>> which has been removed in Commit 93531c674315 ("net/ipv6: separate
>>>>> handling of FIB entries from dst based routes"). Otherwise, it will
>>>>> cause a dst leak.
>>>>>
>>>>> This patch is to simply remove the dst_hold_safe() call before calling
>>>>> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
>>>>> It's safe, because the removal of the exception that holds its dst's
>>>>> refcnt is protected by rt6_exception_lock.
>>>>>
>>>>> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
>>>>> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
>>>>> Reported-by: Li Shuang
>>>>> Signed-off-by: Xin Long
>>>>> ---
>>>>>  net/ipv6/route.c | 7 +++
>>>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>> was this problem actually hit or is this patch based on a code analysis?

>>> I ask because I have not been able to reproduce the leak using existing
>>> tests (e.g., pmtu) that I know create exceptions.
>>>
>>> If this problem was hit, it would be good to get a test case for it.
>> The attachment is the ip6_dst.sh with IPVS.
>>
>> # sh ip6_dst.sh
>>
>> But this one triggers the kernel warnings caused by 2 places:
>>unregister_netdevice: waiting for br0 to become free. Usage count = 3
>>
>> 1. one is IPVS, I just posted the fix:
>> https://patchwork.ozlabs.org/patch/998123/  [1]
>> 2. the other one is IPv6,
>> ip6_link_failure() will be hit.
>>
>> So to make this reproduce clearly, you may want to apply
>> patch [1] firstly.
>>
> Thanks for the script. It does not replicate the problem using net-next
> tree as of
>
> commit 6d5db6c37929cb0a84e64ba0590a74593e5ce3b8
> Merge: 15cef30974c5 bd3b5d462add
> Author: David S. Miller 
> Date:   Wed Nov 14 08:51:28 2018 -0800
>
> Merge branch 'nfp-abm-track-all-Qdiscs'
>
>
> I would be really surprised if the fib6_info change introduced a need to
> change the dst hold's for exception routes. I am not seeing the
> connection, so I really want to see it reproduced.
>
> Thanks


Maybe it's not 100% reproducer then, but I think the fix is obviously right.

--Mika

Re: pending-fixes build: 1 failures 5 warnings (v4.20-rc2-261-g59a9ebee0952)

2018-11-15 Thread Mark Brown

On Thu, Nov 15, 2018 at 05:00:30PM +, Build bot for Mark Brown wrote:

Today's pending-fixes branch fails to build an arm allmodconfig due to:

>   arm-allmodconfig
> ../drivers/net/ethernet/qlogic/qed/qed_rdma.h:186:79: error: expected ';' before '}' token
> ../drivers/net/ethernet/qlogic/qed/qed_rdma.h:186:79: error: expected ';' before '}' token
> ../drivers/net/ethernet/qlogic/qed/qed_rdma.h:186:79: error: expected ';' before '}' token
> ../drivers/net/ethernet/qlogic/qed/qed_rdma.h:186:79: error: expected ';' before '}' token
> ../drivers/net/ethernet/qlogic/qed/qed_rdma.h:186:79: error: expected ';' before '}' token

caused by 291d57f67d2449737 (qed: Fix rdma_info structure allocation) -
there's a simple typo in the !QED_RDMA stubs that were added.


Re: [PATCH v3 net-next 0/4] net: batched receive in GRO path

2018-11-15 Thread Edward Cree

On 15/11/18 07:22, Eric Dumazet wrote:
> On 11/14/2018 10:07 AM, Edward Cree wrote:
>> Conclusion:
>> * TCP b/w is 16.5% faster for traffic which cannot be coalesced by GRO.
> But only for traffic that actually was perfect GRO candidate, right ?
>
> Now what happens if all the packets you are batching are hitting different
> TCP sockets ?
The batch is already split up by the time it hits TCP sockets; batching
 currently only goes as far as ip_sublist_rcv_finish() which calls
 dst_input(skb) in a loop.  So as long as the packets are all for the
 same dst IP, we should get all of this gain.
If the packets have different dst IP addresses then we split the batch
 slightly earlier, in ip_list_rcv_finish(), but that won't make very
 much difference, I expect we'll still get most of this gain.  There is
 a lot of the stack (layer 2 stuff, taps, etc.) that we still traverse
 as a batch.

> By the time we build a list of 64 packets, the first packets in the list won't
> be in L1 cache anymore (32 KB 8-way associative typically), and we will probably
> have cache thrashing.
Most of the packet isn't touched and thus won't be brought into cache.
Only the headers of each packet (worst-case let's say 256 bytes) will
 be touched during batch processing, that's 16kB.  And not all at once
 i.e. by the time we touch the later cachelines of a packet we'll be
 done with the earlier ones except maybe in cases where GRO decides
 very late on that it can't coalesce.
And since the alternative is thrashing of the I$ cache, I don't think
 there's an a priori argument that this will hurt — and my tests seem
 to indicate that it's OK and that we gain more from better I$ usage
 than we lose from worse D$ usage patterns.
If you think there are cases in which the latter will dominate, please
 suggest some tests that will embody them; I'm happy to keep running
 experiments.  Also you could come up with an analogue of patch #2 for
 whatever HW you have (it shouldn't be difficult) allowing you to run
 your own tests (e.g. if you have bigger/more powerful test rigs than
 I have access to ;-)
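
The cache-footprint arithmetic in this reply can be written out as a small C sketch. The helper names are hypothetical; the figures (64 packets per batch, 256 header bytes per packet, a 32 KB L1) are the ones quoted in the thread. The model deliberately ignores set-associativity conflicts, which is the objection raised in the quoted text, so it gives a lower bound on the footprint and not a prediction of hit rates.

```c
#include <stddef.h>

/* Bytes of packet headers touched while one batch is processed. */
static size_t toy_header_footprint(size_t batch_len, size_t header_bytes)
{
	return batch_len * header_bytes;
}

/* Nonzero when the footprint is no larger than a cache of the given size. */
static int toy_fits_in_cache(size_t footprint, size_t cache_bytes)
{
	return footprint <= cache_bytes;
}
```

For 64 packets at 256 bytes each the footprint is 16384 bytes (16 kB), half of a 32 KB L1 data cache, which matches the figure given in the reply.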

-Ed

Re: [PATCH net] ipv6: fix a dst leak when removing its exception

2018-11-15 Thread David Ahern

On 11/14/18 11:23 PM, Xin Long wrote:
> On Thu, Nov 15, 2018 at 3:33 PM David Ahern  wrote:
>>
>> On 11/14/18 11:03 AM, David Ahern wrote:
>>> On 11/13/18 8:48 AM, Xin Long wrote:
>>>> There is no need to hold dst before calling rt6_remove_exception_rt().
>>>> The call to dst_hold_safe() in ip6_link_failure() was for ip6_del_rt(),
>>>> which has been removed in Commit 93531c674315 ("net/ipv6: separate
>>>> handling of FIB entries from dst based routes"). Otherwise, it will
>>>> cause a dst leak.
>>>>
>>>> This patch is to simply remove the dst_hold_safe() call before calling
>>>> rt6_remove_exception_rt() and also do the same in ip6_del_cached_rt().
>>>> It's safe, because the removal of the exception that holds its dst's
>>>> refcnt is protected by rt6_exception_lock.
>>>>
>>>> Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes")
>>>> Fixes: 23fb93a4d3f1 ("net/ipv6: Cleanup exception and cache route handling")
>>>> Reported-by: Li Shuang
>>>> Signed-off-by: Xin Long
>>>> ---
>>>>  net/ipv6/route.c | 7 +++
>>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> was this problem actually hit or is this patch based on a code analysis?
>>>
>>
>> I ask because I have not been able to reproduce the leak using existing
>> tests (e.g., pmtu) that I know create exceptions.
>>
>> If this problem was hit, it would be good to get a test case for it.
> The attachment is the ip6_dst.sh with IPVS.
> 
> # sh ip6_dst.sh
> 
> But this one triggers the kernel warnings caused by 2 places:
>unregister_netdevice: waiting for br0 to become free. Usage count = 3
> 
> 1. one is IPVS, I just posted the fix:
> https://patchwork.ozlabs.org/patch/998123/  [1]
> 2. the other one is IPv6,
> ip6_link_failure() will be hit.
> 
> So to reproduce this cleanly, you may want to apply
> patch [1] first.
> 

Thanks for the script. It does not replicate the problem using net-next
tree as of

commit 6d5db6c37929cb0a84e64ba0590a74593e5ce3b8
Merge: 15cef30974c5 bd3b5d462add
Author: David S. Miller 
Date:   Wed Nov 14 08:51:28 2018 -0800

Merge branch 'nfp-abm-track-all-Qdiscs'


I would be really surprised if the fib6_info change introduced a need to
change the dst holds for exception routes. I am not seeing the
connection, so I really want to see it reproduced.

Thanks

Re: [PATCH net V2] cxgb4: fix thermal zone build error

2018-11-15 Thread Randy Dunlap

On 11/15/18 2:06 AM, Ganesh Goudar wrote:
> with CONFIG_THERMAL=m and cxgb4 as built-in build fails, and
> 'commit e70a57fa59bb ("cxgb4: fix thermal configuration dependencies")'
> tries to fix it but when cxgb4i is made built-in build fails again,
> use IS_REACHABLE instead of IS_ENABLED to fix the issue.
> 
> Fixes: e70a57fa59bb (cxgb4: fix thermal configuration dependencies)
> Reported-by: Randy Dunlap 
> Signed-off-by: Ganesh Goudar 

Acked-by: Randy Dunlap 

Thanks.

> ---
> V2: Fixing spelling mistake and avoid preprocessor conditionals.
> ---
>  drivers/net/ethernet/chelsio/Kconfig| 1 -
>  drivers/net/ethernet/chelsio/cxgb4/Makefile | 4 +---
>  drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 4 ++--
>  3 files changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/chelsio/Kconfig b/drivers/net/ethernet/chelsio/Kconfig
> index 75c1c5e..e2cdfa7 100644
> --- a/drivers/net/ethernet/chelsio/Kconfig
> +++ b/drivers/net/ethernet/chelsio/Kconfig
> @@ -67,7 +67,6 @@ config CHELSIO_T3
>  config CHELSIO_T4
>   tristate "Chelsio Communications T4/T5/T6 Ethernet support"
>   depends on PCI && (IPV6 || IPV6=n)
> - depends on THERMAL || !THERMAL
>   select FW_LOADER
>   select MDIO
>   select ZLIB_DEFLATE
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
> index 78e5d17..91d8a88 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
> +++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
> @@ -12,6 +12,4 @@ cxgb4-objs := cxgb4_main.o l2t.o smt.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o
>  cxgb4-$(CONFIG_CHELSIO_T4_DCB) +=  cxgb4_dcb.o
>  cxgb4-$(CONFIG_CHELSIO_T4_FCOE) +=  cxgb4_fcoe.o
>  cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
> -ifdef CONFIG_THERMAL
> -cxgb4-objs += cxgb4_thermal.o
> -endif
> +cxgb4-$(CONFIG_THERMAL) += cxgb4_thermal.o
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index 05a4692..d49db46 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -5863,7 +5863,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>   if (!is_t4(adapter->params.chip))
>   cxgb4_ptp_init(adapter);
>  
> - if (IS_ENABLED(CONFIG_THERMAL) &&
> + if (IS_REACHABLE(CONFIG_THERMAL) &&
>   !is_t4(adapter->params.chip) && (adapter->flags & FW_OK))
>   cxgb4_thermal_init(adapter);
>  
> @@ -5932,7 +5932,7 @@ static void remove_one(struct pci_dev *pdev)
>  
>   if (!is_t4(adapter->params.chip))
>   cxgb4_ptp_stop(adapter);
> - if (IS_ENABLED(CONFIG_THERMAL))
> + if (IS_REACHABLE(CONFIG_THERMAL))
>   cxgb4_thermal_remove(adapter);
>  
>   /* If we allocated filters, free up state associated with any
> 


-- 
~Randy

Re: [PATCH net V2 0/5] net/smc: fixes 2018-11-12

2018-11-15 Thread David Miller

From: Ursula Braun 
Date: Thu, 15 Nov 2018 13:11:15 +0100

> v1->v2:
>do not define 8-byte alignment for union smcd_cdc_cursor in
>patch 4/5 "net/smc: atomic SMCD cursor handling"

This is even worse.

The atomic64_t must be properly 8 byte aligned, else it will
crash the kernel when an atomic operation is attempted on it.

You have a situation where your struct attributes are entirely
incompatible.  If the parent struct is __packed, you absolutely
cannot align the atomic64_t in the child structure properly.

One more time, __packed makes correctness here impossible.

I've warned strongly in the past to avoid __packed.

Now you have to untangle this mess somehow.
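
For readers following the alignment argument, here is a minimal userspace sketch of why a packed parent defeats the alignment of an 8-byte member. Everything in it is an assumption made for illustration: the `demo_*` types are hypothetical stand-ins (a plain `uint64_t` plays the role of atomic64_t), they are not the SMC structures, and `__attribute__((packed))` is the GCC/Clang spelling of what the kernel calls __packed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical child: a cursor whose 64-bit view is meant to be updated
 * atomically, so it wants an 8-byte-aligned address. */
union demo_cursor {
	uint64_t acurs;   /* stand-in for an atomic64_t */
	uint8_t raw[8];
};

/* Packed parent: packing drops all padding, so the cursor starts
 * immediately after the one-byte field. */
struct demo_packed_parent {
	uint8_t type;
	union demo_cursor cursor;
} __attribute__((packed));

/* Same layout without packing: the compiler inserts padding so the
 * cursor starts at an offset suitable for its type. */
struct demo_natural_parent {
	uint8_t type;
	union demo_cursor cursor;
};

size_t packed_cursor_offset(void)
{
	return offsetof(struct demo_packed_parent, cursor);
}

size_t natural_cursor_offset(void)
{
	return offsetof(struct demo_natural_parent, cursor);
}

/* True when an offset is a multiple of 8 bytes. */
bool offset_is_8_byte_aligned(size_t off)
{
	return off % 8 == 0;
}
```

In this sketch the packed parent places the cursor at offset 1, so no alignment request on the child can take effect while the parent stays packed, which is the incompatibility described above.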

Re: [PATCH net V2] cxgb4: fix thermal zone build error

2018-11-15 Thread David Miller

From: Ganesh Goudar 
Date: Thu, 15 Nov 2018 15:36:21 +0530

> with CONFIG_THERMAL=m and cxgb4 as built-in build fails, and
> 'commit e70a57fa59bb ("cxgb4: fix thermal configuration dependencies")'
> tries to fix it but when cxgb4i is made built-in build fails again,
> use IS_REACHABLE instead of IS_ENABLED to fix the issue.
> 
> Fixes: e70a57fa59bb (cxgb4: fix thermal configuration dependencies)
> Reported-by: Randy Dunlap 
> Signed-off-by: Ganesh Goudar 
> ---
> V2: Fixing spelling mistake and avoid preprocessor conditionals.

Applied.
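
As a rough illustration of the IS_ENABLED versus IS_REACHABLE distinction the commit message relies on, the following userspace sketch models the two checks as ordinary functions. This is an assumption-laden model, not the kernel's macro implementation: `enum tristate`, `model_is_enabled()` and `model_is_reachable()` are hypothetical names, and the `caller_is_module` flag stands for whether the code doing the check is itself built as a module.

```c
#include <stdbool.h>

/* Kconfig tristate value of an option: n, m (module) or y (built-in). */
enum tristate { TRI_N, TRI_M, TRI_Y };

/* Model of IS_ENABLED(): true when the option is either y or m. */
bool model_is_enabled(enum tristate opt)
{
	return opt == TRI_Y || opt == TRI_M;
}

/* Model of IS_REACHABLE(): true when the option is built in, or when it
 * is a module and the calling code is built as a module as well. */
bool model_is_reachable(enum tristate opt, bool caller_is_module)
{
	return opt == TRI_Y || (opt == TRI_M && caller_is_module);
}
```

With CONFIG_THERMAL=m and a built-in caller, the model gives `model_is_enabled(TRI_M)` true but `model_is_reachable(TRI_M, false)` false, which matches the failing configuration named in the commit message: the built-in driver must not reference symbols that only exist in a module.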

Re: [PATCHv2] MAINTAINERS: Replace Vince Bridgers as Altera TSE maintainer

2018-11-15 Thread David Miller

From: thor.tha...@linux.intel.com
Date: Mon, 12 Nov 2018 11:50:56 -0600

> From: Thor Thayer 
> 
> Vince has moved to a different role. Replace him as Altera
> TSE maintainer.
> 
> Signed-off-by: Thor Thayer 
> Acked-by: Vince Bridgers 
> Acked-by: Alan Tull 
> ---
> v2  Include netdev and David Miller

Applied.

Re: [PATCH net 0/6] bnxt_en: Bug fixes.

2018-11-15 Thread David Miller

From: Michael Chan 
Date: Thu, 15 Nov 2018 03:25:36 -0500

> Most of the bug fixes are related to the new 57500 chips, including some
> initialization and counter fixes, disabling RDMA support, and a
> workaround for occasional missing interrupts.  The last patch from
> Vasundhara fixes the year/month parameters for firmware coredump.

Series applied, thanks Michael.

Re: [PATCH net-next] net: phy: check for implementation of both callbacks in phy_drv_supports_irq

2018-11-15 Thread David Miller

From: Heiner Kallweit 
Date: Mon, 12 Nov 2018 21:16:06 +0100

> Now that the icplus driver has been fixed all PHY drivers supporting
> interrupts have both callbacks (config_intr and ack_interrupt)
> implemented - as it should be. Therefore phy_drv_supports_irq()
> can be changed now to check for both callbacks being implemented.
> 
> Signed-off-by: Heiner Kallweit 

Applied, thanks Heiner.

[PATCH iproute2] genl: remove dead code

2018-11-15 Thread Stephen Hemminger

The function genl_ctrl_resolve_family is defined but never used
in current code.

Signed-off-by: Stephen Hemminger 
---
 genl/ctrl.c   | 71 ---
 genl/genl_utils.h |  2 --
 2 files changed, 73 deletions(-)

diff --git a/genl/ctrl.c b/genl/ctrl.c
index 616ab435..0fb464b01cfb 100644
--- a/genl/ctrl.c
+++ b/genl/ctrl.c
@@ -38,77 +38,6 @@ static int usage(void)
return -1;
 }
 
-int genl_ctrl_resolve_family(const char *family)
-{
-   struct rtnl_handle rth;
-   int ret = 0;
-   struct {
-   struct nlmsghdr n;
-   struct genlmsghdr   g;
-   char buf[4096];
-   } req = {
-   .n.nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN),
-   .n.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK,
-   .n.nlmsg_type = GENL_ID_CTRL,
-   .g.cmd = CTRL_CMD_GETFAMILY,
-   };
-   struct nlmsghdr *nlh = &req.n;
-   struct genlmsghdr *ghdr = &req.g;
-   struct nlmsghdr *answer = NULL;
-
-   if (rtnl_open_byproto(&rth, 0, NETLINK_GENERIC) < 0) {
-   fprintf(stderr, "Cannot open generic netlink socket\n");
-   exit(1);
-   }
-
-   addattr_l(nlh, 128, CTRL_ATTR_FAMILY_NAME, family, strlen(family) + 1);
-
-   if (rtnl_talk(&rth, nlh, &answer) < 0) {
-   fprintf(stderr, "Error talking to the kernel\n");
-   goto errout;
-   }
-
-   {
-   struct rtattr *tb[CTRL_ATTR_MAX + 1];
-   int len = answer->nlmsg_len;
-   struct rtattr *attrs;
-
-   if (answer->nlmsg_type !=  GENL_ID_CTRL) {
-   fprintf(stderr, "Not a controller message, nlmsg_len=%d "
-   "nlmsg_type=0x%x\n", answer->nlmsg_len, answer->nlmsg_type);
-   goto errout;
-   }
-
-   if (ghdr->cmd != CTRL_CMD_NEWFAMILY) {
-   fprintf(stderr, "Unknown controller command %d\n", ghdr->cmd);
-   goto errout;
-   }
-
-   len -= NLMSG_LENGTH(GENL_HDRLEN);
-
-   if (len < 0) {
-   fprintf(stderr, "wrong controller message len %d\n", len);
-   free(answer);
-   return -1;
-   }
-
-   attrs = (struct rtattr *) ((char *) answer + NLMSG_LENGTH(GENL_HDRLEN));
-   parse_rtattr(tb, CTRL_ATTR_MAX, attrs, len);
-
-   if (tb[CTRL_ATTR_FAMILY_ID] == NULL) {
-   fprintf(stderr, "Missing family id TLV\n");
-   goto errout;
-   }
-
-   ret = rta_getattr_u16(tb[CTRL_ATTR_FAMILY_ID]);
-   }
-
-errout:
-   free(answer);
-   rtnl_close(&rth);
-   return ret;
-}
-
 static void print_ctrl_cmd_flags(FILE *fp, __u32 fl)
 {
fprintf(fp, "\n\t\tCapabilities (0x%x):\n ", fl);
diff --git a/genl/genl_utils.h b/genl/genl_utils.h
index cc1f3fb76596..a8d433a9574f 100644
--- a/genl/genl_utils.h
+++ b/genl/genl_utils.h
@@ -13,6 +13,4 @@ struct genl_util
int (*print_genlopt)(struct nlmsghdr *n, void *arg);
 };
 
-int genl_ctrl_resolve_family(const char *family);
-
 #endif
-- 
2.17.1

Re: [PATCH net] sctp: not allow to set asoc prsctp_enable by sockopt

2018-11-15 Thread Marcelo Ricardo Leitner

On Thu, Nov 15, 2018 at 07:14:28PM +0800, Xin Long wrote:
> As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
> 
>This socket option allows the enabling or disabling of the
>negotiation of PR-SCTP support for future associations.  For existing
>associations, it allows one to query whether or not PR-SCTP support
>was negotiated on a particular association.
> 
> It means only sctp sock's prsctp_enable can be set.
> 
> Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
> add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
> sctp in another patchset.
> 
> Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
> Reported-by: Ying Xu 
> Signed-off-by: Xin Long 
> ---
>  net/sctp/socket.c | 13 +++----------
>  1 file changed, 3 insertions(+), 10 deletions(-)
> 
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 739f3e5..e9b8232 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -3940,7 +3940,6 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
>   unsigned int optlen)
>  {
>   struct sctp_assoc_value params;
> - struct sctp_association *asoc;
>   int retval = -EINVAL;
>  
>   if (optlen != sizeof(params))
> @@ -3951,16 +3950,10 @@ static int sctp_setsockopt_pr_supported(struct sock *sk,
>   goto out;
>   }
>  
> - asoc = sctp_id2assoc(sk, params.assoc_id);
> - if (asoc) {
> - asoc->prsctp_enable = !!params.assoc_value;
> - } else if (!params.assoc_id) {
> - struct sctp_sock *sp = sctp_sk(sk);
> -
> - sp->ep->prsctp_enable = !!params.assoc_value;
> - } else {
> + if (sctp_style(sk, UDP) && sctp_id2assoc(sk, params.assoc_id))

This would allow using a non-existent assoc id on UDP-style sockets to
set it at the socket, which is not expected. It should be more like:

+   if (sctp_style(sk, UDP) && params.assoc_id)

>   goto out;
> - }
> +
> + sctp_sk(sk)->ep->prsctp_enable = !!params.assoc_value;
>  
>   retval = 0;
>  
> -- 
> 2.1.0
>
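
To make the difference between the posted condition and the one suggested in the review concrete, here is a small userspace model. It is only a sketch under stated assumptions: `struct demo_request` and both functions are hypothetical, with plain fields standing in for what `sctp_style(sk, UDP)`, `params.assoc_id` and `sctp_id2assoc()` would report in the kernel.

```c
#include <stdbool.h>

/* Hypothetical inputs standing in for the kernel state the check reads. */
struct demo_request {
	bool udp_style;     /* models sctp_style(sk, UDP) */
	int assoc_id;       /* models params.assoc_id */
	bool assoc_exists;  /* models sctp_id2assoc() returning non-NULL */
};

/* Condition as posted: reject only when the id resolves to an existing
 * association. */
bool rejected_as_posted(const struct demo_request *r)
{
	return r->udp_style && r->assoc_exists;
}

/* Condition as suggested in the review: reject any non-zero id on a
 * UDP-style socket, whether or not it resolves. */
bool rejected_as_suggested(const struct demo_request *r)
{
	return r->udp_style && r->assoc_id != 0;
}
```

For a UDP-style socket and a non-zero id that matches no association, the posted condition lets the request through to the socket-level assignment, while the suggested condition rejects it; a zero id is accepted by both.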

Re: Ethernet on my CycloneV broke since 4.9.124

2018-11-15 Thread Dinh Nguyen




On 11/15/18 9:50 AM, Clément Péron wrote:
> Hi Dinh,
> 
> Did you upstream the patch on linux-stable ?
> 

Not yet...

Dinh
> On Fri, 2 Nov 2018 at 11:02, Clément Péron  wrote:
>>
>> Hi Dinh,
>>
>> On Wed, 31 Oct 2018 at 23:02, Dinh Nguyen  wrote:
>>>
>>> Hi Clement,
>>>
>>> On 10/31/2018 10:36 AM, Clément Péron wrote:
>>>> Hi Dinh,
>>>>
>>>> On Wed, 31 Oct 2018 at 15:42, Dinh Nguyen wrote:
>>>>>
>>>>> Hi Clement,
>>>>>
>>>>> On 10/31/2018 08:01 AM, Clément Péron wrote:
>>>>>> Hi,
>>>>>>
>>>>>> The patch "net: stmmac: socfpga: add additional ocp reset line for
>>>>>> Stratix10" introduced in 4.9.124 broke the ethernet on my CycloneV
>>>>>> board.
>>>>>>
>>>>>> When I boot I have this issue:
>>>>>>
>>>>>> socfpga-dwmac ff702000.ethernet: error getting reset control of ocp -2
>>>>>> socfpga-dwmac: probe of ff702000.ethernet failed with error -2
>>>>>>
>>>>>> Reverting the commit 6f37f7b62baa6a71d7f3f298acb64de51275e724 fixes the issue.
>>>>>>
>>>>>
>>>>> Are you sure? I just booted v4.9.124 and did not see any errors. The
>>>>> error should not appear because the commit is using
>>>>> devm_reset_control_get_optional().
>>>>
>>>> I'm booting on 4.9.130 actually. I agree with you that
>>>> devm_reset_control_get_optional should not fail, but checking other
>>>> usages of this helper
>>>> https://elixir.bootlin.com/linux/v4.9.135/source/drivers/i2c/busses/i2c-mv64xxx.c#L824
>>>> https://elixir.bootlin.com/linux/v4.9.135/source/drivers/crypto/sunxi-ss/sun4i-ss-core.c#L259
>>>> shows that they don't check for errors except for PROBE_DEFER.
>>>>
>>>
>>> I made a mistake, I was booting linux-next. I am seeing the error with
>>> v4.9.124. It's due to this commit not getting backported:
>>>
>>> "bb475230b8e59a reset: make optional functions really optional"
>>>
>>> I have backported the patch; it is available here if you would like
>>> to take a look:
>>
>> Thanks, works fine on my board too.
>> Regards,
>> Clement
>>
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux.git/log/?h=v4.9.124_optional_reset
>>>
>>> Dinh
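
The failure mode discussed in this thread can be modelled in userspace as below. This is a sketch resting on assumptions, not kernel code: all `demo_*` names are hypothetical, the "before" function reflects the behaviour reported on v4.9.124 (the optional getter failing with -2, i.e. -ENOENT, when no "ocp" reset is described), and the "after" function reflects my reading of the backported commit's title, namely that a missing optional reset yields a NULL handle instead of an error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reset handle. */
struct demo_reset {
	int id;
};

/* Result of a lookup: a handle (possibly NULL) plus a negative errno. */
struct demo_lookup {
	struct demo_reset *rst;
	int err;
};

static struct demo_reset demo_ocp_reset = { 1 };

/* Reported behaviour: a missing reset line is an error even for the
 * "optional" getter. */
struct demo_lookup demo_get_optional_before(bool described)
{
	struct demo_lookup r = { NULL, 0 };

	if (described)
		r.rst = &demo_ocp_reset;
	else
		r.err = -ENOENT;
	return r;
}

/* Assumed behaviour with the backport: a missing optional reset line is
 * reported as "no handle", not as an error. */
struct demo_lookup demo_get_optional_after(bool described)
{
	struct demo_lookup r = { NULL, 0 };

	if (described)
		r.rst = &demo_ocp_reset;
	return r;
}

/* Probe-style caller that aborts on any lookup error. */
int demo_probe(struct demo_lookup r)
{
	return r.err;
}
```

On a board whose device tree does not describe the extra reset line, `demo_probe(demo_get_optional_before(false))` returns -ENOENT, mirroring the "failed with error -2" message, while the "after" variant lets the probe continue.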

[PATCH net 0/3] mlx4 fixes for 4.20-rc

2018-11-15 Thread Tariq Toukan

Hi Dave,

This patchset includes small fixes for mlx4_core driver.

First patch by Jack zeros a field in a FW communication
command, to match the FW spec.
Please queue it to -stable >= v3.17.

In the second patch I zero-initialize a variable to silence
a compilation warning.
Please queue it to -stable >= v3.19.

Third patch by Aya replaces int fields with unsigned int,
to fix a UBSAN warning.
Please queue it to -stable >= v3.13.

Series generated against net commit:
db8ddde766ad Merge branch 'qed-Miscellaneous-bug-fixes'

Thanks,
Tariq.


Aya Levin (1):
  net/mlx4: Fix UBSAN warning of signed integer overflow

Jack Morgenstein (1):
  net/mlx4_core: Zero out lkey field in SW2HW_MPT fw command

Tariq Toukan (1):
  net/mlx4_core: Fix uninitialized variable compilation warning

 drivers/net/ethernet/mellanox/mlx4/alloc.c | 2 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4.h  | 4 ++--
 drivers/net/ethernet/mellanox/mlx4/mr.c| 1 +
 3 files changed, 4 insertions(+), 3 deletions(-)

-- 
1.8.3.1
