Re: [net-intel-i40e] question about assignment overwrite

2017-05-17 Thread Jeff Kirsher
On Wed, 2017-05-17 at 15:48 -0500, Gustavo A. R. Silva wrote:
> While looking into Coverity ID 1408956 I ran into the following
> piece  
> of code at drivers/net/ethernet/intel/i40e/i40e_main.c:8807:
> 
> 8807    if (pf->hw.mac.type == I40E_MAC_X722) {
> 8808    pf->flags |= I40E_FLAG_RSS_AQ_CAPABLE
> 8809 | I40E_FLAG_128_QP_RSS_CAPABLE
> 8810 | I40E_FLAG_HW_ATR_EVICT_CAPABLE
> 8811 | I40E_FLAG_OUTER_UDP_CSUM_CAPABLE
> 8812 | I40E_FLAG_WB_ON_ITR_CAPABLE
> 8813 |
> I40E_FLAG_MULTIPLE_TCP_UDP_RSS_PCTYPE
> 8814 | I40E_FLAG_NO_PCI_LINK_CHECK
> 8815 | I40E_FLAG_USE_SET_LLDP_MIB
> 8816 | I40E_FLAG_GENEVE_OFFLOAD_CAPABLE
> 8817 | I40E_FLAG_PTP_L4_CAPABLE
> 8818 | I40E_FLAG_WOL_MC_MAGIC_PKT_WAKE;
> 8819    } else if ((pf->hw.aq.api_maj_ver > 1) ||
> 8820   ((pf->hw.aq.api_maj_ver == 1) &&
> 8821    (pf->hw.aq.api_min_ver > 4))) {
> 8822    /* Supported in FW API version higher than 1.4 */
> 8823    pf->flags |= I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
> 8824    pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
> 8825    } else {
> 8826    pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
> 8827    }
> 
> The issue here is that the assignment at line 8823 is overwritten
> by  
> the code at line 8824.
> 
> I'm suspicious that line 8824 should be remove and a patch like the  
> following can be applied:
> 
> index d5c9c9e..48ffa73 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -8821,7 +8821,6 @@ static int i40e_sw_init(struct i40e_pf *pf)
>  (pf->hw.aq.api_min_ver > 4))) {
>  /* Supported in FW API version higher than 1.4 */
>  pf->flags |= I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
> -   pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
>  } else {
>  pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
>  }
> 
> What do you think?

This issue is already fixed in my dev-queue branch on my next-queue
tree.

signature.asc
Description: This is a digitally signed message part


Re: [PATCH 0/7] crypto: aesni: provide generic gcm(aes)

2017-05-17 Thread Herbert Xu
On Fri, Apr 28, 2017 at 06:11:55PM +0200, Sabrina Dubroca wrote:
> The current aesni AES-GCM implementation only offers support for
> rfc4106(gcm(aes)).  This makes some things a little bit simpler
> (handling of associated data and authentication tag), but it means
> that non-IPsec users of gcm(aes) have to rely on
> gcm_base(ctr-aes-aesni,ghash-clmulni), which is much slower.
> 
> This patchset adds handling of all valid authentication tag lengths
> and of any associated data length to the assembly code, and exposes a
> generic gcm(aes) AEAD algorithm to the crypto API.
> 
> With these patches, performance of MACsec on a single core increases
> by 40% (from 4.5Gbps to around 6.3Gbps).

All patches applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 1/2] wcn36xx: Pass used skb to ieee80211_tx_status()

2017-05-17 Thread Bjorn Andersson
On Wed 17 May 06:14 PDT 2017, Johannes Berg wrote:

> On Thu, 2017-05-04 at 13:13 +, Kalle Valo wrote:
> > 
> > > > This code intentionally checked if TX status was requested, and
> > > > if not then it doesn't go to the effort of building it.
> > > > 
> > > 
> > > What I'm finding puzzling is the fact that the only caller of
> > > ieee80211_led_tx() is ieee80211_tx_status() and it seems like
> > > drivers, such as ath10k, call this for each packet handled - but
> > > I'm likely missing something.
> 
> Yes, many drivers do call it for each packet, and as such, this
> deficiency was never noted.
> 
> > > > As it is with your patch, it'll go and report the TX status
> > > > without any
> > > > TX status information - which is handled in
> > > > wcn36xx_dxe_tx_ack_ind()
> > > > for those frames needing it.
> > > > 
> > > 
> > > Right, it doesn't sound desired. However, during normal operation
> > > I'm not seeing IEEE80211_TX_CTL_REQ_TX_STATUS being set and as such
> > > ieee80211_led_tx() is never called.
> > 
> > So what's the conclusion? How do we get leds working?
> 
> Well, frankly, I never thought the TX LED was a super good idea - but
> it had been supported by the original code IIRC, so never removed. Some
> people like frantic blinking I guess ;-)
> 

It seems very important to a lot of people...


But if ieee80211_free_txskb() is the counterpart of
ieee80211_tx_status() then we should be able to push the
ieee80211_led_tx() call down into ieee80211_report_used_skb() and handle
both cases.

The ieee80211_free_txskb() seems to be used in various cases where we
discard skbs, but perhaps this is not an issue in reality.

> But I think the problem also applies to the throughput trigger thing,
> so perhaps we need to stick some LED feedback calls into other places,
> like _noskb() or provide an extra way to do it?
> 

Looking around it seems that we either have a call to free_txskb() or
one of the tx_status(); where the _noskb() would need some special
handling. Are there others or would it be reasonable to add a call in
this one "special" case?

Regards,
Bjorn


RE: [PATCH net 5/5] qede: Split PF/VF ndos.

2017-05-17 Thread Mintz, Yuval
> > Looking at my alternatives for solving this, I can't see any 'good'
> > options - it seems mightily unorthodox to modify net_device_ops, I.e.,
> > add/remove an NDO function-pointer based on some device property, and
> > having multiple ops declared for the sake of a single feature seems
> > unscalable.
> 
> Multiple ops is the only thing that works right now.

What about something like - 

diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c 
b/drivers/net/ethernet/qlogic/qede/qede_filter.c
index 333876c..7450c8b 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_filter.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c
@@ -1028,6 +1028,13 @@ int qede_xdp(struct net_device *dev, struct netdev_xdp 
*xdp)
 {
struct qede_dev *edev = netdev_priv(dev);

+   /* Not all VFs can support XDP; But we can't fail the query */
+   if (() {
+   if (xdp->command == XDP_QUERY_PROG)
+   return 0;
+   return -EOPNOTSUPP;
+   }
+
switch (xdp->command) {
case XDP_SETUP_PROG:
return qede_xdp_set(edev, xdp->prog);
--

Other than making it a bit more difficult to use the generic XDP [since the VF 
would have the NDO],
I think it would behave as it should.


Re: [PATCH net-next] drivers: net: xgene: Check all RGMII phy mode variants

2017-05-17 Thread Quan Nguyen
On Thu, May 18, 2017 at 3:26 AM, Andrew Lunn  wrote:
>> +bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev)
>> +{
>> + struct xgene_enet_pdata *pdata = netdev_priv(ndev);
>> + int phy_mode = pdata->phy_mode;
>> + bool ret;
>> +
>> + ret = phy_mode == PHY_INTERFACE_MODE_RGMII ||
>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_ID ||
>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_TXID;
>> +
>> + return ret;
>> +}
>
> include/linux/phy.h:
>
> /**
>  * phy_interface_is_rgmii - Convenience function for testing if a PHY 
> interface
>  * is RGMII (all variants)
>  * @phydev: the phy_device struct
>  */
> static inline bool phy_interface_is_rgmii(struct phy_device *phydev)
> {
> return phydev->interface >= PHY_INTERFACE_MODE_RGMII &&
> phydev->interface <= PHY_INTERFACE_MODE_RGMII_TXID;
> };
>

Hi Andrew,

Our purpose is to handle our internal pdata->phy_mode, so
phy_interface_is_rgmii(phydev) seems not to fit.
Instead, we're working on the below:

+bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev)
+{
+   struct xgene_enet_pdata *pdata = netdev_priv(ndev);
+
+   return pdata->phy_mode >= PHY_INTERFACE_MODE_RGMII &&
+  pdata->phy_mode <= PHY_INTERFACE_MODE_RGMII_TXID;
+}
+


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Leon Romanovsky
On Wed, May 17, 2017 at 08:07:00PM -0400, Doug Ledford wrote:
> On 5/17/2017 6:37 PM, Parav Pandit wrote:
> > Hi Doug,
> >
>
> >> It would have been better with AF_INET/AF_INET6 and an option to enable
> >> SMC than AF_SMC.  The first implementation simply assumes AF_INET in
> >> the presence of AF_SMC.  When IPv6 support is added, some sort of
> >> guessing logic will have to be put in place to try and determine if an
> >> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
> >> guaranteed way of telling.  Apps can use struct sockaddr_storage as their
> >> normal element to stick the address into, and could rely on the kernel to
> >> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
> >> this breaks that.  The RFC gives *some* thought to adding IPv6 in the
> >> future, but not a lot.  It may be that the answer is that in the future, 
> >> IPv6
> >> support is enabled by making the IPv6 API be
> >> AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then I
> >> would suggest making the later API specifically call out AF_INET +
> >> setsockopt(SMC) be identical to AF_SMC.
> >>
> >
> > What are the shortcomings in my proposal [1] which I am reiterating below.
> > Bart also suggested to define new stream protocol for SMC similar to SCTP.
> >
> > (a) address family should be either AF_INET or AF_INET6
> > (b) socket() API to introduce new protocol as PROTO_SMC for in socket() 
> > system call.
> >
> > With this there is no additional setsockop() needed.
> >
> > With this - user space applications, do getaddrinfo() with hint as
> > hints.ai_family  = AF_INET;
> > hints.ai_socktype = SOCK_STREAM;
> >
> > getaddrinfo() returns back both the protocols TCP and SMC.
> > Famous database application such as Redis client iterates over all entries 
> > of getaddrinfo() and establishes connection to servers.
> >
> > There are few advantages of this interface.
> > 1. No change in any makefile of applications needed - unless user wants to 
> > specify explicitly that it wants to force SMC protocol.
> > 2. No need to do LD_LIBRARY_PRELOAD, (which won't work anyway because 
> > bind() connect() of SMC checks for AF_INET).
> > 3. No major changes to glibc to process AF_SMC differently
> > glibc references IPPROTO_TCP at 22 places. (compare to AF_INET at 140+ 
> > places).
> > A lot simpler test matrix for glibc for new protocol
> > 5. No need to recompile applications, as long as getaddrinfo returns all 
> > streaming protocols (TCP, SMC)
> > 6. for applications to make use of setsockopt() it needs another knob and 
> > hint from other places, which can be avoided because SMC TCP option 
> > negotiates with remote end
> >
> > And representing new protocol as new protocol for a given address family 
> > appears correct, compare to setting socket options.
> >
> > Tools like CRIU or similar tool might find a race conditions - when it 
> > queries socket option, SMC was not set, but later on SMC was set, and does 
> > incorrect handling.
> > Setting socket() with SMC protocol makes it easier to understand in this 
> > area as well.
>
> I have no problem with the proposal in itself, but as IBM released this
> publication and did their own implementation prior to submitting things
> upstream, and as there might exist in the field implementations, it
> depends on whether we wish to call those in the field implementations
> experimental and break them as we go to a final implementation of
> version 1, or if we consider version 1 baked.  I'm fine breaking it.
> After all, that's what happened with XRC back in the day and Mellanox
> learned a valuable lesson about upstream first.  I have no problem with
> IBM learning that same lesson IMO.  So, I find your proposal, including
> breaking the API of the version 1 implementation just taken into the
> kernel before it has had time to fully sit and gel, acceptable.

Doug,
I am amused that after your excellent summary you are not calling to
this v1 of protocol in its real word: BROKEN.

Current implementation and RFC are DOA (dead-on-arrival) and are not
usable for ALL RDMA key players (Mellanox, Intel, Chelsio, ...) and
ALL RDMA major users. It doesn't fit into existing infrastructure
(DB, HPC, glibc, e.t.c.) and doesn't even fulfill the expected from
cover letter (doesn't work with LD_PRELOAD_*).

The fact that IBM implemented it and used it doesn't mean that they
can't continue to leave it with out-of-tree solution for a little
bit more time till usable version will come.

>
> But this is where we kind of need a judgment from on high, and why I
> Cc:ed Linus on this thread.  Any input on this issue Linus?

It should be dropped/moved_to_staging as soon as possible, before
UAPI started to spread.

>
> > I have additional proposal for link groups, resource creation area. I will 
> > take that up after this discussion.
>
> Look forward to hearing it.
>
> > [1] https://patchwork.kernel.org/patch/9719375/
>
>
> --
> Doug Ledford 

[[PATCH v1]] hdlcdrv: fix divide error bug if bitrate is 0

2017-05-17 Thread Firo Yang
The divisor s->par.bitrate will always be 0 until initialized by
ndo_open() and hdlcdrv_open().

In order to fix this divide zero error, check whether the netdevice was
opened by ndo_open() before performing divide.And we also check the the
value of bitrate in case of bad setting of it.

Reported-by: Dmitry Vyukov 
Signed-off-by: Firo Yang 
---
v0->v1:
Reviewed by walter harms .
Return ENODEV instead of EPERM if !netif_running(dev)
Check if s->par.bitrate > 0.

 drivers/net/hamradio/hdlcdrv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c
index 8c3633c..b0f417f 100644
--- a/drivers/net/hamradio/hdlcdrv.c
+++ b/drivers/net/hamradio/hdlcdrv.c
@@ -576,6 +576,10 @@ static int hdlcdrv_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
case HDLCDRVCTL_CALIBRATE:
if(!capable(CAP_SYS_RAWIO))
return -EPERM;
+   if (!netif_running(dev))
+   return -ENODEV;
+   if (!(s->par.bitrate > 0))
+   return -EINVAL;
if (bi.data.calibrate > INT_MAX / s->par.bitrate)
return -EINVAL;
s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;
-- 
2.7.4



Re: [PATCH] hdlcdrv: fix divide error bug if bitrate is 0

2017-05-17 Thread Firo Yang
On Wed, May 17, 2017 at 06:08:11PM +0200, walter harms wrote:
>
>
>Am 17.05.2017 15:42, schrieb Firo Yang:
>> On Wed, May 17, 2017 at 02:59:39PM +0200, walter harms wrote:
>>>
>>>
>>> Am 17.05.2017 14:35, schrieb Firo Yang:
 The divisor s->par.bitrate will always be 0 until initialized by
 ndo_open() and hdlcdrv_open().

 In order to fix this divide zero error, check whether the netdevice
 was opened by ndo_open() before performing divide.

 Reported-by: Dmitry Vyukov 
 Signed-off-by: Firo Yang 
 ---
  drivers/net/hamradio/hdlcdrv.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/net/hamradio/hdlcdrv.c 
 b/drivers/net/hamradio/hdlcdrv.c
 index 8c3633c..3c783fd 100644
 --- a/drivers/net/hamradio/hdlcdrv.c
 +++ b/drivers/net/hamradio/hdlcdrv.c
 @@ -574,7 +574,7 @@ static int hdlcdrv_ioctl(struct net_device *dev, 
 struct ifreq *ifr, int cmd)
break;  
  
case HDLCDRVCTL_CALIBRATE:
 -  if(!capable(CAP_SYS_RAWIO))
 +  if (!capable(CAP_SYS_RAWIO) || !netif_running(dev))
return -EPERM;
if (bi.data.calibrate > INT_MAX / s->par.bitrate)
return -EINVAL;
>>>
>>> I would still check for s->par.bitrate > 0 later changes may affect the 
>>> setting of it
>>> and it is much more obvious.
>> 
>> I think 0 is not valid value for bitrate, so we should check it in
>> other places, like what ser12_open() did:
>> 429 if (bc->baud < 300 || bc->baud > 4800) {
>> 430 printk(KERN_INFO "baycom_ser_fdx: invalid baudrate "
>> 431 "(300...4800)\n");
>> 432 return -EINVAL;
>> 433 }
>> ...
>> 440 bc->hdrv.par.bitrate = bc->baud;
>
>
>I do not want to say you change is not valid but i have learned that it is 
>better to
>have an obvious check that to rely on hidden knowledge.
I agree with this.
>
>
>> 
>>>
>>> Also perhaps !netif_running(dev) should better return ENODEV.
>> 
>> However, the 'dev' truly exists in this circumstance.
>> 
>
>yes and i do not feel good with that but "no permission" will lead
>any enduser into a search for user rights.
Indeed, ENODEV is more informative to enduser.
I will send a update patch.

Thanks,
Firo
>
>
>
>re,
> wh
>
>
>> Thanks,
>> Firo
>> 
>>>
>>>
>>> just my 2 cents,
>>> re,
>>> wh
>>>


Re: [net:master 9/12] net/ipv6/ip6_offload.c:120:7-21: WARNING: Unsigned expression compared with zero: unfrag_ip6hlen < 0 (fwd)

2017-05-17 Thread David Miller
From: Julia Lawall 
Date: Thu, 18 May 2017 10:01:07 +0800 (SGT)

> It may be worth checking on these.  The code context is shown in the first
> case (line 120).  For the others, at least it gives the line numbers.
 ...
>>> net/ipv6/ip6_offload.c:120:7-21: WARNING: Unsigned expression compared with 
>>> zero: unfrag_ip6hlen < 0
> --
>>> net/ipv6/ip6_output.c:601:5-9: WARNING: Unsigned expression compared with 
>>> zero: hlen < 0
> --
>>> net/ipv6/udp_offload.c:94:6-20: WARNING: Unsigned expression compared with 
>>> zero: unfrag_ip6hlen < 0

Sigh, this is worse than the bug the commit introducing these warnings
was trying to fix.

I've pushed the following:


ipv6: Check ip6_find_1stfragopt() return value properly.

Do not use unsigned variables to see if it returns a negative
error or not.

Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Julia Lawall 
Signed-off-by: David S. Miller 
---
 net/ipv6/ip6_offload.c | 9 -
 net/ipv6/ip6_output.c  | 7 +++
 net/ipv6/udp_offload.c | 8 +---
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index eab36ab..280268f 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -63,7 +63,6 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
const struct net_offload *ops;
int proto;
struct frag_hdr *fptr;
-   unsigned int unfrag_ip6hlen;
unsigned int payload_len;
u8 *prevhdr;
int offset = 0;
@@ -116,10 +115,10 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff 
*skb,
skb->network_header = (u8 *)ipv6h - skb->head;
 
if (udpfrag) {
-   unfrag_ip6hlen = ip6_find_1stfragopt(skb, );
-   if (unfrag_ip6hlen < 0)
-   return ERR_PTR(unfrag_ip6hlen);
-   fptr = (struct frag_hdr *)((u8 *)ipv6h + 
unfrag_ip6hlen);
+   int err = ip6_find_1stfragopt(skb, );
+   if (err < 0)
+   return ERR_PTR(err);
+   fptr = (struct frag_hdr *)((u8 *)ipv6h + err);
fptr->frag_off = htons(offset);
if (skb->next)
fptr->frag_off |= htons(IP6_MF);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 01deecd..d4a31be 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -597,11 +597,10 @@ int ip6_fragment(struct net *net, struct sock *sk, struct 
sk_buff *skb,
int ptr, offset = 0, err = 0;
u8 *prevhdr, nexthdr = 0;
 
-   hlen = ip6_find_1stfragopt(skb, );
-   if (hlen < 0) {
-   err = hlen;
+   err = ip6_find_1stfragopt(skb, );
+   if (err < 0)
goto fail;
-   }
+   hlen = err;
nexthdr = *prevhdr;
 
mtu = ip6_skb_dst_mtu(skb);
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index b348cff..a2267f8 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -29,6 +29,7 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
u8 frag_hdr_sz = sizeof(struct frag_hdr);
__wsum csum;
int tnl_hlen;
+   int err;
 
mss = skb_shinfo(skb)->gso_size;
if (unlikely(skb->len <= mss))
@@ -90,9 +91,10 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
/* Find the unfragmentable header and shift it left by 
frag_hdr_sz
 * bytes to insert fragment header.
 */
-   unfrag_ip6hlen = ip6_find_1stfragopt(skb, );
-   if (unfrag_ip6hlen < 0)
-   return ERR_PTR(unfrag_ip6hlen);
+   err = ip6_find_1stfragopt(skb, );
+   if (err < 0)
+   return ERR_PTR(err);
+   unfrag_ip6hlen = err;
nexthdr = *prevhdr;
*prevhdr = NEXTHDR_FRAGMENT;
unfrag_len = (skb_network_header(skb) - skb_mac_header(skb)) +
-- 
2.7.4



Re: [PATCH net] bpf: adjust verifier heuristics

2017-05-17 Thread David Miller
From: Daniel Borkmann 
Date: Thu, 18 May 2017 03:00:06 +0200

> Current limits with regards to processing program paths do not
> really reflect today's needs anymore due to programs becoming
> more complex and verifier smarter, keeping track of more data
> such as const ALU operations, alignment tracking, spilling of
> PTR_TO_MAP_VALUE_ADJ registers, and other features allowing for
> smarter matching of what LLVM generates.
> 
> This also comes with the side-effect that we result in fewer
> opportunities to prune search states and thus often need to do
> more work to prove safety than in the past due to different
> register states and stack layout where we mismatch. Generally,
> it's quite hard to determine what caused a sudden increase in
> complexity, it could be caused by something as trivial as a
> single branch somewhere at the beginning of the program where
> LLVM assigned a stack slot that is marked differently throughout
> other branches and thus causing a mismatch, where verifier
> then needs to prove safety for the whole rest of the program.
> Subsequently, programs with even less than half the insn size
> limit can get rejected. We noticed that while some programs
> load fine under pre 4.11, they get rejected due to hitting
> limits on more recent kernels. We saw that in the vast majority
> of cases (90+%) pruning failed due to register mismatches. In
> case of stack mismatches, majority of cases failed due to
> different stack slot types (invalid, spill, misc) rather than
> differences in spilled registers.
> 
> This patch makes pruning more aggressive by also adding markers
> that sit at conditional jumps as well. Currently, we only mark
> jump targets for pruning. For example in direct packet access,
> these are usually error paths where we bail out. We found that
> adding these markers, it can reduce number of processed insns
> by up to 30%. Another option is to ignore reg->id in probing
> PTR_TO_MAP_VALUE_OR_NULL registers, which can help pruning
> slightly as well by up to 7% observed complexity reduction as
> stand-alone. Meaning, if a previous path with register type
> PTR_TO_MAP_VALUE_OR_NULL for map X was found to be safe, then
> in the current state a PTR_TO_MAP_VALUE_OR_NULL register for
> the same map X must be safe as well. Last but not least the
> patch also adds a scheduling point and bumps the current limit
> for instructions to be processed to a more adequate value.
> 
> Signed-off-by: Daniel Borkmann 
> Acked-by: Alexei Starovoitov 

Ok, applied, but we should continue trying to make the pruner
more effective.



Re: [PATCH] ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets

2017-05-17 Thread Peter Dawson
>> This fix addresses two problems in the way the DSCP field is
>> formulated on the  encapsulating header of IPv6 tunnels.
>> This fix addresses Bug 195661.
>> https://bugzilla.kernel.org/show_bug.cgi?id=195661
>>
>> 1) The IPv6 tunneling code was manipulating the DSCP field of the
>> encapsulating  packet using the 32b flowlabel. Since the flowlabel is
>> only the lower 20b it  was incorrect to assume that the upper 12b
>> containing the DSCP and ECN fields  would remain intact when
>> formulating the encapsulating header. This fix  handles the 'inherit'
>> and 'fixed-value' DSCP cases explicitly using the extant  dsfield u8 
>> variable.
>>
>> 2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
>> incorrect  and resulted in the DSCP value always being set to 0.
>> ---
>>  net/ipv6/ip6_gre.c| 18 ++
>>  net/ipv6/ip6_tunnel.c | 28 +---
>>  2 files changed, 27 insertions(+), 19 deletions(-)
>
> This patch looks correct, but has trivial style issues like using spaces 
> instead of tabs and other junk.  Please run checkpatch.pl, look at the 
> results and resubmit.

Thanks.  I've run checkpatch.pl over this updated file with no errors.
Regards
Peter
From f12adb265f683d53ae071725e657dab17d26ed8d Mon Sep 17 00:00:00 2001
From: Peter Dawson 
Date: Thu, 18 May 2017 11:52:22 +1000
Subject: [PATCH] ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated
 packets

This fix addresses two problems in the way the DSCP field is formulated
 on the encapsulating header of IPv6 tunnels.
This fix addresses Bug 195661.

1) The IPv6 tunneling code was manipulating the DSCP field of the
 encapsulating packet using the 32b flowlabel. Since the flowlabel is
 only the lower 20b it was incorrect to assume that the upper 12b
 containing the DSCP and ECN fields would remain intact when formulating
 the encapsulating header. This fix handles the 'inherit' and
 'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable.

2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
 incorrect and resulted in the DSCP value always being set to 0.

Signed-off-by: Peter Dawson 
---
 net/ipv6/ip6_gre.c| 13 +++--
 net/ipv6/ip6_tunnel.c | 21 +
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 8d128ba..0c5b4caa 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -537,11 +537,10 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev)
 
 	memcpy(, >fl.u.ip6, sizeof(fl6));
 
-	dsfield = ipv4_get_dsfield(iph);
-
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-		fl6.flowlabel |= htonl((__u32)iph->tos << IPV6_TCLASS_SHIFT)
-	  & IPV6_TCLASS_MASK;
+		dsfield = ipv4_get_dsfield(iph);
+	else
+		dsfield = ip6_tclass(t->parms.flowinfo);
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
 		fl6.flowi6_mark = skb->mark;
 	else
@@ -598,9 +597,11 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev)
 
 	memcpy(, >fl.u.ip6, sizeof(fl6));
 
-	dsfield = ipv6_get_dsfield(ipv6h);
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-		fl6.flowlabel |= (*(__be32 *) ipv6h & IPV6_TCLASS_MASK);
+		dsfield = ipv6_get_dsfield(ipv6h);
+	else
+		dsfield = ip6_tclass(t->parms.flowinfo);
+
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FLOWLABEL)
 		fl6.flowlabel |= ip6_flowlabel(ipv6h);
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 6eb2ae5..7ae6c50 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1196,7 +1196,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
 	ipv6h = ipv6_hdr(skb);
-	ip6_flow_hdr(ipv6h, INET_ECN_encapsulate(0, dsfield),
+	ip6_flow_hdr(ipv6h, dsfield,
 		 ip6_make_flowlabel(net, skb, fl6->flowlabel, true, fl6));
 	ipv6h->hop_limit = hop_limit;
 	ipv6h->nexthdr = proto;
@@ -1231,8 +1231,6 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (tproto != IPPROTO_IPIP && tproto != 0)
 		return -1;
 
-	dsfield = ipv4_get_dsfield(iph);
-
 	if (t->parms.collect_md) {
 		struct ip_tunnel_info *tun_info;
 		const struct ip_tunnel_key *key;
@@ -1246,6 +1244,7 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev)
 		fl6.flowi6_proto = IPPROTO_IPIP;
 		fl6.daddr = key->u.ipv6.dst;
 		fl6.flowlabel = key->label;
+		dsfield = ip6_tclass(key->label);
 	} else {
 		if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
 			encap_limit = t->parms.encap_limit;
@@ -1254,8 +1253,9 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev)
 		fl6.flowi6_proto = IPPROTO_IPIP;
 
 		if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-			fl6.flowlabel |= htonl((__u32)iph->tos << IPV6_TCLASS_SHIFT)
-	 & IPV6_TCLASS_MASK;
+			dsfield = ipv4_get_dsfield(iph);
+		else
+			

Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.

2017-05-17 Thread Alexei Starovoitov

On 5/17/17 8:33 AM, Edward Cree wrote:

On 17/05/17 15:00, Edward Cree wrote:

OTOH the 'track known 1s as well' might work in a nice generic way
 and cover all bases, I'll have to experiment a bit with that.

-Ed


So I did some experiments (in Python, script follows) and found that
 indeed this does appear to work, at least for addition and shifts.
The idea is that we have a 0s mask and a 1s mask; for bits that are
 unknown, the 0s mask is set and the 1s mask is cleared.  So a
 completely unknown variable has masks (~0, 0), then if you shift it
 left 2 you get (~3, 0) - just shift both masks.  A constant x has
 masks (x, x) - all the 0s are known 0s and all the 1s are known 1s.
Addition is a bit more complicated: we compute the 'unknown bits'
 mask, by XORing the 0s and 1s masks together, of each addend.  Then
 we add the corresponding masks from each addend together, and force
 the 'unknown' bits to the appropriate values in each mask.
So given (a1, b1) and (a2, b2), we compute m1 = a1 ^ b1,
 m2 = a2 ^ b2, and m = m1 | m2.  Then a = (a1 + a2) | m, and
 b = (b1 + b2) & ~m.
As a worked example, 2 + (x << 2) + 14:
 2 => (2, 2) constant
 x => (~0, 0) unknown
 x << 2 => (~3, 0)
 2 + (x << 2): add (2, 2) with (~3, 0)
m1 = 0, m2 = ~3, m = ~3
a = (2 + ~3) | ~3 = ~1 | ~3 = ~1
b = (2 + 0) & ~~3 = 2 & 3 = 2
so (~1, 2), which means "...xx10"
now add 14: add (~1, 2) with (14, 14)
m1 = ~3, m2 = 0, m = ~3
a = (~1 + 14) | ~3 = 12 | ~3 = ~3
b = (2 + 14) & ~~3 = 16 & 3 = 0
so (~3, 0), which means "...xx00"
and the result is 4-byte aligned.

-Ed

PS. Beware of bugs in the following code; I have only tested it, not
 proved it correct.

--8<--

#!/usr/bin/python2

def cpl(x):
return (~x)&0xff

class AlignedNumber(object):
def __init__(self, mask0=0xff, mask1=0):
"""mask0 has 0s for bits known to be 0, 1s otherwise.
mask1 has 1s for bits known to be 1, 0s otherwise.
Thus a bit which is set in mask0 and cleared in mask1 is an 'unknown'
bit, while a bit which is cleared in mask0 and set in mask1 is a bug.
"""
self.masks = (mask0 & 0xff, mask1 & 0xff)
self.validate()
@classmethod
def const(cls, value):
"""All bits are known, so both masks equal value."""
return cls(value, value)
def validate(self):
# Check for bits 'known' to be both 0 and 1
assert not (cpl(self.masks[0]) & self.masks[1]), self.masks
# Check unknown bits don't follow known bits
assert self.mx | ((self.mx - 1) & 0xff) == 0xff, self.masks
def __str__(self):
return ':'.join(map(bin, self.masks))
def __lshift__(self, sh):
"""Shift both masks left, low bits become 'known 0'"""
return self.__class__(self.masks[0] << sh, self.masks[1] << sh)
def __rshift__(self, sh):
"""Shift 1s into mask0; high bits become 'unknown'.
While strictly speaking they may be known 0 if we're tracking the full
word and doing unsigned shifts, having known bits before unknown bits
breaks the addition code."""
return self.__class__(cpl(cpl(self.masks[0]) >> sh), self.masks[1] >> 
sh)
@property
def mx(self):
"""Mask of unknown bits"""
return self.masks[0] ^ self.masks[1]
def __add__(self, other):
"""OR the mx values together.  Unknown bits could cause carries, so we
just assume that they can carry all the way to the left (thus we keep
our mx masks in the form 1...10...0.
Then, add our 0- and 1-masks, and force the bits of the combined mx
mask to the unknown state."""
if isinstance(other, int):
return self + AlignedNumber.const(other)
assert isinstance(other, AlignedNumber), other
m = self.mx | other.mx
return self.__class__((self.masks[0] + other.masks[0]) | m,
  (self.masks[1] + other.masks[1]) & cpl(m))
def is_aligned(self, bits):
"""We are 2^n-aligned iff the bottom n bits are known-0."""
mask = (1 << bits) - 1
return not (self.masks[0] & mask)

if __name__ == '__main__':
a = AlignedNumber.const(2)
b = AlignedNumber() << 2
c = AlignedNumber.const(14)
print a, b, c
print a + b, a + b + c
assert (a + b + c).is_aligned(2)


that bit was confusing to me.
is_aligned(2) checks that number is 4 byte aligned :)

Would it be easier to represent this logic via (mask_of_unknown, value)
instead of (mask0, mask1) ?
Where mask_of_unknown has 1s for unknown bits and 0 for known bits
in the value. Then arithmetic operations will be easier to visualize.
Like:
(mask1, val1) + (mask2, val2) = (mask1 | mask2, val1 + val2)

Here is your modified script:
#!/usr/bin/python2

class AlignedNumber(object):
def __init__(self, mask=0xff, value=0):
# mask has 1s for unknown bits, 0s for known bits in value
self.masks = (mask & 0xff, value & 0xff)
@classmethod
def const(cls, 

[net:master 9/12] net/ipv6/ip6_offload.c:120:7-21: WARNING: Unsigned expression compared with zero: unfrag_ip6hlen < 0 (fwd)

2017-05-17 Thread Julia Lawall
Hello,

It may be worth checking on these.  The code context is shown in the first
case (line 120).  For the others, at least it gives the line numbers.

julia

-- Forwarded message --
Date: Thu, 18 May 2017 09:07:31 +0800
From: kbuild test robot 
To: kbu...@01.org
Cc: Julia Lawall 
Subject: [net:master 9/12] net/ipv6/ip6_offload.c:120:7-21: WARNING: Unsigned
expression compared with zero: unfrag_ip6hlen < 0

CC: kbuild-...@01.org
CC: netdev@vger.kernel.org
TO: Craig Gallek 

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
head:   579f1d926c66cc0bd3bd87b1fe2e091084b07430
commit: 2423496af35d94a87156b063ea5cedffc10a70a1 [9/12] ipv6: Prevent overrun 
when parsing v6 header options
:: branch date: 2 hours ago
:: commit date: 6 hours ago

>> net/ipv6/ip6_offload.c:120:7-21: WARNING: Unsigned expression compared with 
>> zero: unfrag_ip6hlen < 0
--
>> net/ipv6/ip6_output.c:601:5-9: WARNING: Unsigned expression compared with 
>> zero: hlen < 0
--
>> net/ipv6/udp_offload.c:94:6-20: WARNING: Unsigned expression compared with 
>> zero: unfrag_ip6hlen < 0

git remote add net https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
git remote update net
git checkout 2423496af35d94a87156b063ea5cedffc10a70a1
vim +120 net/ipv6/ip6_offload.c

d1da932e Vlad Yasevich2012-11-15  104
07b26c94 Steffen Klassert 2016-09-19  105   gso_partial = 
!!(skb_shinfo(segs)->gso_type & SKB_GSO_PARTIAL);
07b26c94 Steffen Klassert 2016-09-19  106
d1da932e Vlad Yasevich2012-11-15  107   for (skb = segs; skb; skb = 
skb->next) {
d3e5e006 Eric Dumazet 2013-10-20  108   ipv6h = (struct ipv6hdr 
*)(skb_mac_header(skb) + nhoff);
07b26c94 Steffen Klassert 2016-09-19  109   if (gso_partial)
802ab55a Alexander Duyck  2016-04-10  110   payload_len = 
skb_shinfo(skb)->gso_size +
802ab55a Alexander Duyck  2016-04-10  111 
SKB_GSO_CB(skb)->data_offset +
802ab55a Alexander Duyck  2016-04-10  112 
skb->head - (unsigned char *)(ipv6h + 1);
802ab55a Alexander Duyck  2016-04-10  113   else
802ab55a Alexander Duyck  2016-04-10  114   payload_len = 
skb->len - nhoff - sizeof(*ipv6h);
802ab55a Alexander Duyck  2016-04-10  115   ipv6h->payload_len = 
htons(payload_len);
d3e5e006 Eric Dumazet 2013-10-20  116   skb->network_header = 
(u8 *)ipv6h - skb->head;
d3e5e006 Eric Dumazet 2013-10-20  117
91a48a2e Hannes Frederic Sowa 2014-02-24  118   if (udpfrag) {
d1da932e Vlad Yasevich2012-11-15  119   unfrag_ip6hlen 
= ip6_find_1stfragopt(skb, );
2423496a Craig Gallek 2017-05-16 @120   if 
(unfrag_ip6hlen < 0)
2423496a Craig Gallek 2017-05-16  121   return 
ERR_PTR(unfrag_ip6hlen);
d3e5e006 Eric Dumazet 2013-10-20  122   fptr = (struct 
frag_hdr *)((u8 *)ipv6h + unfrag_ip6hlen);
d1da932e Vlad Yasevich2012-11-15  123   fptr->frag_off 
= htons(offset);
53b24b8f Ian Morris   2015-03-29  124   if (skb->next)
d1da932e Vlad Yasevich2012-11-15  125   
fptr->frag_off |= htons(IP6_MF);
d1da932e Vlad Yasevich2012-11-15  126   offset += 
(ntohs(ipv6h->payload_len) -
d1da932e Vlad Yasevich2012-11-15  127  
sizeof(struct frag_hdr));
d1da932e Vlad Yasevich2012-11-15  128   }

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


Re: [net-realtek-btcoexist] question about identical code for different branches

2017-05-17 Thread Larry Finger

On 05/17/2017 04:52 PM, Gustavo A. R. Silva wrote:


Hello everybody,

While looking into Coverity ID 1362263 I ran into the following piece of code at 
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:1000:


1000void exhalbtc_set_ant_num(struct rtl_priv *rtlpriv, u8 type, u8 ant_num)
1001{
1002if (BT_COEX_ANT_TYPE_PG == type) {
1003gl_bt_coexist.board_info.pg_ant_num = ant_num;
1004gl_bt_coexist.board_info.btdm_ant_num = ant_num;
1005/* The antenna position:
1006 * Main (default) or Aux for pgAntNum=2 && btdmAntNum =1.
1007 * The antenna position should be determined by
1008 * auto-detect mechanism.
1009 * The following is assumed to main,
1010 * and those must be modified
1011 * if y auto-detect mechanism is ready
1012 */
1013if ((gl_bt_coexist.board_info.pg_ant_num == 2) &&
1014(gl_bt_coexist.board_info.btdm_ant_num == 1))
1015gl_bt_coexist.board_info.btdm_ant_pos =
1016   
BTC_ANTENNA_AT_MAIN_PORT;

1017else
1018gl_bt_coexist.board_info.btdm_ant_pos =
1019   
BTC_ANTENNA_AT_MAIN_PORT;

1020} else if (BT_COEX_ANT_TYPE_ANTDIV == type) {
1021gl_bt_coexist.board_info.btdm_ant_num = ant_num;
1022gl_bt_coexist.board_info.btdm_ant_pos =
1023   
BTC_ANTENNA_AT_MAIN_PORT;

1024} else if (type == BT_COEX_ANT_TYPE_DETECTED) {
1025gl_bt_coexist.board_info.btdm_ant_num = ant_num;
1026if (rtlpriv->cfg->mod_params->ant_sel == 1)
1027gl_bt_coexist.board_info.btdm_ant_pos =
1028BTC_ANTENNA_AT_AUX_PORT;
1029else
1030gl_bt_coexist.board_info.btdm_ant_pos =
1031BTC_ANTENNA_AT_MAIN_PORT;
1032}
1033}

The issue is that lines of code 1015-1016 and 1018-1019 are identical for 
different branches.


My question here is if one of those assignments should be modified, or the 
entire _if_ statement replaced and the function refactored?


I'd really appreciate any comment on this.

Thank you!
--
Gustavo A. R. Silva


Gustavo,

Thanks for the notification. I will discuss with Realtek as to the proper fix.

Larry




Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.

2017-05-17 Thread Alexei Starovoitov

On 5/17/17 5:16 PM, David Miller wrote:

BTW, we track something similar already in reg->imm which tracks how
many high order bits are known to be cleared in the register.  It is
used to avoid potential overflow for packet pointer accesses.  I bet
we can subsume that into this facility as well.


yeah, reg->imm tracks zero upper bits in very simplistic way.
For alignment checking it seems we need to track lower zeros
and ones. If Edward's algorithm can be adopted to track both
that would be double win indeed.



Re: [PATCH] ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets

2017-05-17 Thread Stephen Hemminger
On Thu, 18 May 2017 01:11:01 +
"I-Dawson, Peter A"  wrote:

> This fix addresses two problems in the way the DSCP field is formulated on the
>  encapsulating header of IPv6 tunnels.
> This fix addresses Bug 195661. 
> https://bugzilla.kernel.org/show_bug.cgi?id=195661 
> 
> 1) The IPv6 tunneling code was manipulating the DSCP field of the 
> encapsulating
>  packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it
>  was incorrect to assume that the upper 12b containing the DSCP and ECN fields
>  would remain intact when formulating the encapsulating header. This fix
>  handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the 
> extant
>  dsfield u8 variable.
> 
> 2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect
>  and resulted in the DSCP value always being set to 0.
> ---
>  net/ipv6/ip6_gre.c| 18 ++
>  net/ipv6/ip6_tunnel.c | 28 +---
>  2 files changed, 27 insertions(+), 19 deletions(-)

This patch looks correct, but has trivial style issues like using spaces
instead of tabs and other junk.  Please run checkpatch.pl, look at the
results and resubmit.



[PATCH] ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets

2017-05-17 Thread I-Dawson, Peter A
This fix addresses two problems in the way the DSCP field is formulated on the
 encapsulating header of IPv6 tunnels.
This fix addresses Bug 195661. 
https://bugzilla.kernel.org/show_bug.cgi?id=195661 

1) The IPv6 tunneling code was manipulating the DSCP field of the encapsulating
 packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it
 was incorrect to assume that the upper 12b containing the DSCP and ECN fields
 would remain intact when formulating the encapsulating header. This fix
 handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the extant
 dsfield u8 variable.

2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect
 and resulted in the DSCP value always being set to 0.
---
 net/ipv6/ip6_gre.c| 18 ++
 net/ipv6/ip6_tunnel.c | 28 +---
 2 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 8d128ba..42f51fe 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -537,11 +537,11 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, 
struct net_device *dev)
 
memcpy(, >fl.u.ip6, sizeof(fl6));
 
-   dsfield = ipv4_get_dsfield(iph);
-
-   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-   fl6.flowlabel |= htonl((__u32)iph->tos << IPV6_TCLASS_SHIFT)
- & IPV6_TCLASS_MASK;
+   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS) {
+   dsfield = ipv4_get_dsfield(iph);
+   } else {
+   dsfield = ip6_tclass(t->parms.flowinfo);
+   }
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
fl6.flowi6_mark = skb->mark;
else
@@ -598,9 +598,11 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, 
struct net_device *dev)
 
memcpy(, >fl.u.ip6, sizeof(fl6));
 
-   dsfield = ipv6_get_dsfield(ipv6h);
-   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-   fl6.flowlabel |= (*(__be32 *) ipv6h & IPV6_TCLASS_MASK);
+   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS) {
+   dsfield = ipv6_get_dsfield(ipv6h);
+   } else {
+   dsfield = ip6_tclass(t->parms.flowinfo);
+}
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FLOWLABEL)
fl6.flowlabel |= ip6_flowlabel(ipv6h);
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 6eb2ae5..4d45195 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1196,8 +1196,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev, __u8 dsfield,
skb_push(skb, sizeof(struct ipv6hdr));
skb_reset_network_header(skb);
ipv6h = ipv6_hdr(skb);
-   ip6_flow_hdr(ipv6h, INET_ECN_encapsulate(0, dsfield),
-ip6_make_flowlabel(net, skb, fl6->flowlabel, true, fl6));
+   ip6_flow_hdr(ipv6h, dsfield, ip6_make_flowlabel(net, skb, 
fl6->flowlabel, true, fl6));
ipv6h->hop_limit = hop_limit;
ipv6h->nexthdr = proto;
ipv6h->saddr = fl6->saddr;
@@ -1231,8 +1230,6 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
if (tproto != IPPROTO_IPIP && tproto != 0)
return -1;
 
-   dsfield = ipv4_get_dsfield(iph);
-
if (t->parms.collect_md) {
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
@@ -1246,6 +1243,7 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
fl6.flowi6_proto = IPPROTO_IPIP;
fl6.daddr = key->u.ipv6.dst;
fl6.flowlabel = key->label;
+   dsfield = ip6_tclass(key->label);
} else {
if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
encap_limit = t->parms.encap_limit;
@@ -1253,9 +1251,11 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
memcpy(, >fl.u.ip6, sizeof(fl6));
fl6.flowi6_proto = IPPROTO_IPIP;
 
-   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-   fl6.flowlabel |= htonl((__u32)iph->tos << 
IPV6_TCLASS_SHIFT)
-& IPV6_TCLASS_MASK;
+   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS) {
+   dsfield = ipv4_get_dsfield(iph);
+   } else {
+   dsfield = ip6_tclass(t->parms.flowinfo);
+   }
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
fl6.flowi6_mark = skb->mark;
else
@@ -1267,6 +1267,8 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
if (iptunnel_handle_offloads(skb, SKB_GSO_IPXIP6))
return -1;
 
+   dsfield = INET_ECN_encapsulate(dsfield, ipv4_get_dsfield(iph));
+
skb_set_inner_ipproto(skb, IPPROTO_IPIP);
 
err = ip6_tnl_xmit(skb, dev, dsfield, , 

[PATCH net] bpf: adjust verifier heuristics

2017-05-17 Thread Daniel Borkmann
Current limits with regards to processing program paths do not
really reflect today's needs anymore due to programs becoming
more complex and verifier smarter, keeping track of more data
such as const ALU operations, alignment tracking, spilling of
PTR_TO_MAP_VALUE_ADJ registers, and other features allowing for
smarter matching of what LLVM generates.

This also comes with the side-effect that we result in fewer
opportunities to prune search states and thus often need to do
more work to prove safety than in the past due to different
register states and stack layout where we mismatch. Generally,
it's quite hard to determine what caused a sudden increase in
complexity, it could be caused by something as trivial as a
single branch somewhere at the beginning of the program where
LLVM assigned a stack slot that is marked differently throughout
other branches and thus causing a mismatch, where verifier
then needs to prove safety for the whole rest of the program.
Subsequently, programs with even less than half the insn size
limit can get rejected. We noticed that while some programs
load fine under pre 4.11, they get rejected due to hitting
limits on more recent kernels. We saw that in the vast majority
of cases (90+%) pruning failed due to register mismatches. In
case of stack mismatches, majority of cases failed due to
different stack slot types (invalid, spill, misc) rather than
differences in spilled registers.

This patch makes pruning more aggressive by also adding markers
that sit at conditional jumps as well. Currently, we only mark
jump targets for pruning. For example in direct packet access,
these are usually error paths where we bail out. We found that
adding these markers, it can reduce number of processed insns
by up to 30%. Another option is to ignore reg->id in probing
PTR_TO_MAP_VALUE_OR_NULL registers, which can help pruning
slightly as well by up to 7% observed complexity reduction as
stand-alone. Meaning, if a previous path with register type
PTR_TO_MAP_VALUE_OR_NULL for map X was found to be safe, then
in the current state a PTR_TO_MAP_VALUE_OR_NULL register for
the same map X must be safe as well. Last but not least the
patch also adds a scheduling point and bumps the current limit
for instructions to be processed to a more adequate value.

Signed-off-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
---
 kernel/bpf/verifier.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 39f2dcb..1eddb71 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -140,7 +140,7 @@ struct bpf_verifier_stack_elem {
struct bpf_verifier_stack_elem *next;
 };
 
-#define BPF_COMPLEXITY_LIMIT_INSNS 65536
+#define BPF_COMPLEXITY_LIMIT_INSNS 98304
 #define BPF_COMPLEXITY_LIMIT_STACK 1024
 
 #define BPF_MAP_PTR_POISON ((void *)0xeB9F + POISON_POINTER_DELTA)
@@ -2640,6 +2640,7 @@ static int check_cfg(struct bpf_verifier_env *env)
env->explored_states[t + 1] = STATE_LIST_MARK;
} else {
/* conditional jump with two edges */
+   env->explored_states[t] = STATE_LIST_MARK;
ret = push_insn(t, t + 1, FALLTHROUGH, env);
if (ret == 1)
goto peek_stack;
@@ -2798,6 +2799,12 @@ static bool states_equal(struct bpf_verifier_env *env,
 rcur->type != NOT_INIT))
continue;
 
+   /* Don't care about the reg->id in this case. */
+   if (rold->type == PTR_TO_MAP_VALUE_OR_NULL &&
+   rcur->type == PTR_TO_MAP_VALUE_OR_NULL &&
+   rold->map_ptr == rcur->map_ptr)
+   continue;
+
if (rold->type == PTR_TO_PACKET && rcur->type == PTR_TO_PACKET 
&&
compare_ptrs_to_packet(rold, rcur))
continue;
@@ -2932,6 +2939,9 @@ static int do_check(struct bpf_verifier_env *env)
goto process_bpf_exit;
}
 
+   if (need_resched())
+   cond_resched();
+
if (log_level > 1 || (log_level && do_print_state)) {
if (log_level > 1)
verbose("%d:", insn_idx);
-- 
1.9.3



Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread Yonghong Song



On 5/17/17 4:01 PM, David Miller wrote:

From: Daniel Borkmann 
Date: Thu, 18 May 2017 00:57:04 +0200


On 05/18/2017 12:18 AM, Yonghong Song wrote:

Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for
tools/testing/selftest/bpf
because of some missing types:
  $ make -C tools/testing/selftests/bpf/
  ...
  In file included from
  /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
  ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name
  '__aligned_u64'
  __aligned_u64   key;
  ...
  /usr/include/linux/swab.h:160:8: error: unknown type name
  '__always_inline'
  static __always_inline __u16 __swab16p(const __u16 *p)
  ...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.

The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf
selftests.")
Signed-off-by: Yonghong Song 


Can you elaborate why string.h specifically? Can't we define the
__always_inline ourselves?


That way it comes from compiler.h


Just a little bit correction. The __always_inline is not from 
compiler.h. The compiler.h is inside kernel source tree. Currently,

programs in selftests do not directly referencing kernel header
files (except test_verifier trying to do with alignment)

==
#ifdef HAVE_GENHDR
# include "autoconf.h"
#else
# if defined(__i386) || defined(__x86_64) || defined(__s390x__) || 
defined(__aarch64__)

#  define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1
# endif
#endif
=
(The above part may be gone soon with recent alignment tracking patch)

[yhs@localhost include]$ pwd
/usr/include
[yhs@localhost include]$ find . -name "compiler.h"
[yhs@localhost include]$

The __always_inline comes from sys/cdefs.h
sys/cdefs.h:# define __always_inline __inline __attribute__ 
((__always_inline))


The following is the include chain leading to __always_inline:
string.h
   features.h
  sys/cdefs.h

Yes, it is deeply embedded in chain of header files and hard
to figure out intuitively

Yonghong



Probably it would have been better to have the BPF linux/types.h bring
it in.

Sorry I applied this so quickly, I wanted this regression fixed as fast
as possible.



[PATCH v3] net: dsa: b53: Add compatible strings for the Cygnus-family BCM11360.

2017-05-17 Thread Eric Anholt
Cygnus is a small family of SoCs, of which we currently have
devicetree for BCM11360 and BCM58300.  The 11360's B53 is mostly the
same as 58xx, just requiring a tiny bit of setup that was previously
missing.

Signed-off-by: Eric Anholt 
Reviewed-by: Florian Fainelli 
Acked-by: Rob Herring 
---

v2: Reorder the entry in the docs (suggestion by Scott Branden), add
missing '"'
v3: Resend now that 4.12-rc1 is out.

 Documentation/devicetree/bindings/net/dsa/b53.txt | 3 +++
 drivers/net/dsa/b53/b53_srab.c| 2 ++
 2 files changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/dsa/b53.txt 
b/Documentation/devicetree/bindings/net/dsa/b53.txt
index d6c6e41648d4..eb679e92d525 100644
--- a/Documentation/devicetree/bindings/net/dsa/b53.txt
+++ b/Documentation/devicetree/bindings/net/dsa/b53.txt
@@ -13,6 +13,9 @@ Required properties:
   "brcm,bcm5397"
   "brcm,bcm5398"
 
+  For the BCM11360 SoC, must be:
+  "brcm,bcm11360-srab" and the mandatory "brcm,cygnus-srab" string
+
   For the BCM5310x SoCs with an integrated switch, must be one of:
   "brcm,bcm53010-srab"
   "brcm,bcm53011-srab"
diff --git a/drivers/net/dsa/b53/b53_srab.c b/drivers/net/dsa/b53/b53_srab.c
index 8a62b6a69703..c37ffd1b6833 100644
--- a/drivers/net/dsa/b53/b53_srab.c
+++ b/drivers/net/dsa/b53/b53_srab.c
@@ -364,6 +364,7 @@ static const struct of_device_id b53_srab_of_match[] = {
{ .compatible = "brcm,bcm53018-srab" },
{ .compatible = "brcm,bcm53019-srab" },
{ .compatible = "brcm,bcm5301x-srab" },
+   { .compatible = "brcm,bcm11360-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
{ .compatible = "brcm,bcm58522-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
{ .compatible = "brcm,bcm58525-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
{ .compatible = "brcm,bcm58535-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
@@ -371,6 +372,7 @@ static const struct of_device_id b53_srab_of_match[] = {
{ .compatible = "brcm,bcm58623-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
{ .compatible = "brcm,bcm58625-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
{ .compatible = "brcm,bcm88312-srab", .data = (void *)BCM58XX_DEVICE_ID 
},
+   { .compatible = "brcm,cygnus-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ .compatible = "brcm,nsp-srab", .data = (void *)BCM58XX_DEVICE_ID },
{ /* sentinel */ },
 };
-- 
2.11.0



Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.

2017-05-17 Thread David Miller
From: Edward Cree 
Date: Wed, 17 May 2017 16:33:27 +0100

> So I did some experiments (in Python, script follows) and found that
>  indeed this does appear to work, at least for addition and shifts.
> The idea is that we have a 0s mask and a 1s mask; for bits that are
>  unknown, the 0s mask is set and the 1s mask is cleared.  So a
>  completely unknown variable has masks (~0, 0), then if you shift it
>  left 2 you get (~3, 0) - just shift both masks.  A constant x has
>  masks (x, x) - all the 0s are known 0s and all the 1s are known 1s.
> Addition is a bit more complicated: we compute the 'unknown bits'
>  mask, by XORing the 0s and 1s masks together, of each addend.  Then
>  we add the corresponding masks from each addend together, and force
>  the 'unknown' bits to the appropriate values in each mask.
> So given (a1, b1) and (a2, b2), we compute m1 = a1 ^ b1,
>  m2 = a2 ^ b2, and m = m1 | m2.  Then a = (a1 + a2) | m, and
>  b = (b1 + b2) & ~m.
> As a worked example, 2 + (x << 2) + 14:
>  2 => (2, 2) constant
>  x => (~0, 0) unknown
>  x << 2 => (~3, 0)
>  2 + (x << 2): add (2, 2) with (~3, 0)
> m1 = 0, m2 = ~3, m = ~3
> a = (2 + ~3) | ~3 = ~1 | ~3 = ~1
> b = (2 + 0) & ~~3 = 2 & 3 = 2
> so (~1, 2), which means "...xx10"
> now add 14: add (~1, 2) with (14, 14)
> m1 = ~3, m2 = 0, m = ~3
> a = (~1 + 14) | ~3 = 12 | ~3 = ~3
> b = (2 + 14) & ~~3 = 16 & 3 = 0
> so (~3, 0), which means "...xx00"
> and the result is 4-byte aligned.

Ok, I'm starting to digest this.  If it works it's a nice universal
way to handle alignment tracking, that's for sure.

So, in C, addition (a += b) is something like:

struct bpf_reg_bits {
u64 zero_bits;
u64 one_bits;
};

static void add_update_bits(struct bpf_reg_bits *a, struct bpf_reg_bits *b)
{
u64 m_zeros, m_ones, m_all;

m_zeros = a->zero_bits ^ b->zero_bits;
m_ones = a->one_bits ^ b->one_bits;
m_all = m_zeros | m_ones;

a->zero_bits = (a->zero_bits + b->zero_bits) | m_all;
a->one_bits = (a->one_bits + b->zero_bits) & ~m_all;
}

Then, is subtraction merely:

static void sub_update_bits(struct bpf_reg_bits *a, struct bpf_reg_bits *b)
{
u64 m_zeros, m_ones, m_all;

m_zeros = a->zero_bits ^ b->zero_bits;
m_ones = a->one_bits ^ b->one_bits;
m_all = m_zeros | m_ones;

a->zero_bits = (a->zero_bits - b->zero_bits) | m_all;
a->one_bits = (a->one_bits - b->zero_bits) & ~m_all;
}

Or is something different needed?

What about multiplication?  AND should be easy too.

BTW, we track something similar already in reg->imm which tracks how
many high order bits are known to be cleared in the register.  It is
used to avoid potential overflow for packet pointer accesses.  I bet
we can subsume that into this facility as well.

Thanks for all of your suggestions so far.


Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data

2017-05-17 Thread Bjorn Andersson
On Wed 17 May 05:53 PDT 2017, Pali Roh?r wrote:

> On Wednesday 17 May 2017 14:06:06 Johannes Berg wrote:
> > On Tue, 2017-05-16 at 01:13 +0200, Luis R. Rodriguez wrote:
> > > > > Now for N900 case there is a similar scenario
> > > > > alhtough it has additional requirement to go to user-space due to
> > > > > need to use a proprietary library to obtain the NVS calibration
> > > > > data. My thought: Why should firmware_class care?
> > > 
> > > Agreed.
> > 
> > In fact, why should the *driver* care either? IOW - why should
> > "request_firmware_prefer_user()" even exist?
> 
> There are default/example NVS data, which are stored in /lib/firmware
> and installed by linux-firmware package. Those example calibration data
> should not be used for real usage, but Pavel told us that on N900 they
> are enough for working WIFI connection. They does not contain valid MAC
> address, so kernel should generate some (random?).
> 
> So kernel driver should get NVS calibration data from userspace (which
> know how where to get or how to prepare them) and in case userspace do
> not have it, then we can try fallback to those example data (as people
> reported us they can be useful instead of non-working WIFI).
> 

We're going to see a similar case with the Qualcomm DB410c WiFi soon,
where there is default calibration for the chip (wcn3620) but specific
calibration data for the particular board or product using this chip.

As with your case we expect to have a "generic" calibration file
integrated in linux-firmware, but providing means to supporting
device-specific calibration is probably going to be requested shortly.


We have however altered the reference design of picking the MAC address
from the calibration data and have the bootloader pass it via DT - so
our calibration data doesn't need to be specific to each unit.

Regards,
Bjorn


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Doug Ledford
On 5/17/2017 6:37 PM, Parav Pandit wrote:
> Hi Doug,
> 

>> It would have been better with AF_INET/AF_INET6 and an option to enable
>> SMC than AF_SMC.  The first implementation simply assumes AF_INET in
>> the presence of AF_SMC.  When IPv6 support is added, some sort of
>> guessing logic will have to be put in place to try and determine if an
>> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
>> guaranteed way of telling.  Apps can use struct sockaddr_storage as their
>> normal element to stick the address into, and could rely on the kernel to
>> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
>> this breaks that.  The RFC gives *some* thought to adding IPv6 in the
>> future, but not a lot.  It may be that the answer is that in the future, IPv6
>> support is enabled by making the IPv6 API be
>> AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then I
>> would suggest making the later API specifically call out AF_INET +
>> setsockopt(SMC) be identical to AF_SMC.
>>
> 
> What are the shortcomings in my proposal [1] which I am reiterating below.
> Bart also suggested to define new stream protocol for SMC similar to SCTP.
> 
> (a) address family should be either AF_INET or AF_INET6
> (b) socket() API to introduce new protocol as PROTO_SMC for in socket() 
> system call.
> 
> With this there is no additional setsockop() needed.
> 
> With this - user space applications, do getaddrinfo() with hint as
> hints.ai_family  = AF_INET;
> hints.ai_socktype = SOCK_STREAM;
> 
> getaddrinfo() returns back both the protocols TCP and SMC.
> Famous database application such as Redis client iterates over all entries of 
> getaddrinfo() and establishes connection to servers.
> 
> There are few advantages of this interface.
> 1. No change in any makefile of applications needed - unless user wants to 
> specify explicitly that it wants to force SMC protocol.
> 2. No need to do LD_LIBRARY_PRELOAD, (which won't work anyway because bind() 
> connect() of SMC checks for AF_INET).
> 3. No major changes to glibc to process AF_SMC differently
> glibc references IPPROTO_TCP at 22 places. (compare to AF_INET at 140+ 
> places).
> A lot simpler test matrix for glibc for new protocol
> 5. No need to recompile applications, as long as getaddrinfo returns all 
> streaming protocols (TCP, SMC)
> 6. for applications to make use of setsockopt() it needs another knob and 
> hint from other places, which can be avoided because SMC TCP option 
> negotiates with remote end
> 
> And representing new protocol as new protocol for a given address family 
> appears correct, compare to setting socket options.
> 
> Tools like CRIU or similar tool might find a race conditions - when it 
> queries socket option, SMC was not set, but later on SMC was set, and does 
> incorrect handling.
> Setting socket() with SMC protocol makes it easier to understand in this area 
> as well.

I have no problem with the proposal in itself, but as IBM released this
publication and did their own implementation prior to submitting things
upstream, and as there might exist in the field implementations, it
depends on whether we wish to call those in the field implementations
experimental and break them as we go to a final implementation of
version 1, or if we consider version 1 baked.  I'm fine breaking it.
After all, that's what happened with XRC back in the day and Mellanox
learned a valuable lesson about upstream first.  I have no problem with
IBM learning that same lesson IMO.  So, I find your proposal, including
breaking the API of the version 1 implementation just taken into the
kernel before it has had time to fully sit and gel, acceptable.

But this is where we kind of need a judgment from on high, and why I
Cc:ed Linus on this thread.  Any input on this issue Linus?

> I have additional proposal for link groups, resource creation area. I will 
> take that up after this discussion.

Look forward to hearing it.

> [1] https://patchwork.kernel.org/patch/9719375/


-- 
Doug Ledford 
GPG Key ID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



signature.asc
Description: OpenPGP digital signature


Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread Daniel Borkmann

On 05/18/2017 01:48 AM, Yonghong Song wrote:

On 5/17/17 3:57 PM, Daniel Borkmann wrote:

On 05/18/2017 12:18 AM, Yonghong Song wrote:

Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for tools/testing/selftest/bpf
because of some missing types:
 $ make -C tools/testing/selftests/bpf/
 ...
 In file included from 
/home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
 ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name 
'__aligned_u64'
 __aligned_u64   key;
 ...
 /usr/include/linux/swab.h:160:8: error: unknown type name '__always_inline'
 static __always_inline __u16 __swab16p(const __u16 *p)
 ...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.

The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf selftests.")
Signed-off-by: Yonghong Song 


Can you elaborate why string.h specifically? Can't we define the
__always_inline ourselves?


Actually, yes, we can define __always_inline ourselves in each bpf program or 
in some common header files used by all selftests bpf
programs.

I use string.h because several other bpf programs
(test_xdp.c test_l4lb.c test_tcp_estats.c) they all have
__always_inline and string.h, so they do not have issue.
I discovered this pattern so I just add string.h into test_pkt_access.c
as well.


Ok, thanks for clarifying.


Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread Yonghong Song



On 5/17/17 3:57 PM, Daniel Borkmann wrote:

On 05/18/2017 12:18 AM, Yonghong Song wrote:

Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for 
tools/testing/selftest/bpf

because of some missing types:
 $ make -C tools/testing/selftests/bpf/
 ...
 In file included from 
/home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
 ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name 
'__aligned_u64'

 __aligned_u64   key;
 ...
 /usr/include/linux/swab.h:160:8: error: unknown type name 
'__always_inline'

 static __always_inline __u16 __swab16p(const __u16 *p)
 ...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.

The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf 
selftests.")

Signed-off-by: Yonghong Song 


Can you elaborate why string.h specifically? Can't we define the
__always_inline ourselves?


Actually, yes, we can define __always_inline ourselves in each bpf 
program or in some common header files used by all selftests bpf

programs.

I use string.h because several other bpf programs
(test_xdp.c test_l4lb.c test_tcp_estats.c) they all have
__always_inline and string.h, so they do not have issue.
I discovered this pattern so I just add string.h into test_pkt_access.c
as well.

string.h provides memset/memcpy prototypes which is used in bpf program 
(test_xdp.c and test_tcp_estats.c, but not test_l4lb.c and 
test_pkt_access.c).


Thanks,

Yonghong



Thanks,
Daniel


Re: [PATCH 2/2] arp: always override existing neigh entries with gratuitous ARP

2017-05-17 Thread Julian Anastasov

Hello,

On Wed, 17 May 2017, Ihar Hrachyshka wrote:

> Currently, when arp_accept is 1, we always override existing neigh
> entries with incoming gratuitous ARP replies. Otherwise, we override
> them only if new replies satisfy _locktime_ conditional (packets arrive
> not earlier than _locktime_ seconds since the last update to the neigh
> entry).
> 
> The idea behind locktime is to pick the very first (=> close) reply
> received in a unicast burst when ARP proxies are used. This helps to
> avoid ARP thrashing where Linux would switch back and forth from one
> proxy to another.
> 
> This logic has nothing to do with gratuitous ARP replies that are
> generally not aligned in time when multiple IP address carriers send
> them into network.
> 
> This patch enforces overriding of existing neigh entries by all incoming
> gratuitous ARP packets, irrespective of their time of arrival. This will
> make the kernel honour all incoming gratuitous ARP packets.
> 
> Signed-off-by: Ihar Hrachyshka 
> ---
>  net/ipv4/arp.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
> index 3f06201..97ea9d8 100644
> --- a/net/ipv4/arp.c
> +++ b/net/ipv4/arp.c
> @@ -858,16 +858,16 @@ static int arp_process(struct net *net, struct sock 
> *sk, struct sk_buff *skb)
>  
>   n = __neigh_lookup(_tbl, , dev, 0);
>  
> - if (IN_DEV_ARP_ACCEPT(in_dev)) {
> - unsigned int addr_type = inet_addr_type_dev_table(net, dev, 
> sip);
> -
> - /* Unsolicited ARP is not accepted by default.

Above line is not related to GARP. I think, it should
stay at its old place, see below. Also, in first patch, line
should be:

/* GARP _replies_ also require target hwaddr to be
 * the same as source.
 */

> -It is possible, that this option should be enabled for some
> -devices (strip is candidate)
> -  */
> + if (n || IN_DEV_ARP_ACCEPT(in_dev)) {
> + addr_type = inet_addr_type_dev_table(net, dev, sip);
>   is_garp = arp_is_garp(dev, addr_type, arp->ar_op,
> sip, tip, sha, tha);

There was something I didn't liked in the old code:
inet_addr_type_dev_table() can be slow and we should call it
only when needed. With some rearranging we can fix it, for
example:

1. arp_is_garp(net,..., int *addr_type) should use:

bool is_garp = tip == sip;

...
if (is_garp) {
*addr_type = inet_addr_type_dev_table(net, dev, sip);
if (*addr_type != RTN_UNICAST)
is_garp = false;
}
return is_garp;

> + }
>  
> + if (IN_DEV_ARP_ACCEPT(in_dev)) {
> + /* It is possible, that this option should be enabled for some
> +  * devices (strip is candidate)
> +  */
>   if (!n &&
>   ((arp->ar_op == htons(ARPOP_REPLY)  &&
>   addr_type == RTN_UNICAST) || is_garp))
> 

2. fill addr_type and do is_garp check first:

if (n || IN_DEV_ARP_ACCEPT(in_dev)) {
addr_type = -1;
is_garp = arp_is_garp(net, dev, addr_type, arp->ar_op,
  sip, tip, sha, tha, _type);
}


/* Unsolicited ARP is not accepted by default.
 * It is possible, that this option should be enabled for some
 * devices (strip is candidate)
 */
if (!n && IN_DEV_ARP_ACCEPT(in_dev) &&
(is_garp ||
 (arp->ar_op == htons(ARPOP_REPLY) &&
  (addr_type == RTN_UNICAST ||
   (addr_type < 0 &&
inet_addr_type_dev_table(net, dev, sip) == RTN_UNICAST)
n = __neigh_lookup(_tbl, , dev, 1);

As result, we will avoid the unneeded
inet_addr_type_dev_table() call for ARP requests (non-GARP)
which are too common to see. May be there is another way
to make this code more nice...

Regards

--
Julian Anastasov 


Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread David Miller
From: Daniel Borkmann 
Date: Thu, 18 May 2017 01:15:43 +0200

> Btw, 0day kernel testing bot is from now on running BPF selftests
> as well as part of their regression tests. Fengguang confirmed this
> today after integrating this.

That's great news.


Re: [patch net-next v4 06/10] net: sched: introduce helpers to work with filter chains

2017-05-17 Thread Joe Perches
On Wed, 2017-05-17 at 14:44 +0200, Jiri Pirko wrote:
> Wed, May 17, 2017 at 02:39:05PM CEST, j...@mojatatu.com wrote:
> > On 17-05-17 08:25 AM, Jiri Pirko wrote:
> > > Wed, May 17, 2017 at 02:18:00PM CEST, j...@mojatatu.com wrote:
> > > > On 17-05-17 05:07 AM, Jiri Pirko wrote:
> > > > > From: Jiri Pirko 
> > > > > 
> > > > > Introduce struct tcf_chain object and set of helpers around it. Wraps 
> > > > > up
> > > > > insertion, deletion and search in the filter chain.
> > > > > 
> > > > > Signed-off-by: Jiri Pirko 
> > > > > ---
> > > > 
> > > > [..]
> > > > > +
> > > > > +static void
> > > > > +tcf_chain_filter_chain_ptr_set(struct tcf_chain *chain,
> > > > > +struct tcf_proto __rcu **p_filter_chain)
> > > > > +
> > > > 
> > > > What are the rules for this? Common coding style is:
> > > > static void tcf_chain_filter_chain_ptr_set(struct tcf_chain *chain,
> > > >   struct tcf_proto ..
> > > 
> > > When this would not fit 80 cols (this case), you need to wrap the
> > > text in front of the function name. That is exacly what I did.
> > > 
> > 
> > That i understand.
> > The question is: what does scripture dictate on conflict?
> > Should a function signature always follow coding style and
> > allow for exceeding 80 chars or the 80 chars rules trumps?
> 
> Definitelly 80 chars rules trumps here.

Disagree.
80 columns is just a "strongly preferred" limit.

Clarity for a human reader trumps everything else.



Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread Daniel Borkmann

On 05/18/2017 01:01 AM, David Miller wrote:

From: Daniel Borkmann 
Date: Thu, 18 May 2017 00:57:04 +0200


On 05/18/2017 12:18 AM, Yonghong Song wrote:

Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for
tools/testing/selftest/bpf
because of some missing types:
  $ make -C tools/testing/selftests/bpf/
  ...
  In file included from
  /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
  ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name
  '__aligned_u64'
  __aligned_u64   key;
  ...
  /usr/include/linux/swab.h:160:8: error: unknown type name
  '__always_inline'
  static __always_inline __u16 __swab16p(const __u16 *p)
  ...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.

The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf
selftests.")
Signed-off-by: Yonghong Song 


Can you elaborate why string.h specifically? Can't we define the
__always_inline ourselves?


That way it comes from compiler.h

Probably it would have been better to have the BPF linux/types.h bring
it in.


Would have made it a bit more clear at least.


Sorry I applied this so quickly, I wanted this regression fixed as fast
as possible.


Ok, no problem.

Btw, 0day kernel testing bot is from now on running BPF selftests
as well as part of their regression tests. Fengguang confirmed this
today after integrating this.


Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread David Miller
From: Daniel Borkmann 
Date: Thu, 18 May 2017 00:57:04 +0200

> On 05/18/2017 12:18 AM, Yonghong Song wrote:
>> Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
>> for bpf selftests.") caused a build failure for
>> tools/testing/selftest/bpf
>> because of some missing types:
>>  $ make -C tools/testing/selftests/bpf/
>>  ...
>>  In file included from
>>  /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
>>  ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name
>>  '__aligned_u64'
>>  __aligned_u64   key;
>>  ...
>>  /usr/include/linux/swab.h:160:8: error: unknown type name
>>  '__always_inline'
>>  static __always_inline __u16 __swab16p(const __u16 *p)
>>  ...
>> The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.
>>
>> The fix is to copy missing type definition into
>> tools/testing/selftests/bpf/include/uapi/linux/types.h.
>> Adding additional include "string.h" resolves __always_inline issue.
>>
>> Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf
>> selftests.")
>> Signed-off-by: Yonghong Song 
> 
> Can you elaborate why string.h specifically? Can't we define the
> __always_inline ourselves?

That way it comes from compiler.h

Probably it would have been better to have the BPF linux/types.h bring
it in.

Sorry I applied this so quickly, I wanted this regression fixed as fast
as possible.


Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread Daniel Borkmann

On 05/18/2017 12:18 AM, Yonghong Song wrote:

Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for tools/testing/selftest/bpf
because of some missing types:
 $ make -C tools/testing/selftests/bpf/
 ...
 In file included from 
/home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
 ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name 
'__aligned_u64'
 __aligned_u64   key;
 ...
 /usr/include/linux/swab.h:160:8: error: unknown type name '__always_inline'
 static __always_inline __u16 __swab16p(const __u16 *p)
 ...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.

The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf selftests.")
Signed-off-by: Yonghong Song 


Can you elaborate why string.h specifically? Can't we define the
__always_inline ourselves?

Thanks,
Daniel


[PATCH] ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated

2017-05-17 Thread I-Dawson, Peter A
This fix addresses two problems in the way the DSCP field is formulated on the
 encapsulating header of IPv6 tunnels.
This fix addresses Bug 195661. 
https://bugzilla.kernel.org/show_bug.cgi?id=195661 

1) The IPv6 tunneling code was manipulating the DSCP field of the encapsulating
 packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it
 was incorrect to assume that the upper 12b containing the DSCP and ECN fields
 would remain intact when formulating the encapsulating header. This fix
 handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the extant
 dsfield u8 variable.

2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect
 and resulted in the DSCP value always being set to 0.
---
 net/ipv6/ip6_gre.c| 18 ++
 net/ipv6/ip6_tunnel.c | 28 +---
 2 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 8d128ba..42f51fe 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -537,11 +537,11 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, 
struct net_device *dev)
 
memcpy(, >fl.u.ip6, sizeof(fl6));
 
-   dsfield = ipv4_get_dsfield(iph);
-
-   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-   fl6.flowlabel |= htonl((__u32)iph->tos << IPV6_TCLASS_SHIFT)
- & IPV6_TCLASS_MASK;
+   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS) {
+   dsfield = ipv4_get_dsfield(iph);
+   } else {
+   dsfield = ip6_tclass(t->parms.flowinfo);
+   }
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
fl6.flowi6_mark = skb->mark;
else
@@ -598,9 +598,11 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, 
struct net_device *dev)
 
memcpy(, >fl.u.ip6, sizeof(fl6));
 
-   dsfield = ipv6_get_dsfield(ipv6h);
-   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-   fl6.flowlabel |= (*(__be32 *) ipv6h & IPV6_TCLASS_MASK);
+   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS) {
+   dsfield = ipv6_get_dsfield(ipv6h);
+   } else {
+   dsfield = ip6_tclass(t->parms.flowinfo);
+}
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FLOWLABEL)
fl6.flowlabel |= ip6_flowlabel(ipv6h);
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 6eb2ae5..4d45195 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1196,8 +1196,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev, __u8 dsfield,
skb_push(skb, sizeof(struct ipv6hdr));
skb_reset_network_header(skb);
ipv6h = ipv6_hdr(skb);
-   ip6_flow_hdr(ipv6h, INET_ECN_encapsulate(0, dsfield),
-ip6_make_flowlabel(net, skb, fl6->flowlabel, true, fl6));
+   ip6_flow_hdr(ipv6h, dsfield, ip6_make_flowlabel(net, skb, 
fl6->flowlabel, true, fl6));
ipv6h->hop_limit = hop_limit;
ipv6h->nexthdr = proto;
ipv6h->saddr = fl6->saddr;
@@ -1231,8 +1230,6 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
if (tproto != IPPROTO_IPIP && tproto != 0)
return -1;
 
-   dsfield = ipv4_get_dsfield(iph);
-
if (t->parms.collect_md) {
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
@@ -1246,6 +1243,7 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
fl6.flowi6_proto = IPPROTO_IPIP;
fl6.daddr = key->u.ipv6.dst;
fl6.flowlabel = key->label;
+   dsfield = ip6_tclass(key->label);
} else {
if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
encap_limit = t->parms.encap_limit;
@@ -1253,9 +1251,11 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
memcpy(, >fl.u.ip6, sizeof(fl6));
fl6.flowi6_proto = IPPROTO_IPIP;
 
-   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-   fl6.flowlabel |= htonl((__u32)iph->tos << 
IPV6_TCLASS_SHIFT)
-& IPV6_TCLASS_MASK;
+   if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS) {
+   dsfield = ipv4_get_dsfield(iph);
+   } else {
+   dsfield = ip6_tclass(t->parms.flowinfo);
+   }
if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
fl6.flowi6_mark = skb->mark;
else
@@ -1267,6 +1267,8 @@ ip4ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev)
if (iptunnel_handle_offloads(skb, SKB_GSO_IPXIP6))
return -1;
 
+   dsfield = INET_ECN_encapsulate(dsfield, ipv4_get_dsfield(iph));
+
skb_set_inner_ipproto(skb, IPPROTO_IPIP);
 
err = ip6_tnl_xmit(skb, dev, dsfield, , 

Re: [PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread David Miller
From: Yonghong Song 
Date: Wed, 17 May 2017 15:18:05 -0700

> Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
> for bpf selftests.") caused a build failure for tools/testing/selftest/bpf
> because of some missing types:
> $ make -C tools/testing/selftests/bpf/
> ...
> In file included from 
> /home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
> ../../../include/uapi/linux/bpf.h:170:3: error: unknown type name 
> '__aligned_u64'
> __aligned_u64   key;
> ...
> /usr/include/linux/swab.h:160:8: error: unknown type name 
> '__always_inline'
> static __always_inline __u16 __swab16p(const __u16 *p)
> ...
> The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.
> 
> The fix is to copy missing type definition into
> tools/testing/selftests/bpf/include/uapi/linux/types.h.
> Adding additional include "string.h" resolves __always_inline issue.
> 
> Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf 
> selftests.")
> Signed-off-by: Yonghong Song 

Applied, thank you.


RE: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Parav Pandit
Hi Doug,

> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Doug Ledford
> Sent: Wednesday, May 17, 2017 3:38 PM
> To: David Miller 
> Cc: bart.vanass...@sandisk.com; torva...@linux-foundation.org;
> h...@lst.de; netdev@vger.kernel.org; linux-r...@vger.kernel.org;
> sta...@vger.kernel.org; ubr...@linux.vnet.ibm.com
> Subject: Re: [PATCH] net/smc: mark as BROKEN due to remote memory
> exposure
> 
> On Tue, 2017-05-16 at 15:28 -0400, Doug Ledford wrote:
> > I hadn't realized EXPERIMENTAL was gone.  Which is too bad, because
> > that's entirely appropriate in this case, and would have had the
> > desired side effect of keeping it out of any non-cutting edge distros
> > and warning people of possible API changes.  With EXPERIMENTAL gone,
> > the closest thing we have is drivers/staging, since that tends to
> > imply some of the same consequences.  I know you think BROKEN is
> > overly harsh, but I'm not sure we should just do nothing.  How about
> > we take a few days to let some of the RDMA people closely review the
> > 143 page
> > (egads!) rfc (http://www.rfc-editor.org/info/rfc7609) to see if we
> > think it can be fixed to use multiple link layers with the existing
> > API in place or if it will require something other than AF_SMC.  If we
> > need to break API, then I think we should either fix it ASAP and send
> > that fix to the 4.11 stable series (which probably violates the
> > normative stable patch size/scope) or if the fix will take longer than
> > this kernel cycle, then move it to staging both here and in 4.11
> > stable, and fix it there and then move it back.  Something like that
> > would prevent the kind of API flappage we ought not do
> 
> So, I've skimmed the entire RFC, focusing on things were I needed to.
>  Here's my take on it:
> 
> It would have been better with AF_INET/AF_INET6 and an option to enable
> SMC than AF_SMC.  The first implementation simply assumes AF_INET in
> the presence of AF_SMC.  When IPv6 support is added, some sort of
> guessing logic will have to be put in place to try and determine if an
> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
> guaranteed way of telling.  Apps can use struct sockaddr_storage as their
> normal element to stick the address into, and could rely on the kernel to
> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
> this breaks that.  The RFC gives *some* thought to adding IPv6 in the
> future, but not a lot.  It may be that the answer is that in the future, IPv6
> support is enabled by making the IPv6 API be
> AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then I
> would suggest making the later API specifically call out AF_INET +
> setsockopt(SMC) be identical to AF_SMC.
> 

What are the shortcomings in my proposal [1] which I am reiterating below.
Bart also suggested to define new stream protocol for SMC similar to SCTP.

(a) address family should be either AF_INET or AF_INET6
(b) socket() API to introduce new protocol as PROTO_SMC for in socket() system 
call.

With this there is no additional setsockop() needed.

With this - user space applications, do getaddrinfo() with hint as
hints.ai_family  = AF_INET;
hints.ai_socktype = SOCK_STREAM;

getaddrinfo() returns back both the protocols TCP and SMC.
Famous database application such as Redis client iterates over all entries of 
getaddrinfo() and establishes connection to servers.

There are few advantages of this interface.
1. No change in any makefile of applications needed - unless user wants to 
specify explicitly that it wants to force SMC protocol.
2. No need to do LD_LIBRARY_PRELOAD, (which won't work anyway because bind() 
connect() of SMC checks for AF_INET).
3. No major changes to glibc to process AF_SMC differently
glibc references IPPROTO_TCP at 22 places. (compare to AF_INET at 140+ places).
A lot simpler test matrix for glibc for new protocol
5. No need to recompile applications, as long as getaddrinfo returns all 
streaming protocols (TCP, SMC)
6. for applications to make use of setsockopt() it needs another knob and hint 
from other places, which can be avoided because SMC TCP option negotiates with 
remote end

And representing new protocol as new protocol for a given address family 
appears correct, compare to setting socket options.

Tools like CRIU or similar tool might find a race conditions - when it queries 
socket option, SMC was not set, but later on SMC was set, and does incorrect 
handling.
Setting socket() with SMC protocol makes it easier to understand in this area 
as well.

I have additional proposal for link groups, resource creation area. I will take 
that up after this discussion.

[1] https://patchwork.kernel.org/patch/9719375/

> The protocol included a version header in the negotation messages.
>  This is good as it allows us to almost immediately start work on version 2
> that fixes the 

Re: [Patch RFC net-next] net: usb: r8152: Fix rx_bytes/tx_bytes to include FCS

2017-05-17 Thread Florian Fainelli
On 05/17/2017 01:23 PM, Andrew Lunn wrote:
> The statistics counters rx_bytes and tx_bytes don't include the
> Ethernet Frame Check Sequence. The hardware appears to calculate it on
> send, so it is not part of the URB passed to the device, and on
> receive it was not being copied into the skb.
> 
> Fix the rx_bytes/tx_bytes counters, and copy the FCS into the skb so
> tools like wireshark can see it.
> 
> Signed-off-by: Andrew Lunn 
> ---
> RFC for the following points:
> 
> I run automatic tests on Ethernet switches supported by DSA. I have a
> test host with many Ethernet interfaces which i connect to switch
> ports. Some are quad Ethernet PCIe cards, using the Intel e1000e
> driver. Some are USB-Ethernet dongles using the asix driver. And i
> have some USB-Ethernet dongles using the r8152. The e1000e, asix and
> DSA itself all give consistent rx_bytes and tx_bytes statistics. The
> r8152 consistently returns counters which are 4 bytes per frame lower,
> which results in some test failures.
> 
> Is it defined somewhere what should be included in rx_bytes/tx_bytes?
> Should the FCS be included?
> 
> This is considered a user API change? The meaning of the counters are
> going to be slightly different after this patch. I could be breaking
> somebody else's tests :-(

I am afraid that we won't be able to enforce a consistent behavior,
because the HW itself is not consistent, both on the NIC and on the
switch side.

For instance, whether your switch's MIB counters do include FCS, Marvell
tags, or not is going to be highly dependent on how the HW is designed,
whether the MIB counter logic is before, or after in the packet ingress
path.

Some NICs also allow passing the CRC frame up the stack (e.g:
NETIF_F_RXFCS), in which case, we should probably report the FCS as part
of skb->len, because that's what the stack is going to be given.

So I am afraid that as far as your tests are going to work, you will
need to have some logic that knows what kind of switch driver/HW is
used, whether to expect FCS/tag lengths reported, just like you will
have to have the same thing on the initiator of the tests

> ---
>  drivers/net/usb/r8152.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index ddc62cb69be8..4081a2cd8b1b 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -1228,7 +1228,7 @@ static void write_bulk_callback(struct urb *urb)
>   stats->tx_errors += agg->skb_num;
>   } else {
>   stats->tx_packets += agg->skb_num;
> - stats->tx_bytes += agg->skb_len;
> + stats->tx_bytes += agg->skb_len + CRC_SIZE;
>   }
> 
>   spin_lock(>tx_lock);
> @@ -1826,7 +1826,6 @@ static int rx_bottom(struct r8152 *tp, int budget)
>   if (urb->actual_length < len_used)
>   break;
> 
> - pkt_len -= CRC_SIZE;
>   rx_data += sizeof(struct rx_desc);
> 
>   skb = napi_alloc_skb(napi, pkt_len);
> @@ -1850,7 +1849,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
>   }
> 
>  find_next_rx:
> - rx_data = rx_agg_align(rx_data + pkt_len + CRC_SIZE);
> + rx_data = rx_agg_align(rx_data + pkt_len);
>   rx_desc = (struct rx_desc *)rx_data;
>   len_used = (int)(rx_data - (u8 *)agg->head);
>   len_used += sizeof(struct rx_desc);
> --
> 2.11.0
> ---
>  drivers/net/usb/r8152.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index ddc62cb69be8..4081a2cd8b1b 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -1228,7 +1228,7 @@ static void write_bulk_callback(struct urb *urb)
>   stats->tx_errors += agg->skb_num;
>   } else {
>   stats->tx_packets += agg->skb_num;
> - stats->tx_bytes += agg->skb_len;
> + stats->tx_bytes += agg->skb_len + CRC_SIZE;
>   }
>  
>   spin_lock(>tx_lock);
> @@ -1826,7 +1826,6 @@ static int rx_bottom(struct r8152 *tp, int budget)
>   if (urb->actual_length < len_used)
>   break;
>  
> - pkt_len -= CRC_SIZE;
>   rx_data += sizeof(struct rx_desc);
>  
>   skb = napi_alloc_skb(napi, pkt_len);
> @@ -1850,7 +1849,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
>   }
>  
>  find_next_rx:
> - rx_data = rx_agg_align(rx_data + pkt_len + CRC_SIZE);
> + rx_data = rx_agg_align(rx_data + pkt_len);
>   rx_desc = (struct rx_desc *)rx_data;
>   len_used = (int)(rx_data - (u8 *)agg->head);
>   len_used += sizeof(struct rx_desc);
> 


-- 

[PATCH] r8152: Mark usb_ocp_read() as __maybe_unused

2017-05-17 Thread Matthias Kaehlcke
The function is not used, but is probably kept around for debugging and
symmetry with usb_ocp_write(). Adding the attribute fixes the following
clang warning:

drivers/net/usb/r8152.c:825:5: error: unused function 'usb_ocp_read'
[-Werror,-Wunused-function]

Signed-off-by: Matthias Kaehlcke 
---
 drivers/net/usb/r8152.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index ddc62cb69be8..12776230ab96 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -840,7 +840,7 @@ int pla_ocp_write(struct r8152 *tp, u16 index, u16 byteen, 
u16 size, void *data)
return generic_ocp_write(tp, index, byteen, size, data, MCU_TYPE_PLA);
 }
 
-static inline
+static inline __maybe_unused
 int usb_ocp_read(struct r8152 *tp, u16 index, u16 size, void *data)
 {
return generic_ocp_read(tp, index, size, data, MCU_TYPE_USB);
-- 
2.13.0.303.g4ebf302169-goog



[PATCH] net1080: Mark nc_dump_ttl() as __maybe_unused

2017-05-17 Thread Matthias Kaehlcke
The function is not used, but it looks useful for debugging. Adding the
attribute fixes the following clang warning:

drivers/net/usb/net1080.c:271:20: error: unused function
'nc_dump_ttl' [-Werror,-Wunused-function]

Signed-off-by: Matthias Kaehlcke 
---
 drivers/net/usb/net1080.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/net1080.c b/drivers/net/usb/net1080.c
index 4cbdb1307f3e..7ade2119f462 100644
--- a/drivers/net/usb/net1080.c
+++ b/drivers/net/usb/net1080.c
@@ -268,7 +268,7 @@ static inline void nc_dump_status(struct usbnet *dev, u16 
status)
 #defineTTL_OTHER(ttl)  (0x00ff & (ttl >> 8))
 #define MK_TTL(this,other) ((u16)(((other)<<8)|(0x00ff&(this
 
-static inline void nc_dump_ttl(struct usbnet *dev, u16 ttl)
+static inline void __maybe_unused nc_dump_ttl(struct usbnet *dev, u16 ttl)
 {
netif_dbg(dev, link, dev->net, "net1080 %s-%s ttl 0x%x this = %d, other 
= %d\n",
  dev->udev->bus->bus_name, dev->udev->devpath,
-- 
2.13.0.303.g4ebf302169-goog



[PATCH net v2] selftests/bpf: fix broken build due to types.h

2017-05-17 Thread Yonghong Song
Commit 0a5539f66133 ("bpf: Provide a linux/types.h override
for bpf selftests.") caused a build failure for tools/testing/selftest/bpf
because of some missing types:
$ make -C tools/testing/selftests/bpf/
...
In file included from 
/home/yhs/work/net-next/tools/testing/selftests/bpf/test_pkt_access.c:8:
../../../include/uapi/linux/bpf.h:170:3: error: unknown type name 
'__aligned_u64'
__aligned_u64   key;
...
/usr/include/linux/swab.h:160:8: error: unknown type name '__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
...
The type __aligned_u64 is defined in linux:include/uapi/linux/types.h.

The fix is to copy missing type definition into
tools/testing/selftests/bpf/include/uapi/linux/types.h.
Adding additional include "string.h" resolves __always_inline issue.

Fixes: 0a5539f66133 ("bpf: Provide a linux/types.h override for bpf selftests.")
Signed-off-by: Yonghong Song 
---
 tools/testing/selftests/bpf/include/uapi/linux/types.h | 16 
 tools/testing/selftests/bpf/test_pkt_access.c  |  1 +
 2 files changed, 17 insertions(+)

diff --git a/tools/testing/selftests/bpf/include/uapi/linux/types.h 
b/tools/testing/selftests/bpf/include/uapi/linux/types.h
index fbd16a7..5184184 100644
--- a/tools/testing/selftests/bpf/include/uapi/linux/types.h
+++ b/tools/testing/selftests/bpf/include/uapi/linux/types.h
@@ -3,4 +3,20 @@
 
 #include 
 
+/* copied from linux:include/uapi/linux/types.h */
+#define __bitwise
+typedef __u16 __bitwise __le16;
+typedef __u16 __bitwise __be16;
+typedef __u32 __bitwise __le32;
+typedef __u32 __bitwise __be32;
+typedef __u64 __bitwise __le64;
+typedef __u64 __bitwise __be64;
+
+typedef __u16 __bitwise __sum16;
+typedef __u32 __bitwise __wsum;
+
+#define __aligned_u64 __u64 __attribute__((aligned(8)))
+#define __aligned_be64 __be64 __attribute__((aligned(8)))
+#define __aligned_le64 __le64 __attribute__((aligned(8)))
+
 #endif /* _UAPI_LINUX_TYPES_H */
diff --git a/tools/testing/selftests/bpf/test_pkt_access.c 
b/tools/testing/selftests/bpf/test_pkt_access.c
index 39387bb..6e11ba1 100644
--- a/tools/testing/selftests/bpf/test_pkt_access.c
+++ b/tools/testing/selftests/bpf/test_pkt_access.c
@@ -5,6 +5,7 @@
  * License as published by the Free Software Foundation.
  */
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.9.3



[net-realtek-btcoexist] question about identical code for different branches

2017-05-17 Thread Gustavo A. R. Silva


Hello everybody,

While looking into Coverity ID 1362263 I ran into the following piece  
of code at  
drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:1000:


1000void exhalbtc_set_ant_num(struct rtl_priv *rtlpriv, u8 type, u8 ant_num)
1001{
1002if (BT_COEX_ANT_TYPE_PG == type) {
1003gl_bt_coexist.board_info.pg_ant_num = ant_num;
1004gl_bt_coexist.board_info.btdm_ant_num = ant_num;
1005/* The antenna position:
1006 * Main (default) or Aux for pgAntNum=2 && btdmAntNum =1.
1007 * The antenna position should be determined by
1008 * auto-detect mechanism.
1009 * The following is assumed to main,
1010 * and those must be modified
1011 * if y auto-detect mechanism is ready
1012 */
1013if ((gl_bt_coexist.board_info.pg_ant_num == 2) &&
1014(gl_bt_coexist.board_info.btdm_ant_num == 1))
1015gl_bt_coexist.board_info.btdm_ant_pos =
1016
BTC_ANTENNA_AT_MAIN_PORT;

1017else
1018gl_bt_coexist.board_info.btdm_ant_pos =
1019
BTC_ANTENNA_AT_MAIN_PORT;

1020} else if (BT_COEX_ANT_TYPE_ANTDIV == type) {
1021gl_bt_coexist.board_info.btdm_ant_num = ant_num;
1022gl_bt_coexist.board_info.btdm_ant_pos =
1023
BTC_ANTENNA_AT_MAIN_PORT;

1024} else if (type == BT_COEX_ANT_TYPE_DETECTED) {
1025gl_bt_coexist.board_info.btdm_ant_num = ant_num;
1026if (rtlpriv->cfg->mod_params->ant_sel == 1)
1027gl_bt_coexist.board_info.btdm_ant_pos =
1028BTC_ANTENNA_AT_AUX_PORT;
1029else
1030gl_bt_coexist.board_info.btdm_ant_pos =
1031BTC_ANTENNA_AT_MAIN_PORT;
1032}
1033}

The issue is that lines of code 1015-1016 and 1018-1019 are identical  
for different branches.


My question here is if one of those assignments should be modified, or  
the entire _if_ statement replaced and the function refactored?


I'd really appreciate any comment on this.

Thank you!
--
Gustavo A. R. Silva






[net-intel-i40e] question about assignment overwrite

2017-05-17 Thread Gustavo A. R. Silva


Hello everybody,

While looking into Coverity ID 1408956 I ran into the following piece  
of code at drivers/net/ethernet/intel/i40e/i40e_main.c:8807:


8807if (pf->hw.mac.type == I40E_MAC_X722) {
8808pf->flags |= I40E_FLAG_RSS_AQ_CAPABLE
8809 | I40E_FLAG_128_QP_RSS_CAPABLE
8810 | I40E_FLAG_HW_ATR_EVICT_CAPABLE
8811 | I40E_FLAG_OUTER_UDP_CSUM_CAPABLE
8812 | I40E_FLAG_WB_ON_ITR_CAPABLE
8813 | I40E_FLAG_MULTIPLE_TCP_UDP_RSS_PCTYPE
8814 | I40E_FLAG_NO_PCI_LINK_CHECK
8815 | I40E_FLAG_USE_SET_LLDP_MIB
8816 | I40E_FLAG_GENEVE_OFFLOAD_CAPABLE
8817 | I40E_FLAG_PTP_L4_CAPABLE
8818 | I40E_FLAG_WOL_MC_MAGIC_PKT_WAKE;
8819} else if ((pf->hw.aq.api_maj_ver > 1) ||
8820   ((pf->hw.aq.api_maj_ver == 1) &&
8821(pf->hw.aq.api_min_ver > 4))) {
8822/* Supported in FW API version higher than 1.4 */
8823pf->flags |= I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
8824pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
8825} else {
8826pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
8827}

The issue here is that the assignment at line 8823 is overwritten by  
the code at line 8824.


I'm suspicious that line 8824 should be remove and a patch like the  
following can be applied:


index d5c9c9e..48ffa73 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -8821,7 +8821,6 @@ static int i40e_sw_init(struct i40e_pf *pf)
(pf->hw.aq.api_min_ver > 4))) {
/* Supported in FW API version higher than 1.4 */
pf->flags |= I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
-   pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
} else {
pf->flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
}

What do you think?

Thanks!
--
Gustavo A. R. Silva






Re: [PATCH net-next] drivers: net: xgene: Check all RGMII phy mode variants

2017-05-17 Thread Florian Fainelli
On 05/17/2017 02:29 PM, Iyappan Subramanian wrote:
> On Wed, May 17, 2017 at 1:26 PM, Andrew Lunn  wrote:
>>> +bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev)
>>> +{
>>> + struct xgene_enet_pdata *pdata = netdev_priv(ndev);
>>> + int phy_mode = pdata->phy_mode;
>>> + bool ret;
>>> +
>>> + ret = phy_mode == PHY_INTERFACE_MODE_RGMII ||
>>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_ID ||
>>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
>>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_TXID;
>>> +
>>> + return ret;
>>> +}
>>
>> include/linux/phy.h:
>>
>> /**
>>  * phy_interface_is_rgmii - Convenience function for testing if a PHY 
>> interface
>>  * is RGMII (all variants)
>>  * @phydev: the phy_device struct
>>  */
>> static inline bool phy_interface_is_rgmii(struct phy_device *phydev)
>> {
>> return phydev->interface >= PHY_INTERFACE_MODE_RGMII &&
>> phydev->interface <= PHY_INTERFACE_MODE_RGMII_TXID;
>> };
> 
> Thanks.  I'll use this helper function.

If you can use it, that's great, the reason why I did not recommend
using it before was because it takes a phydev reference and your code
clearly could have run before connecting to the PHY device.

Alternatively, we could introduce a helper function that just checks a
phy_interface_t type, something like:

static inline bool phy_interface_is_rgmii(phy_interface_t mode)
{
...
}

and introduce:

static inline bool phydev_interface_is_rgmii(struct phy_device *phydev)
{
return phy_interface_is_rgmii(phydev->interface);
}

which would use this helper function internally. Or just drop the second
helper, and always pass phydev->interface where needed.
-- 
Florian


[PATCH 1/2] arp: decompose is_garp logic into a separate function

2017-05-17 Thread Ihar Hrachyshka
The code is quite involving already to earn a separate function for
itself. If anything, it helps arp_process readability.

Signed-off-by: Ihar Hrachyshka 
---
 net/ipv4/arp.c | 35 +++
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index d54345a..3f06201 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -641,6 +641,27 @@ void arp_xmit(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(arp_xmit);
 
+static bool arp_is_garp(struct net_device *dev, int addr_type,
+   __be16 ar_op,
+   __be32 sip, __be32 tip,
+   unsigned char *sha, unsigned char *tha)
+{
+   bool is_garp = tip == sip && addr_type == RTN_UNICAST;
+
+   /* Unsolicited ARP _replies_ also require target hwaddr to be
+* the same as source.
+*/
+   if (is_garp && ar_op == htons(ARPOP_REPLY))
+   is_garp =
+   /* IPv4 over IEEE 1394 doesn't provide target
+* hardware address field in its ARP payload.
+*/
+   tha &&
+   !memcmp(tha, sha, dev->addr_len);
+
+   return is_garp;
+}
+
 /*
  * Process an arp request.
  */
@@ -844,18 +865,8 @@ static int arp_process(struct net *net, struct sock *sk, 
struct sk_buff *skb)
   It is possible, that this option should be enabled for some
   devices (strip is candidate)
 */
-   is_garp = tip == sip && addr_type == RTN_UNICAST;
-
-   /* Unsolicited ARP _replies_ also require target hwaddr to be
-* the same as source.
-*/
-   if (is_garp && arp->ar_op == htons(ARPOP_REPLY))
-   is_garp =
-   /* IPv4 over IEEE 1394 doesn't provide target
-* hardware address field in its ARP payload.
-*/
-   tha &&
-   !memcmp(tha, sha, dev->addr_len);
+   is_garp = arp_is_garp(dev, addr_type, arp->ar_op,
+ sip, tip, sha, tha);
 
if (!n &&
((arp->ar_op == htons(ARPOP_REPLY)  &&
-- 
2.9.3



[PATCH 2/2] arp: always override existing neigh entries with gratuitous ARP

2017-05-17 Thread Ihar Hrachyshka
Currently, when arp_accept is 1, we always override existing neigh
entries with incoming gratuitous ARP replies. Otherwise, we override
them only if new replies satisfy _locktime_ conditional (packets arrive
not earlier than _locktime_ seconds since the last update to the neigh
entry).

The idea behind locktime is to pick the very first (=> close) reply
received in a unicast burst when ARP proxies are used. This helps to
avoid ARP thrashing where Linux would switch back and forth from one
proxy to another.

This logic has nothing to do with gratuitous ARP replies that are
generally not aligned in time when multiple IP address carriers send
them into network.

This patch enforces overriding of existing neigh entries by all incoming
gratuitous ARP packets, irrespective of their time of arrival. This will
make the kernel honour all incoming gratuitous ARP packets.

Signed-off-by: Ihar Hrachyshka 
---
 net/ipv4/arp.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 3f06201..97ea9d8 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -858,16 +858,16 @@ static int arp_process(struct net *net, struct sock *sk, 
struct sk_buff *skb)
 
n = __neigh_lookup(_tbl, , dev, 0);
 
-   if (IN_DEV_ARP_ACCEPT(in_dev)) {
-   unsigned int addr_type = inet_addr_type_dev_table(net, dev, 
sip);
-
-   /* Unsolicited ARP is not accepted by default.
-  It is possible, that this option should be enabled for some
-  devices (strip is candidate)
-*/
+   if (n || IN_DEV_ARP_ACCEPT(in_dev)) {
+   addr_type = inet_addr_type_dev_table(net, dev, sip);
is_garp = arp_is_garp(dev, addr_type, arp->ar_op,
  sip, tip, sha, tha);
+   }
 
+   if (IN_DEV_ARP_ACCEPT(in_dev)) {
+   /* It is possible, that this option should be enabled for some
+* devices (strip is candidate)
+*/
if (!n &&
((arp->ar_op == htons(ARPOP_REPLY)  &&
addr_type == RTN_UNICAST) || is_garp))
-- 
2.9.3



[PATCH 0/2] arp: always override existing neigh entries with gratuitous ARP

2017-05-17 Thread Ihar Hrachyshka
Good day.

This patchset is spurred by discussion started at
https://patchwork.ozlabs.org/patch/760372/ where we figured that there is no
real reason for enforcing override by gratuitous ARP packets only when
arp_accept is 1. Same should happen when it's 0 (the default value).

The first patch in the series moves is_garp code into a separate function
preparing the code base for the 2nd patch that actually implements the needed
change.

Ihar Hrachyshka (2):
  arp: decompose is_garp logic into a separate function
  arp: always override existing neigh entries with gratuitous ARP

 net/ipv4/arp.c | 47 +--
 1 file changed, 29 insertions(+), 18 deletions(-)

-- 
2.9.3



[PATCH net-next 5/6] net: bridge: get msgtype from nlmsghdr in mdb ops

2017-05-17 Thread Vivien Didelot
Retrieve the message type from the nlmsghdr structure instead of
hardcoding it in both br_mdb_add and br_mdb_del.

Signed-off-by: Vivien Didelot 
---
 net/bridge/br_mdb.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index a72d5e6f339f..d280b20587cb 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -569,6 +569,7 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr 
*nlh,
struct net_bridge_port *p;
struct net_bridge_vlan *v;
struct net_bridge *br;
+   int msgtype = nlh->nlmsg_type;
int err;
 
err = br_mdb_parse(skb, nlh, , );
@@ -595,12 +596,12 @@ static int br_mdb_add(struct sk_buff *skb, struct 
nlmsghdr *nlh,
if (br_vlan_enabled(br) && vg && entry->vid == 0) {
list_for_each_entry(v, >vlan_list, vlist) {
entry->vid = v->vid;
-   err = __br_mdb_do(p, entry, RTM_NEWMDB);
+   err = __br_mdb_do(p, entry, msgtype);
if (err)
break;
}
} else {
-   err = __br_mdb_do(p, entry, RTM_NEWMDB);
+   err = __br_mdb_do(p, entry, msgtype);
}
 
return err;
@@ -677,6 +678,7 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh,
struct net_bridge_port *p;
struct net_bridge_vlan *v;
struct net_bridge *br;
+   int msgtype = nlh->nlmsg_type;
int err;
 
err = br_mdb_parse(skb, nlh, , );
@@ -703,12 +705,12 @@ static int br_mdb_del(struct sk_buff *skb, struct 
nlmsghdr *nlh,
if (br_vlan_enabled(br) && vg && entry->vid == 0) {
list_for_each_entry(v, >vlan_list, vlist) {
entry->vid = v->vid;
-   err = __br_mdb_do(p, entry, RTM_DELMDB);
+   err = __br_mdb_do(p, entry, msgtype);
if (err)
break;
}
} else {
-   err = __br_mdb_do(p, entry, RTM_DELMDB);
+   err = __br_mdb_do(p, entry, msgtype);
}
 
return err;
-- 
2.13.0



[PATCH net-next 2/6] net: bridge: check multicast bridge only once

2017-05-17 Thread Vivien Didelot
__br_mdb_add and __br_mdb_del both check that the bridge is up and
running and that multicast is enabled on it before doing any operation.

Since they can be called multiple times by br_mdb_add and br_mdb_del,
move the check in their caller functions so that it is done just once.

Signed-off-by: Vivien Didelot 
---
 net/bridge/br_mdb.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index bef0331f46a4..d20a01622b20 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -548,9 +548,6 @@ static int __br_mdb_add(struct net_bridge_port *p, struct 
br_mdb_entry *entry)
struct br_ip ip;
int ret;
 
-   if (!netif_running(br->dev) || br->multicast_disabled)
-   return -EINVAL;
-
__mdb_entry_to_br_ip(entry, );
 
spin_lock_bh(>multicast_lock);
@@ -577,6 +574,9 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr 
*nlh,
 
br = netdev_priv(dev);
 
+   if (!netif_running(br->dev) || br->multicast_disabled)
+   return -EINVAL;
+
/* If vlan filtering is enabled and VLAN is not specified
 * install mdb entry on all vlans configured on the port.
 */
@@ -615,9 +615,6 @@ static int __br_mdb_del(struct net_bridge *br, struct 
br_mdb_entry *entry)
struct br_ip ip;
int err = -EINVAL;
 
-   if (!netif_running(br->dev) || br->multicast_disabled)
-   return -EINVAL;
-
__mdb_entry_to_br_ip(entry, );
 
spin_lock_bh(>multicast_lock);
@@ -672,6 +669,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh,
 
br = netdev_priv(dev);
 
+   if (!netif_running(br->dev) || br->multicast_disabled)
+   return -EINVAL;
+
/* If vlan filtering is enabled and VLAN is not specified
 * delete mdb entry on all vlans configured on the port.
 */
-- 
2.13.0



[PATCH net-next 4/6] net: bridge: add __br_mdb_do

2017-05-17 Thread Vivien Didelot
br_mdb_add and br_mdb_del call respectively __br_mdb_add and
__br_mdb_del and notify the message type on success.

Factorize this in a new __br_mdb_do function which add or delete a
br_mdb_entry depending on the message type, then notify it.

Note that the dev argument is p->br->dev, so no need to pass it twice.

Signed-off-by: Vivien Didelot 
---
 net/bridge/br_mdb.c | 33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 24fb4179..a72d5e6f339f 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -556,6 +556,9 @@ static int __br_mdb_add(struct net_bridge_port *p, struct 
br_mdb_entry *entry)
return ret;
 }
 
+static int __br_mdb_do(struct net_bridge_port *p, struct br_mdb_entry *entry,
+  int msgtype);
+
 static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
  struct netlink_ext_ack *extack)
 {
@@ -592,15 +595,12 @@ static int br_mdb_add(struct sk_buff *skb, struct 
nlmsghdr *nlh,
if (br_vlan_enabled(br) && vg && entry->vid == 0) {
list_for_each_entry(v, >vlan_list, vlist) {
entry->vid = v->vid;
-   err = __br_mdb_add(p, entry);
+   err = __br_mdb_do(p, entry, RTM_NEWMDB);
if (err)
break;
-   __br_mdb_notify(dev, p, entry, RTM_NEWMDB);
}
} else {
-   err = __br_mdb_add(p, entry);
-   if (!err)
-   __br_mdb_notify(dev, p, entry, RTM_NEWMDB);
+   err = __br_mdb_do(p, entry, RTM_NEWMDB);
}
 
return err;
@@ -651,6 +651,22 @@ static int __br_mdb_del(struct net_bridge *br, struct 
br_mdb_entry *entry)
return err;
 }
 
+static int __br_mdb_do(struct net_bridge_port *p, struct br_mdb_entry *entry,
+  int msgtype)
+{
+   int err = -EINVAL;
+
+   if (msgtype == RTM_NEWMDB)
+   err = __br_mdb_add(p, entry);
+   else if (msgtype == RTM_DELMDB)
+   err = __br_mdb_del(p->br, entry);
+
+   if (!err)
+   __br_mdb_notify(p->br->dev, p, entry, msgtype);
+
+   return err;
+}
+
 static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
  struct netlink_ext_ack *extack)
 {
@@ -687,15 +703,12 @@ static int br_mdb_del(struct sk_buff *skb, struct 
nlmsghdr *nlh,
if (br_vlan_enabled(br) && vg && entry->vid == 0) {
list_for_each_entry(v, >vlan_list, vlist) {
entry->vid = v->vid;
-   err = __br_mdb_del(br, entry);
+   err = __br_mdb_do(p, entry, RTM_DELMDB);
if (err)
break;
-   __br_mdb_notify(dev, p, entry, RTM_DELMDB);
}
} else {
-   err = __br_mdb_del(br, entry);
-   if (!err)
-   __br_mdb_notify(dev, p, entry, RTM_DELMDB);
+   err = __br_mdb_do(p, entry, RTM_DELMDB);
}
 
return err;
-- 
2.13.0



[PATCH net-next 3/6] net: bridge: break if __br_mdb_del fails

2017-05-17 Thread Vivien Didelot
Be symmetric with br_mdb_add and break if __br_mdb_del returns an error.

Signed-off-by: Vivien Didelot 
---
 net/bridge/br_mdb.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index d20a01622b20..24fb4179 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -688,8 +688,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh,
list_for_each_entry(v, >vlan_list, vlist) {
entry->vid = v->vid;
err = __br_mdb_del(br, entry);
-   if (!err)
-   __br_mdb_notify(dev, p, entry, RTM_DELMDB);
+   if (err)
+   break;
+   __br_mdb_notify(dev, p, entry, RTM_DELMDB);
}
} else {
err = __br_mdb_del(br, entry);
-- 
2.13.0



[PATCH net-next 6/6] net: bridge: add br_mdb_do

2017-05-17 Thread Vivien Didelot
Now that the two functions br_mdb_add and br_mdb_del are identical, keep
a single function named br_mdb_do and register it for both RTM_NEWMDB
and RTM_DELMDB message types.

Signed-off-by: Vivien Didelot 
---
 net/bridge/br_mdb.c | 61 +
 1 file changed, 5 insertions(+), 56 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index d280b20587cb..1a1da25f3727 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -556,57 +556,6 @@ static int __br_mdb_add(struct net_bridge_port *p, struct 
br_mdb_entry *entry)
return ret;
 }
 
-static int __br_mdb_do(struct net_bridge_port *p, struct br_mdb_entry *entry,
-  int msgtype);
-
-static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
- struct netlink_ext_ack *extack)
-{
-   struct net *net = sock_net(skb->sk);
-   struct net_bridge_vlan_group *vg;
-   struct net_device *dev, *pdev;
-   struct br_mdb_entry *entry;
-   struct net_bridge_port *p;
-   struct net_bridge_vlan *v;
-   struct net_bridge *br;
-   int msgtype = nlh->nlmsg_type;
-   int err;
-
-   err = br_mdb_parse(skb, nlh, , );
-   if (err < 0)
-   return err;
-
-   br = netdev_priv(dev);
-
-   if (!netif_running(br->dev) || br->multicast_disabled)
-   return -EINVAL;
-
-   /* If vlan filtering is enabled and VLAN is not specified
-* install mdb entry on all vlans configured on the port.
-*/
-   pdev = __dev_get_by_index(net, entry->ifindex);
-   if (!pdev)
-   return -ENODEV;
-
-   p = br_port_get_rtnl(pdev);
-   if (!p || p->br != br || p->state == BR_STATE_DISABLED)
-   return -EINVAL;
-
-   vg = nbp_vlan_group(p);
-   if (br_vlan_enabled(br) && vg && entry->vid == 0) {
-   list_for_each_entry(v, >vlan_list, vlist) {
-   entry->vid = v->vid;
-   err = __br_mdb_do(p, entry, msgtype);
-   if (err)
-   break;
-   }
-   } else {
-   err = __br_mdb_do(p, entry, msgtype);
-   }
-
-   return err;
-}
-
 static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
 {
struct net_bridge_mdb_htable *mdb;
@@ -668,8 +617,8 @@ static int __br_mdb_do(struct net_bridge_port *p, struct 
br_mdb_entry *entry,
return err;
 }
 
-static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
- struct netlink_ext_ack *extack)
+static int br_mdb_do(struct sk_buff *skb, struct nlmsghdr *nlh,
+struct netlink_ext_ack *extack)
 {
struct net *net = sock_net(skb->sk);
struct net_bridge_vlan_group *vg;
@@ -691,7 +640,7 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh,
return -EINVAL;
 
/* If vlan filtering is enabled and VLAN is not specified
-* delete mdb entry on all vlans configured on the port.
+* add or delete mdb entry on all vlans configured on the port.
 */
pdev = __dev_get_by_index(net, entry->ifindex);
if (!pdev)
@@ -719,8 +668,8 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr 
*nlh,
 void br_mdb_init(void)
 {
rtnl_register(PF_BRIDGE, RTM_GETMDB, NULL, br_mdb_dump, NULL);
-   rtnl_register(PF_BRIDGE, RTM_NEWMDB, br_mdb_add, NULL, NULL);
-   rtnl_register(PF_BRIDGE, RTM_DELMDB, br_mdb_del, NULL, NULL);
+   rtnl_register(PF_BRIDGE, RTM_NEWMDB, br_mdb_do, NULL, NULL);
+   rtnl_register(PF_BRIDGE, RTM_DELMDB, br_mdb_do, NULL, NULL);
 }
 
 void br_mdb_uninit(void)
-- 
2.13.0



[PATCH net-next 0/6] net: bridge: factorize MDB new and del functions

2017-05-17 Thread Vivien Didelot
The br_mdb.c file implements both br_mdb_add and br_mdb_del functions, to
handle respectively the RTM_NEWMDB and RTM_DELMDB message types.

But these two functions are really similar. This patch series factorizes
them into a single br_mdb_do rtnl_doit_func function to reduce
boilerplate and simplify the MDB handling code.

Vivien Didelot (6):
  net: bridge: pass net_bridge_port to __br_mdb_add
  net: bridge: check multicast bridge only once
  net: bridge: break if __br_mdb_del fails
  net: bridge: add __br_mdb_do
  net: bridge: get msgtype from nlmsghdr in mdb ops
  net: bridge: add br_mdb_do

 net/bridge/br_mdb.c | 107 +++-
 1 file changed, 31 insertions(+), 76 deletions(-)

-- 
2.13.0



[PATCH net-next 1/6] net: bridge: pass net_bridge_port to __br_mdb_add

2017-05-17 Thread Vivien Didelot
The __br_mdb_add function uses exactly the same mechanism as its caller
br_mdb_add to retrieve the net_bridge_port pointer from br_mdb_entry.

Remove it and pass directly the net_bridge_port pointer to __br_mdb_add.

Signed-off-by: Vivien Didelot 
---
 net/bridge/br_mdb.c | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b0845480a3ae..bef0331f46a4 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -542,25 +542,15 @@ static int br_mdb_add_group(struct net_bridge *br, struct 
net_bridge_port *port,
return 0;
 }
 
-static int __br_mdb_add(struct net *net, struct net_bridge *br,
-   struct br_mdb_entry *entry)
+static int __br_mdb_add(struct net_bridge_port *p, struct br_mdb_entry *entry)
 {
+   struct net_bridge *br = p->br;
struct br_ip ip;
-   struct net_device *dev;
-   struct net_bridge_port *p;
int ret;
 
if (!netif_running(br->dev) || br->multicast_disabled)
return -EINVAL;
 
-   dev = __dev_get_by_index(net, entry->ifindex);
-   if (!dev)
-   return -ENODEV;
-
-   p = br_port_get_rtnl(dev);
-   if (!p || p->br != br || p->state == BR_STATE_DISABLED)
-   return -EINVAL;
-
__mdb_entry_to_br_ip(entry, );
 
spin_lock_bh(>multicast_lock);
@@ -602,13 +592,13 @@ static int br_mdb_add(struct sk_buff *skb, struct 
nlmsghdr *nlh,
if (br_vlan_enabled(br) && vg && entry->vid == 0) {
list_for_each_entry(v, >vlan_list, vlist) {
entry->vid = v->vid;
-   err = __br_mdb_add(net, br, entry);
+   err = __br_mdb_add(p, entry);
if (err)
break;
__br_mdb_notify(dev, p, entry, RTM_NEWMDB);
}
} else {
-   err = __br_mdb_add(net, br, entry);
+   err = __br_mdb_add(p, entry);
if (!err)
__br_mdb_notify(dev, p, entry, RTM_NEWMDB);
}
-- 
2.13.0



Re: [PATCH net-next] drivers: net: xgene: Check all RGMII phy mode variants

2017-05-17 Thread Iyappan Subramanian
On Wed, May 17, 2017 at 1:26 PM, Andrew Lunn  wrote:
>> +bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev)
>> +{
>> + struct xgene_enet_pdata *pdata = netdev_priv(ndev);
>> + int phy_mode = pdata->phy_mode;
>> + bool ret;
>> +
>> + ret = phy_mode == PHY_INTERFACE_MODE_RGMII ||
>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_ID ||
>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
>> +   phy_mode == PHY_INTERFACE_MODE_RGMII_TXID;
>> +
>> + return ret;
>> +}
>
> include/linux/phy.h:
>
> /**
>  * phy_interface_is_rgmii - Convenience function for testing if a PHY 
> interface
>  * is RGMII (all variants)
>  * @phydev: the phy_device struct
>  */
> static inline bool phy_interface_is_rgmii(struct phy_device *phydev)
> {
> return phydev->interface >= PHY_INTERFACE_MODE_RGMII &&
> phydev->interface <= PHY_INTERFACE_MODE_RGMII_TXID;
> };

Thanks.  I'll use this helper function.

>
> Andrew


Re: [PATCH net-next 1/3] net: dsa: mv88e6xxx: Refactor mv88e6352 SERDES code into an op

2017-05-17 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

>  static int mv88e6xxx_phy_page_get(struct mv88e6xxx_chip *chip, int phy, u8 
> page)
>  {
> - if (!mv88e6xxx_has(chip, MV88E6XXX_FLAG_PHY_PAGE))
> - return -EOPNOTSUPP;
> -
>   return mv88e6xxx_phy_write(chip, phy, PHY_PAGE, page);
>  }
>  
> @@ -300,8 +298,8 @@ static void mv88e6xxx_phy_page_put(struct mv88e6xxx_chip 
> *chip, int phy)
>   }
>  }
>  
> -static int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip *chip, int phy,
> -u8 page, int reg, u16 *val)
> +int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip *chip, int phy,
> + u8 page, int reg, u16 *val)
>  {
>   int err;
>  
> @@ -318,8 +316,8 @@ static int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip 
> *chip, int phy,
>   return err;
>  }
>  
> -static int mv88e6xxx_phy_page_write(struct mv88e6xxx_chip *chip, int phy,
> - u8 page, int reg, u16 val)
> +int mv88e6xxx_phy_page_write(struct mv88e6xxx_chip *chip, int phy,
> +  u8 page, int reg, u16 val)

Hum, this patch does a bit too much... Please at least add a first one
providing phy.[ch] for the PHY Registers related functions, then add the
serdes_power operation.

> +#define MV88E6XXX_FLAG_G1_ATU_FIDBIT_ULL(MV88E6XXX_CAP_G1_ATU_FID)

This seems to be added back by mistake.

> @@ -889,6 +868,9 @@ struct mv88e6xxx_ops {
>  struct mv88e6xxx_vtu_entry *entry);
>   int (*vtu_loadpurge)(struct mv88e6xxx_chip *chip,
>struct mv88e6xxx_vtu_entry *entry);
> +
> + /* Power on/off a SERDES interface */
> + int (*serdes_power)(struct mv88e6xxx_chip *chip, int port, bool on);
>  };

While the get_* and set_* ops are an exception, we might want to sort
that at least before the vtu_* ops. But that is a minor comment.

> +#define MV88E6352_ADDR_SERDES0x0f
> +#define MV88E6352_SERDES_PAGE_FIBER  0x01

Why not moving this in the serdes.h header with the others?


Thank you,

  Vivien


Re: [PATCH net-next V5 0/9] vhost_net rx batch dequeuing

2017-05-17 Thread Michael S. Tsirkin
On Wed, May 17, 2017 at 12:14:36PM +0800, Jason Wang wrote:
> This series tries to implement rx batching for vhost-net. This is done
> by batching the dequeuing from skb_array which was exported by
> underlayer socket and pass the sbk back through msg_control to finish
> userspace copying. This is also the requirement for more batching
> implemention on rx path.
> 
> Tests shows at most 7.56% improvment bon rx pps on top of batch
> zeroing and no obvious changes for TCP_STREAM/TCP_RR result.
> 
> Please review.
> 
> Thanks

A surprisingly large gain for such as simple change.  It would be nice
to understand better why this helps - in particular, does the optimal
batch size change if ring is bigger or smaller? But let's merge it
meanwhile.

Series:

Acked-by: Michael S. Tsirkin 



> Changes from V4:
> - drop batch zeroing patch
> - renew the performance numbers
> - move skb pointer array out of vhost_net structure
> 
> Changes from V3:
> - add batch zeroing patch to fix the build warnings
> 
> Changes from V2:
> - rebase to net-next HEAD
> - use unconsume helpers to put skb back on releasing
> - introduce and use vhost_net internal buffer helpers
> - renew performance numbers on top of batch zeroing
> 
> Changes from V1:
> - switch to use for() in __ptr_ring_consume_batched()
> - rename peek_head_len_batched() to fetch_skbs()
> - use skb_array_consume_batched() instead of
>   skb_array_consume_batched_bh() since no consumer run in bh
> - drop the lockless peeking patch since skb_array could be resized, so
>   it's not safe to call lockless one
> 
> Jason Wang (8):
>   skb_array: introduce skb_array_unconsume
>   ptr_ring: introduce batch dequeuing
>   skb_array: introduce batch dequeuing
>   tun: export skb_array
>   tap: export skb_array
>   tun: support receiving skb through msg_control
>   tap: support receiving skb from msg_control
>   vhost_net: try batch dequing from skb array
> 
> Michael S. Tsirkin (1):
>   ptr_ring: add ptr_ring_unconsume
> 
>  drivers/net/tap.c |  25 +++--
>  drivers/net/tun.c |  31 ---
>  drivers/vhost/net.c   | 128 
> +++---
>  include/linux/if_tap.h|   5 ++
>  include/linux/if_tun.h|   5 ++
>  include/linux/ptr_ring.h  | 120 +++
>  include/linux/skb_array.h |  31 +++
>  7 files changed, 327 insertions(+), 18 deletions(-)
> 
> -- 
> 2.7.4


Re: [PATCH net-next 2/3] net: dsa: mv88e6xxx: mv88e6390X SERDES support

2017-05-17 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

> -static int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
> -   int reg, u16 *val)
> +int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
> +int reg, u16 *val)
>  {
>   int addr = phy; /* PHY devices addresses start at 0x0 */
>   struct mii_bus *bus;
> @@ -265,8 +265,8 @@ static int mv88e6xxx_phy_read(struct mv88e6xxx_chip 
> *chip, int phy,
>   return chip->info->ops->phy_read(chip, bus, addr, reg, val);
>  }
>  
> -static int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,
> -int reg, u16 val)
> +int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,
> + int reg, u16 val)

Please add a very first patch which expose the mv88e6xxx_phy_ (and page_
of patch 1/3) PHY related code in new phy.[ch] files. These are the
missing specific files for the PHY Registers sets ;-)

> + if (chip->info->ops->serdes_power)
> + return chip->info->ops->serdes_power(chip, port, true);
> +
> + return 0;
>  }
>  
>  static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int 
> port)
> @@ -2068,15 +2071,6 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
> *chip, int port)
>   if (err)
>   return err;
>  
> - /* If this port is connected to a SerDes, make sure the SerDes is
> -  * powered up.
> -  */
> - if (chip->info->ops->serdes_power) {
> - err = chip->info->ops->serdes_power(chip, port, true);
> - if (err)
> - return err;
> - }
> -

All 3 patches moves this call around. Can we place it right once, the
first time a mv88e6xxx_serdes_power() helper is added?


Thank you,

Vivien


Re: [PATCH net-next 3/3] dsa: mv88e6xxx: Enable/Disable SERDES on port enable/disable

2017-05-17 Thread Vivien Didelot
Hi Andrew,

Good patchset overall, some comments below.

Andrew Lunn  writes:

> @@ -2168,7 +2165,44 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
> *chip, int port)
>   /* Default VLAN ID and priority: don't set a default VLAN
>* ID, and set the default packet priority to zero.
>*/
> - return mv88e6xxx_port_write(chip, port, PORT_DEFAULT_VLAN, 0x);
> + err = mv88e6xxx_port_write(chip, port, PORT_DEFAULT_VLAN, 0x);
> + if (err)
> + return err;
> +
> + /* Enable the SERDES interface for DSA and CPU ports. Normal
> +  * ports SERDES are enabled when the port is enabled, thus
> +  * saving a bit of power.
> +  */
> + if ((dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)) &&
> + chip->info->ops->serdes_power)
> + err = chip->info->ops->serdes_power(chip, port, true);

Please provide a wrapper like this in chip.c in patch 1/3:

static int mv88e6xxx_serdes_power()
{
if (chip->info->ops->serdes_power)
return chip->info->ops->serdes_power();

return 0;
}

So that we don't check chip->info->ops that many times.

> +
> + return err;
> +}
> +
> +static int mv88e6xxx_port_enable(struct dsa_switch *ds, int port,
> +  struct phy_device *phydev)
> +{
> + struct mv88e6xxx_chip *chip = ds->priv;
> + int err = 0;
> +
> + mutex_lock(>reg_lock);
> + if (chip->info->ops->serdes_power)
> + err = chip->info->ops->serdes_power(chip, port, true);
> + mutex_unlock(>reg_lock);
> +
> + return err;
> +}
> +
> +static void mv88e6xxx_port_disable(struct dsa_switch *ds, int port,
> +struct phy_device *phydev)
> +{
> + struct mv88e6xxx_chip *chip = ds->priv;
> +
> + mutex_lock(>reg_lock);
> + if (chip->info->ops->serdes_power)
> + chip->info->ops->serdes_power(chip, port, false);
> + mutex_unlock(>reg_lock);
>  }

Also please split this in 2 patchs, one which setups serdes with the
port, one which implements port_enable/disable DSA ops.

Thanks,

Vivien


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Doug Ledford
On Tue, 2017-05-16 at 15:28 -0400, Doug Ledford wrote:
> I hadn't realized EXPERIMENTAL was gone.  Which is too bad, because
> that's entirely appropriate in this case, and would have had the
> desired side effect of keeping it out of any non-cutting edge distros
> and warning people of possible API changes.  With EXPERIMENTAL gone,
> the closest thing we have is drivers/staging, since that tends to
> imply
> some of the same consequences.  I know you think BROKEN is overly
> harsh, but I'm not sure we should just do nothing.  How about we take
> a
> few days to let some of the RDMA people closely review the 143 page
> (egads!) rfc (http://www.rfc-editor.org/info/rfc7609) to see if we
> think it can be fixed to use multiple link layers with the existing
> API
> in place or if it will require something other than AF_SMC.  If we
> need
> to break API, then I think we should either fix it ASAP and send that
> fix to the 4.11 stable series (which probably violates the normative
> stable patch size/scope) or if the fix will take longer than this
> kernel cycle, then move it to staging both here and in 4.11 stable,
> and
> fix it there and then move it back.  Something like that would
> prevent
> the kind of API flappage we ought not do

So, I've skimmed the entire RFC, focusing on things were I needed to.
 Here's my take on it:

It would have been better with AF_INET/AF_INET6 and an option to enable
SMC than AF_SMC.  The first implementation simply assumes AF_INET in
the presence of AF_SMC.  When IPv6 support is added, some sort of
guessing logic will have to be put in place to try and determine if an
AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
guaranteed way of telling.  Apps can use struct sockaddr_storage as
their normal element to stick the address into, and could rely on the
kernel to interpret it properly based on the AF_INET/AF_INET6
differentiation, and this breaks that.  The RFC gives *some* thought to
adding IPv6 in the future, but not a lot.  It may be that the answer is
that in the future, IPv6 support is enabled by making the IPv6 API be
AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then
I would suggest making the later API specifically call out AF_INET +
setsockopt(SMC) be identical to AF_SMC.

The protocol included a version header in the negotation messages.
 This is good as it allows us to almost immediately start work on
version 2 that fixes the shortcomings of version 1 while maintaining
back compatibility.

After reading the RFC, I can see why they only support one link layer:
RoCE.  The issue here is that they punted on the hard issue of doing
any sort of work to determine if security restrictions on the TCP
connection should also be applied to the RDMA connection.  The RFC
basically says "the RoCE link needs to have the same physical
restrictions (vlan) as the TCP/IP link so that the security limitations
are the same" because they don't do anything to check them essentially.
 For v2 of the protocol, and for different link layer support, this is
no longer sufficient, so there will be significant work to determine
the security context of specific TCP connections and then make sure
that they meet the security context of the RDMA links allowed.

Additionally, the same is likely to be true in terms of SELinux
options.  The addition of the IB/OPA link layers will throw a bit of a
monkey wrench in things because the SELinux control over those links is
slightly different than a normal TCP/IP SELinux control.  For instance,
you might create rules about processes and ports to make sure that the
httpd daemon can only access specific ports on TCP/IP, but on IB/OPA
you would need to create process and P_Key rules as IB/OPA don't have
the same port level semantics, and it's the P_Key on communications
that is enforced network wide, including in the switches, so
controlling what P_Key a process can send/receive on is your best way
to limit what a process can do.  Translation from one to the other
might be rather difficult to do in any sort of automated fashion.

There might have to be some additional work to get this to properly
account items for the RDMA control group elements that were recently
taken into the kernel.

Finally, the RDMA subsystem is in the process of switching to
structured option processing similar to how netlink does it.  For
version 2 of this protocol, since it will be interacting with the RDMA
core in many ways, will be simpler if it switches the on-wire
negotiation packets to follow the same methods as that will allow reuse
of code that it will have to have for properly dealing with the RDMA
subsystem in the future.

So, I'm fine with it being left as is since you accepted the patch that
makes note of the memory registration insecurity in the Kconfig text.
 The fact that this is a versioned protocol means that we can fix the
things we see as being wrong without having to have it all right from
the very start, it can be done 

Re: [PATCH net-next 0/6] net: phy: marvell: Checkpatch cleanup

2017-05-17 Thread David Miller
From: Andrew Lunn 
Date: Wed, 17 May 2017 03:25:58 +0200

> I will be contributing a few new features to the Marvell PHY driver
> soon. Start by making the code mostly checkpatch clean. There should
> not be any functional changes. Just comments set into the correct
> format, missing blank lines, turn some comparisons around, and
> refactoring to reduce indentation depth.
> 
> There is still one camel in the code, but it actually makes sense, so
> leave it in piece.

Series applied, thanks Andrew.


Re: [PATCH net-next] drivers: net: xgene: Check all RGMII phy mode variants

2017-05-17 Thread Andrew Lunn
> +bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev)
> +{
> + struct xgene_enet_pdata *pdata = netdev_priv(ndev);
> + int phy_mode = pdata->phy_mode;
> + bool ret;
> +
> + ret = phy_mode == PHY_INTERFACE_MODE_RGMII ||
> +   phy_mode == PHY_INTERFACE_MODE_RGMII_ID ||
> +   phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
> +   phy_mode == PHY_INTERFACE_MODE_RGMII_TXID;
> +
> + return ret;
> +}

include/linux/phy.h:

/**
 * phy_interface_is_rgmii - Convenience function for testing if a PHY interface
 * is RGMII (all variants)
 * @phydev: the phy_device struct
 */
static inline bool phy_interface_is_rgmii(struct phy_device *phydev)
{
return phydev->interface >= PHY_INTERFACE_MODE_RGMII &&
phydev->interface <= PHY_INTERFACE_MODE_RGMII_TXID;
};

Andrew


[PATCH v2] e1000e: Don't return uninitialized stats

2017-05-17 Thread Benjamin Poirier
Some statistics passed to ethtool are garbage because e1000e_get_stats64()
doesn't write them, for example: tx_heartbeat_errors. This leaks kernel
memory to userspace and confuses users.

Do like ixgbe and use dev_get_stats() which first zeroes out
rtnl_link_stats64.

Fixes: 5944701df90d ("net: remove useless memset's in drivers get_stats64")
Reported-by: Stefan Priebe 
Signed-off-by: Benjamin Poirier 
---

Notes:
Changes v1->v2:
* add the Fixes tag

 drivers/net/ethernet/intel/e1000e/ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ethtool.c 
b/drivers/net/ethernet/intel/e1000e/ethtool.c
index e23dbd9190d6..5b4d97570896 100644
--- a/drivers/net/ethernet/intel/e1000e/ethtool.c
+++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
@@ -2072,7 +2072,7 @@ static void e1000_get_ethtool_stats(struct net_device 
*netdev,
 
pm_runtime_get_sync(netdev->dev.parent);
 
-   e1000e_get_stats64(netdev, _stats);
+   dev_get_stats(netdev, _stats);
 
pm_runtime_put_sync(netdev->dev.parent);
 
-- 
2.12.2



[Patch RFC net-next] net: usb: r8152: Fix rx_bytes/tx_bytes to include FCS

2017-05-17 Thread Andrew Lunn
The statistics counters rx_bytes and tx_bytes don't include the
Ethernet Frame Check Sequence. The hardware appears to calculate it on
send, so it is not part of the URB passed to the device, and on
receive it was not being copied into the skb.

Fix the rx_bytes/tx_bytes counters, and copy the FCS into the skb so
tools like wireshark can see it.

Signed-off-by: Andrew Lunn 
---
RFC for the following points:

I run automatic tests on Ethernet switches supported by DSA. I have a
test host with many Ethernet interfaces which i connect to switch
ports. Some are quad Ethernet PCIe cards, using the Intel e1000e
driver. Some are USB-Ethernet dongles using the asix driver. And i
have some USB-Ethernet dongles using the r8152. The e1000e, asix and
DSA itself all give consistent rx_bytes and tx_bytes statistics. The
r8152 consistently returns counters which are 4 bytes per frame lower,
which results in some test failures.

Is it defined somewhere what should be included in rx_bytes/tx_bytes?
Should the FCS be included?

This is considered a user API change? The meaning of the counters are
going to be slightly different after this patch. I could be breaking
somebody else's tests :-(
---
 drivers/net/usb/r8152.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index ddc62cb69be8..4081a2cd8b1b 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -1228,7 +1228,7 @@ static void write_bulk_callback(struct urb *urb)
stats->tx_errors += agg->skb_num;
} else {
stats->tx_packets += agg->skb_num;
-   stats->tx_bytes += agg->skb_len;
+   stats->tx_bytes += agg->skb_len + CRC_SIZE;
}

spin_lock(>tx_lock);
@@ -1826,7 +1826,6 @@ static int rx_bottom(struct r8152 *tp, int budget)
if (urb->actual_length < len_used)
break;

-   pkt_len -= CRC_SIZE;
rx_data += sizeof(struct rx_desc);

skb = napi_alloc_skb(napi, pkt_len);
@@ -1850,7 +1849,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
}

 find_next_rx:
-   rx_data = rx_agg_align(rx_data + pkt_len + CRC_SIZE);
+   rx_data = rx_agg_align(rx_data + pkt_len);
rx_desc = (struct rx_desc *)rx_data;
len_used = (int)(rx_data - (u8 *)agg->head);
len_used += sizeof(struct rx_desc);
--
2.11.0
---
 drivers/net/usb/r8152.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index ddc62cb69be8..4081a2cd8b1b 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -1228,7 +1228,7 @@ static void write_bulk_callback(struct urb *urb)
stats->tx_errors += agg->skb_num;
} else {
stats->tx_packets += agg->skb_num;
-   stats->tx_bytes += agg->skb_len;
+   stats->tx_bytes += agg->skb_len + CRC_SIZE;
}
 
spin_lock(>tx_lock);
@@ -1826,7 +1826,6 @@ static int rx_bottom(struct r8152 *tp, int budget)
if (urb->actual_length < len_used)
break;
 
-   pkt_len -= CRC_SIZE;
rx_data += sizeof(struct rx_desc);
 
skb = napi_alloc_skb(napi, pkt_len);
@@ -1850,7 +1849,7 @@ static int rx_bottom(struct r8152 *tp, int budget)
}
 
 find_next_rx:
-   rx_data = rx_agg_align(rx_data + pkt_len + CRC_SIZE);
+   rx_data = rx_agg_align(rx_data + pkt_len);
rx_desc = (struct rx_desc *)rx_data;
len_used = (int)(rx_data - (u8 *)agg->head);
len_used += sizeof(struct rx_desc);
-- 
2.11.0



Re: drivers/net/hamradio: divide error in hdlcdrv_ioctl

2017-05-17 Thread Alan Cox
On Tue, 16 May 2017 17:05:32 +0200
Andrey Konovalov  wrote:

> Hi,
> 
> I've got the following error report while fuzzing the kernel with syzkaller.
> 
> On commit 2ea659a9ef488125eb46da6eb571de5eae5c43f6 (4.12-rc1).
> 
> A reproducer and .config are attached.

This should fix it.

commit 37b3fa4b617681f00cfa1f76d6d7716cc6d9f79a
Author: Alan Cox 
Date:   Wed May 17 21:04:27 2017 +0100

hdlcdrv: Fix division by zero when bitrate is unset

The code attempts to check for out of range calibration. What it forgets to 
do
is check for the 0 bitrate case. As a result the range check itself oopses 
the
kernel.

Found by Andrey Konovalov using Syzkaller.

Signed-off-by: Alan Cox 

diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c
index 8c3633c..9f34a48 100644
--- a/drivers/net/hamradio/hdlcdrv.c
+++ b/drivers/net/hamradio/hdlcdrv.c
@@ -576,7 +576,7 @@ static int hdlcdrv_ioctl(struct net_device *dev, struct 
ifreq *ifr, int cmd)
case HDLCDRVCTL_CALIBRATE:
if(!capable(CAP_SYS_RAWIO))
return -EPERM;
-   if (bi.data.calibrate > INT_MAX / s->par.bitrate)
+   if (!s->par.bitrate || bi.data.calibrate > INT_MAX / 
s->par.bitrate)
return -EINVAL;
s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;
return 0;


[PATCH net-next 1/3] net: dsa: mv88e6xxx: Refactor mv88e6352 SERDES code into an op

2017-05-17 Thread Andrew Lunn
The mv88e6390 family has a different SERDES implementation. Refactor
the mv88e6352 code into an ops function, so we can later add the
mv88e6390 code.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/Makefile|  1 +
 drivers/net/dsa/mv88e6xxx/chip.c  | 62 ++--
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h | 35 +---
 drivers/net/dsa/mv88e6xxx/serdes.c| 77 +++
 drivers/net/dsa/mv88e6xxx/serdes.h| 21 ++
 5 files changed, 122 insertions(+), 74 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/serdes.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/serdes.h

diff --git a/drivers/net/dsa/mv88e6xxx/Makefile 
b/drivers/net/dsa/mv88e6xxx/Makefile
index 6edd869c8d6f..1786d14cbc3a 100644
--- a/drivers/net/dsa/mv88e6xxx/Makefile
+++ b/drivers/net/dsa/mv88e6xxx/Makefile
@@ -3,5 +3,6 @@ mv88e6xxx-objs := chip.o
 mv88e6xxx-objs += global1.o
 mv88e6xxx-objs += global1_atu.o
 mv88e6xxx-objs += global1_vtu.o
+mv88e6xxx-objs += serdes.o
 mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_GLOBAL2) += global2.o
 mv88e6xxx-objs += port.o
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index d034d8cd7d22..6adaff3876b7 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -38,6 +38,7 @@
 #include "global1.h"
 #include "global2.h"
 #include "port.h"
+#include "serdes.h"
 
 static void assert_reg_lock(struct mv88e6xxx_chip *chip)
 {
@@ -282,9 +283,6 @@ static int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, 
int phy,
 
 static int mv88e6xxx_phy_page_get(struct mv88e6xxx_chip *chip, int phy, u8 
page)
 {
-   if (!mv88e6xxx_has(chip, MV88E6XXX_FLAG_PHY_PAGE))
-   return -EOPNOTSUPP;
-
return mv88e6xxx_phy_write(chip, phy, PHY_PAGE, page);
 }
 
@@ -300,8 +298,8 @@ static void mv88e6xxx_phy_page_put(struct mv88e6xxx_chip 
*chip, int phy)
}
 }
 
-static int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip *chip, int phy,
-  u8 page, int reg, u16 *val)
+int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip *chip, int phy,
+   u8 page, int reg, u16 *val)
 {
int err;
 
@@ -318,8 +316,8 @@ static int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip 
*chip, int phy,
return err;
 }
 
-static int mv88e6xxx_phy_page_write(struct mv88e6xxx_chip *chip, int phy,
-   u8 page, int reg, u16 val)
+int mv88e6xxx_phy_page_write(struct mv88e6xxx_chip *chip, int phy,
+u8 page, int reg, u16 val)
 {
int err;
 
@@ -336,18 +334,6 @@ static int mv88e6xxx_phy_page_write(struct mv88e6xxx_chip 
*chip, int phy,
return err;
 }
 
-static int mv88e6xxx_serdes_read(struct mv88e6xxx_chip *chip, int reg, u16 
*val)
-{
-   return mv88e6xxx_phy_page_read(chip, ADDR_SERDES, SERDES_PAGE_FIBER,
-  reg, val);
-}
-
-static int mv88e6xxx_serdes_write(struct mv88e6xxx_chip *chip, int reg, u16 
val)
-{
-   return mv88e6xxx_phy_page_write(chip, ADDR_SERDES, SERDES_PAGE_FIBER,
-   reg, val);
-}
-
 static void mv88e6xxx_g1_irq_mask(struct irq_data *d)
 {
struct mv88e6xxx_chip *chip = irq_data_get_irq_chip_data(d);
@@ -1951,24 +1937,6 @@ static int mv88e6xxx_switch_reset(struct mv88e6xxx_chip 
*chip)
return mv88e6xxx_software_reset(chip);
 }
 
-static int mv88e6xxx_serdes_power_on(struct mv88e6xxx_chip *chip)
-{
-   u16 val;
-   int err;
-
-   /* Clear Power Down bit */
-   err = mv88e6xxx_serdes_read(chip, MII_BMCR, );
-   if (err)
-   return err;
-
-   if (val & BMCR_PDOWN) {
-   val &= ~BMCR_PDOWN;
-   err = mv88e6xxx_serdes_write(chip, MII_BMCR, val);
-   }
-
-   return err;
-}
-
 static int mv88e6xxx_set_port_mode(struct mv88e6xxx_chip *chip, int port,
   enum mv88e6xxx_frame_mode frame, u16 egress,
   u16 etype)
@@ -2100,21 +2068,13 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
*chip, int port)
if (err)
return err;
 
-   /* If this port is connected to a SerDes, make sure the SerDes is not
-* powered down.
+   /* If this port is connected to a SerDes, make sure the SerDes is
+* powered up.
 */
-   if (mv88e6xxx_has(chip, MV88E6XXX_FLAGS_SERDES)) {
-   err = mv88e6xxx_port_read(chip, port, PORT_STATUS, );
+   if (chip->info->ops->serdes_power) {
+   err = chip->info->ops->serdes_power(chip, port, true);
if (err)
return err;
-   reg &= PORT_STATUS_CMODE_MASK;
-   if ((reg == PORT_STATUS_CMODE_100BASE_X) ||
-   (reg == PORT_STATUS_CMODE_1000BASE_X) ||
-   (reg == PORT_STATUS_CMODE_SGMII)) {
-   err = 

Re: [PATCH net-next 00/15] tcp: TCP TS option use 1 ms clock

2017-05-17 Thread David Miller
From: Eric Dumazet 
Date: Tue, 16 May 2017 13:59:59 -0700

> TCP Timestamps option is defined in RFC 7323
> 
> Traditionally on linux, it has been tied to the internal
> 'jiffy' variable, because it had been a cheap and good enough
> generator.
> 
> Unfortunately some distros use HZ=250 or even HZ=100 leading
> to not very useful TCP timestamps.
> 
> For TCP flows in the DC, Google has used usec resolution for more
> than two years with great success [1].
> RCVBUF autotuning is more precise.
> 
> This series converts tp->tcp_mstamp to a plain u64 value storing
> a 1 usec TCP clock.
> 
> This choice will allow us to upstream the 1 usec TS option as
> discussed in IETF 97.
> 
> Kathleen Nichols [2] and others advocate for 1ms TS clocks for
> network analysis. (1ms being the lowest value supported by RFC 7323.)
> 
> [1] 
> https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
> [2] http://netseminar.stanford.edu/seminars/02_02_17.pdf

Series applied, thanks Eric.


[PATCH net-next 2/3] net: dsa: mv88e6xxx: mv88e6390X SERDES support

2017-05-17 Thread Andrew Lunn
The mv88e6390X family has 8 SERDES lanes. These can be used for 2
10Gbps ports, ports 9 or 10. If these ports are used at slower speeds,
the SERDES lanes become available for other ports for 1000Base-X.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c  |  28 +++
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |   4 +
 drivers/net/dsa/mv88e6xxx/serdes.c| 154 ++
 drivers/net/dsa/mv88e6xxx/serdes.h|  24 ++
 4 files changed, 196 insertions(+), 14 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 6adaff3876b7..067f64dcf1b6 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -249,8 +249,8 @@ static struct mii_bus *mv88e6xxx_default_mdio_bus(struct 
mv88e6xxx_chip *chip)
return mdio_bus->bus;
 }
 
-static int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
- int reg, u16 *val)
+int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
+  int reg, u16 *val)
 {
int addr = phy; /* PHY devices addresses start at 0x0 */
struct mii_bus *bus;
@@ -265,8 +265,8 @@ static int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, 
int phy,
return chip->info->ops->phy_read(chip, bus, addr, reg, val);
 }
 
-static int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,
-  int reg, u16 val)
+int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,
+   int reg, u16 val)
 {
int addr = phy; /* PHY devices addresses start at 0x0 */
struct mii_bus *bus;
@@ -1996,7 +1996,10 @@ static int mv88e6xxx_setup_port_mode(struct 
mv88e6xxx_chip *chip, int port)
if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA)
return mv88e6xxx_set_port_mode_edsa(chip, port);
 
-   return -EINVAL;
+   if (chip->info->ops->serdes_power)
+   return chip->info->ops->serdes_power(chip, port, true);
+
+   return 0;
 }
 
 static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port)
@@ -2068,15 +2071,6 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
*chip, int port)
if (err)
return err;
 
-   /* If this port is connected to a SerDes, make sure the SerDes is
-* powered up.
-*/
-   if (chip->info->ops->serdes_power) {
-   err = chip->info->ops->serdes_power(chip, port, true);
-   if (err)
-   return err;
-   }
-
/* Port Control 2: don't force a good FCS, set the maximum frame size to
 * 10240 bytes, disable 802.1q tags checking, don't discard tagged or
 * untagged frames on this port, do a destination address lookup on all
@@ -2965,6 +2959,7 @@ static const struct mv88e6xxx_ops mv88e6190_ops = {
.reset = mv88e6352_g1_reset,
.vtu_getnext = mv88e6390_g1_vtu_getnext,
.vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6390_serdes_power,
 };
 
 static const struct mv88e6xxx_ops mv88e6190x_ops = {
@@ -2997,6 +2992,7 @@ static const struct mv88e6xxx_ops mv88e6190x_ops = {
.reset = mv88e6352_g1_reset,
.vtu_getnext = mv88e6390_g1_vtu_getnext,
.vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6390_serdes_power,
 };
 
 static const struct mv88e6xxx_ops mv88e6191_ops = {
@@ -3029,6 +3025,7 @@ static const struct mv88e6xxx_ops mv88e6191_ops = {
.reset = mv88e6352_g1_reset,
.vtu_getnext = mv88e6390_g1_vtu_getnext,
.vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6390_serdes_power,
 };
 
 static const struct mv88e6xxx_ops mv88e6240_ops = {
@@ -3096,6 +3093,7 @@ static const struct mv88e6xxx_ops mv88e6290_ops = {
.reset = mv88e6352_g1_reset,
.vtu_getnext = mv88e6390_g1_vtu_getnext,
.vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6390_serdes_power,
 };
 
 static const struct mv88e6xxx_ops mv88e6320_ops = {
@@ -3321,6 +3319,7 @@ static const struct mv88e6xxx_ops mv88e6390_ops = {
.reset = mv88e6352_g1_reset,
.vtu_getnext = mv88e6390_g1_vtu_getnext,
.vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6390_serdes_power,
 };
 
 static const struct mv88e6xxx_ops mv88e6390x_ops = {
@@ -3355,6 +3354,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
.reset = mv88e6352_g1_reset,
.vtu_getnext = mv88e6390_g1_vtu_getnext,
.vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6390_serdes_power,
 };
 
 static const struct mv88e6xxx_info mv88e6xxx_table[] = {
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index 30ccd50832ff..aa74705c46d3 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -928,4 +928,8 @@ int 

[PATCH net-next 3/3] dsa: mv88e6xxx: Enable/Disable SERDES on port enable/disable

2017-05-17 Thread Andrew Lunn
Implement the port enable/disable callbacks, which enable/disable the
SERDES interfaces, if applicable. This should save a bit of
power/heat.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 44 
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 067f64dcf1b6..c002869acaab 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1996,9 +1996,6 @@ static int mv88e6xxx_setup_port_mode(struct 
mv88e6xxx_chip *chip, int port)
if (chip->info->tag_protocol == DSA_TAG_PROTO_EDSA)
return mv88e6xxx_set_port_mode_edsa(chip, port);
 
-   if (chip->info->ops->serdes_power)
-   return chip->info->ops->serdes_power(chip, port, true);
-
return 0;
 }
 
@@ -2168,7 +2165,44 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip 
*chip, int port)
/* Default VLAN ID and priority: don't set a default VLAN
 * ID, and set the default packet priority to zero.
 */
-   return mv88e6xxx_port_write(chip, port, PORT_DEFAULT_VLAN, 0x);
+   err = mv88e6xxx_port_write(chip, port, PORT_DEFAULT_VLAN, 0x);
+   if (err)
+   return err;
+
+   /* Enable the SERDES interface for DSA and CPU ports. Normal
+* ports SERDES are enabled when the port is enabled, thus
+* saving a bit of power.
+*/
+   if ((dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)) &&
+   chip->info->ops->serdes_power)
+   err = chip->info->ops->serdes_power(chip, port, true);
+
+   return err;
+}
+
+static int mv88e6xxx_port_enable(struct dsa_switch *ds, int port,
+struct phy_device *phydev)
+{
+   struct mv88e6xxx_chip *chip = ds->priv;
+   int err = 0;
+
+   mutex_lock(>reg_lock);
+   if (chip->info->ops->serdes_power)
+   err = chip->info->ops->serdes_power(chip, port, true);
+   mutex_unlock(>reg_lock);
+
+   return err;
+}
+
+static void mv88e6xxx_port_disable(struct dsa_switch *ds, int port,
+  struct phy_device *phydev)
+{
+   struct mv88e6xxx_chip *chip = ds->priv;
+
+   mutex_lock(>reg_lock);
+   if (chip->info->ops->serdes_power)
+   chip->info->ops->serdes_power(chip, port, false);
+   mutex_unlock(>reg_lock);
 }
 
 static int mv88e6xxx_g1_set_switch_mac(struct mv88e6xxx_chip *chip, u8 *addr)
@@ -4023,6 +4057,8 @@ static const struct dsa_switch_ops mv88e6xxx_switch_ops = 
{
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
+   .port_enable= mv88e6xxx_port_enable,
+   .port_disable   = mv88e6xxx_port_disable,
.set_eee= mv88e6xxx_set_eee,
.get_eee= mv88e6xxx_get_eee,
.get_eeprom_len = mv88e6xxx_get_eeprom_len,
-- 
2.11.0



[PATCH net-next 0/3] net: dsa: mv88e6xxx: Add basic SERDES support

2017-05-17 Thread Andrew Lunn
Some of the Marvell switches are SERDES interface, which must be
powered up before packets can be passed. This is particularly true on
the 6390, where the SERDES defaults to down, probably to save power.

This series refactors the existing SERDES support for the 6352, and
adds 6390 support.

Andrew Lunn (3):
  net: dsa: mv88e6xxx: Refactor mv88e6352 SERDES code into an op
  net: dsa: mv88e6xxx: mv88e6390X SERDES support
  dsa: mv88e6xxx: Enable/Disable SERDES on port enable/disable

 drivers/net/dsa/mv88e6xxx/Makefile|   1 +
 drivers/net/dsa/mv88e6xxx/chip.c  | 120 +-
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  39 +++---
 drivers/net/dsa/mv88e6xxx/serdes.c| 231 ++
 drivers/net/dsa/mv88e6xxx/serdes.h|  45 +++
 5 files changed, 351 insertions(+), 85 deletions(-)
 create mode 100644 drivers/net/dsa/mv88e6xxx/serdes.c
 create mode 100644 drivers/net/dsa/mv88e6xxx/serdes.h

-- 
2.11.0



[PATCH] sch_dsmark: Fix uninitialized variable warning.

2017-05-17 Thread David Miller

We still need to initialize err to -EINVAL for
the case where 'opt' is NULL in dsmark_init().

Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
Signed-off-by: David S. Miller 
---
 net/sched/sch_dsmark.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_dsmark.c b/net/sched/sch_dsmark.c
index ba45102..7ccdd82 100644
--- a/net/sched/sch_dsmark.c
+++ b/net/sched/sch_dsmark.c
@@ -333,7 +333,7 @@ static int dsmark_init(struct Qdisc *sch, struct nlattr 
*opt)
 {
struct dsmark_qdisc_data *p = qdisc_priv(sch);
struct nlattr *tb[TCA_DSMARK_MAX + 1];
-   int err;
+   int err = -EINVAL;
u32 default_index = NO_DEFAULT_INDEX;
u16 indices;
int i;
-- 
2.7.4



[PATCH net-next] drivers: net: xgene: Check all RGMII phy mode variants

2017-05-17 Thread Iyappan Subramanian
This patch addresses the review comment from the previous patch set,
by adding a helper function to address all RGMII phy mode variants.

Signed-off-by: Iyappan Subramanian 
Signed-off-by: Quan Nguyen 
---

Review comment reference:
http://www.spinics.net/lists/netdev/msg434649.html
---
 .../net/ethernet/apm/xgene/xgene_enet_ethtool.c|  6 ++---
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c | 26 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h |  1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c   | 15 -
 4 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
index 0fdec78..a0c9ddc 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c
@@ -127,7 +127,7 @@ static int xgene_get_link_ksettings(struct net_device *ndev,
struct phy_device *phydev = ndev->phydev;
u32 supported;
 
-   if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
+   if (is_xgene_enet_phy_mode_rgmii(ndev)) {
if (phydev == NULL)
return -ENODEV;
 
@@ -177,7 +177,7 @@ static int xgene_set_link_ksettings(struct net_device *ndev,
struct xgene_enet_pdata *pdata = netdev_priv(ndev);
struct phy_device *phydev = ndev->phydev;
 
-   if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
+   if (is_xgene_enet_phy_mode_rgmii(ndev)) {
if (!phydev)
return -ENODEV;
 
@@ -304,7 +304,7 @@ static int xgene_set_pauseparam(struct net_device *ndev,
struct phy_device *phydev = ndev->phydev;
u32 oldadv, newadv;
 
-   if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII ||
+   if (is_xgene_enet_phy_mode_rgmii(ndev) ||
pdata->phy_mode == PHY_INTERFACE_MODE_SGMII) {
if (!phydev)
return -EINVAL;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index 6ac27c7..00f313d 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -264,6 +264,20 @@ static void xgene_enet_wr_mcx_csr(struct xgene_enet_pdata 
*pdata,
iowrite32(val, addr);
 }
 
+bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev)
+{
+   struct xgene_enet_pdata *pdata = netdev_priv(ndev);
+   int phy_mode = pdata->phy_mode;
+   bool ret;
+
+   ret = phy_mode == PHY_INTERFACE_MODE_RGMII ||
+ phy_mode == PHY_INTERFACE_MODE_RGMII_ID ||
+ phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
+ phy_mode == PHY_INTERFACE_MODE_RGMII_TXID;
+
+   return ret;
+}
+
 void xgene_enet_wr_mac(struct xgene_enet_pdata *pdata, u32 wr_addr, u32 
wr_data)
 {
void __iomem *addr, *wr, *cmd, *cmd_done;
@@ -272,7 +286,7 @@ void xgene_enet_wr_mac(struct xgene_enet_pdata *pdata, u32 
wr_addr, u32 wr_data)
u32 done;
 
if (pdata->mdio_driver && ndev->phydev &&
-   pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
+   is_xgene_enet_phy_mode_rgmii(ndev)) {
struct mii_bus *bus = ndev->phydev->mdio.bus;
 
return xgene_mdio_wr_mac(bus->priv, wr_addr, wr_data);
@@ -326,12 +340,13 @@ static void xgene_enet_rd_mcx_csr(struct xgene_enet_pdata 
*pdata,
 u32 xgene_enet_rd_mac(struct xgene_enet_pdata *pdata, u32 rd_addr)
 {
void __iomem *addr, *rd, *cmd, *cmd_done;
+   struct net_device *ndev = pdata->ndev;
u32 done, rd_data;
u8 wait = 10;
 
-   if (pdata->mdio_driver && pdata->ndev->phydev &&
-   pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) {
-   struct mii_bus *bus = pdata->ndev->phydev->mdio.bus;
+   if (pdata->mdio_driver && ndev->phydev &&
+   is_xgene_enet_phy_mode_rgmii(ndev)) {
+   struct mii_bus *bus = ndev->phydev->mdio.bus;
 
return xgene_mdio_rd_mac(bus->priv, rd_addr);
}
@@ -349,8 +364,7 @@ u32 xgene_enet_rd_mac(struct xgene_enet_pdata *pdata, u32 
rd_addr)
udelay(1);
 
if (!done)
-   netdev_err(pdata->ndev, "mac read failed, addr: %04x\n",
-  rd_addr);
+   netdev_err(ndev, "mac read failed, addr: %04x\n", rd_addr);
 
rd_data = ioread32(rd);
iowrite32(0, cmd);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
index 5d3e18d..5c54f65 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
@@ -431,6 +431,7 @@ static inline u16 xgene_enet_get_numslots(u16 id, u32 size)
  size / WORK_DESC_SIZE;
 }
 
+bool is_xgene_enet_phy_mode_rgmii(struct net_device *ndev);
 void xgene_enet_parse_error(struct xgene_enet_desc_ring 

[PATCH net-next 1/3] net: dsa: include dsa.h only once

2017-05-17 Thread Vivien Didelot
The public include/net/dsa.h file is meant for DSA drivers, while all
DSA core files share a common private header net/dsa/dsa_priv.h file.

Ensure that dsa_priv.h is the only DSA core file to include net/dsa.h,
and add a new line to separate absolute and relative headers at the same
time.

Signed-off-by: Vivien Didelot 
---
 net/dsa/dsa.c | 2 +-
 net/dsa/dsa2.c| 2 +-
 net/dsa/dsa_priv.h| 1 +
 net/dsa/legacy.c  | 2 +-
 net/dsa/slave.c   | 2 +-
 net/dsa/switch.c  | 3 ++-
 net/dsa/tag_brcm.c| 2 +-
 net/dsa/tag_dsa.c | 2 +-
 net/dsa/tag_edsa.c| 2 +-
 net/dsa/tag_lan9303.c | 2 +-
 net/dsa/tag_mtk.c | 2 +-
 net/dsa/tag_qca.c | 2 +-
 net/dsa/tag_trailer.c | 2 +-
 13 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 26130ae438da..54a52f4661b1 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -24,7 +24,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 static struct sk_buff *dsa_slave_notag_xmit(struct sk_buff *skb,
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 2ac62349ba12..4301f52e4f5a 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -18,7 +18,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 static LIST_HEAD(dsa_switch_trees);
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index f4a88e485213..904d5476c6e0 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct dsa_device_ops {
struct sk_buff *(*xmit)(struct sk_buff *skb, struct net_device *dev);
diff --git a/net/dsa/legacy.c b/net/dsa/legacy.c
index bb28b011ba5a..ac4379b8d7ac 100644
--- a/net/dsa/legacy.c
+++ b/net/dsa/legacy.c
@@ -22,7 +22,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 /* switch driver registration ***/
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 77324c483d14..fb13c5d7d587 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -17,13 +17,13 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
+
 #include "dsa_priv.h"
 
 static bool dsa_slave_dev_check(struct net_device *dev);
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index ca6e26e514f0..f477053308d2 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -12,7 +12,8 @@
 
 #include 
 #include 
-#include 
+
+#include "dsa_priv.h"
 
 static int dsa_switch_bridge_join(struct dsa_switch *ds,
  struct dsa_notifier_bridge_info *info)
diff --git a/net/dsa/tag_brcm.c b/net/dsa/tag_brcm.c
index 658ddee63dc9..9f204f18ada3 100644
--- a/net/dsa/tag_brcm.c
+++ b/net/dsa/tag_brcm.c
@@ -12,7 +12,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 /* This tag length is 4 bytes, older ones were 6 bytes, we do not
diff --git a/net/dsa/tag_dsa.c b/net/dsa/tag_dsa.c
index 1c6633f0de01..3b62a57956a3 100644
--- a/net/dsa/tag_dsa.c
+++ b/net/dsa/tag_dsa.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 #define DSA_HLEN   4
diff --git a/net/dsa/tag_edsa.c b/net/dsa/tag_edsa.c
index d9c668aa5e54..f95cafd05702 100644
--- a/net/dsa/tag_edsa.c
+++ b/net/dsa/tag_edsa.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 #define DSA_HLEN   4
diff --git a/net/dsa/tag_lan9303.c b/net/dsa/tag_lan9303.c
index 70130ed5c21a..afd59330b5f1 100644
--- a/net/dsa/tag_lan9303.c
+++ b/net/dsa/tag_lan9303.c
@@ -14,7 +14,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 /* To define the outgoing port and to discover the incoming port a regular
diff --git a/net/dsa/tag_mtk.c b/net/dsa/tag_mtk.c
index 837cdddb53f0..d1258e84cd71 100644
--- a/net/dsa/tag_mtk.c
+++ b/net/dsa/tag_mtk.c
@@ -13,7 +13,7 @@
  */
 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 #define MTK_HDR_LEN4
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index be3b67750ac8..2451007699b7 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -12,7 +12,7 @@
  */
 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 #define QCA_HDR_LEN2
diff --git a/net/dsa/tag_trailer.c b/net/dsa/tag_trailer.c
index aa05e276ea22..7488ab2932ab 100644
--- a/net/dsa/tag_trailer.c
+++ b/net/dsa/tag_trailer.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+
 #include "dsa_priv.h"
 
 static struct sk_buff *trailer_xmit(struct sk_buff *skb, struct net_device 
*dev)
-- 
2.13.0



[PATCH net-next 3/3] net: dsa: use switchdev_obj_dump_cb_t everywhere

2017-05-17 Thread Vivien Didelot
Now that the DSA public header includes switchdev.h, use the provided
switchdev_obj_dump_cb_t typedef for the object dump callback.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/b53/b53_common.c |  6 +++---
 drivers/net/dsa/b53/b53_priv.h   |  4 ++--
 drivers/net/dsa/dsa_loop.c   |  2 +-
 drivers/net/dsa/mt7530.c |  2 +-
 drivers/net/dsa/mv88e6xxx/chip.c | 10 +-
 drivers/net/dsa/qca8k.c  |  2 +-
 include/net/dsa.h|  6 +++---
 7 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index fbc3eb17c7a3..fa099ed41652 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1055,7 +1055,7 @@ EXPORT_SYMBOL(b53_vlan_del);
 
 int b53_vlan_dump(struct dsa_switch *ds, int port,
  struct switchdev_obj_port_vlan *vlan,
- int (*cb)(struct switchdev_obj *obj))
+ switchdev_obj_dump_cb_t *cb)
 {
struct b53_device *dev = ds->priv;
u16 vid, vid_start = 0, pvid;
@@ -1284,7 +1284,7 @@ static void b53_arl_search_rd(struct b53_device *dev, u8 
idx,
 static int b53_fdb_copy(struct net_device *dev, int port,
const struct b53_arl_entry *ent,
struct switchdev_obj_port_fdb *fdb,
-   int (*cb)(struct switchdev_obj *obj))
+   switchdev_obj_dump_cb_t *cb)
 {
if (!ent->is_valid)
return 0;
@@ -1301,7 +1301,7 @@ static int b53_fdb_copy(struct net_device *dev, int port,
 
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_fdb *fdb,
-int (*cb)(struct switchdev_obj *obj))
+switchdev_obj_dump_cb_t *cb)
 {
struct b53_device *priv = ds->priv;
struct net_device *dev = ds->ports[port].netdev;
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index a9dc90a01438..155a9c48c317 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -395,7 +395,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
 const struct switchdev_obj_port_vlan *vlan);
 int b53_vlan_dump(struct dsa_switch *ds, int port,
  struct switchdev_obj_port_vlan *vlan,
- int (*cb)(struct switchdev_obj *obj));
+ switchdev_obj_dump_cb_t *cb);
 int b53_fdb_prepare(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_fdb *fdb,
struct switchdev_trans *trans);
@@ -406,7 +406,7 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_fdb *fdb);
 int b53_fdb_dump(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_fdb *fdb,
-int (*cb)(struct switchdev_obj *obj));
+switchdev_obj_dump_cb_t *cb);
 int b53_mirror_add(struct dsa_switch *ds, int port,
   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
 void b53_mirror_del(struct dsa_switch *ds, int port,
diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index 6afab16d13dd..5edf07beb9d2 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -187,7 +187,7 @@ static int dsa_loop_port_vlan_del(struct dsa_switch *ds, 
int port,
 
 static int dsa_loop_port_vlan_dump(struct dsa_switch *ds, int port,
   struct switchdev_obj_port_vlan *vlan,
-  int (*cb)(struct switchdev_obj *obj))
+  switchdev_obj_dump_cb_t *cb)
 {
struct dsa_loop_priv *ps = ds->priv;
struct mii_bus *bus = ps->bus;
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 1bcbe15870ed..4d2f45153ede 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -853,7 +853,7 @@ mt7530_port_fdb_del(struct dsa_switch *ds, int port,
 static int
 mt7530_port_fdb_dump(struct dsa_switch *ds, int port,
 struct switchdev_obj_port_fdb *fdb,
-int (*cb)(struct switchdev_obj *obj))
+switchdev_obj_dump_cb_t *cb)
 {
struct mt7530_priv *priv = ds->priv;
struct mt7530_fdb _fdb = { 0 };
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 386d878569ed..41de250dbcc3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1268,7 +1268,7 @@ static int mv88e6xxx_vtu_loadpurge(struct mv88e6xxx_chip 
*chip,
 
 static int mv88e6xxx_port_vlan_dump(struct dsa_switch *ds, int port,
struct switchdev_obj_port_vlan *vlan,
-   int (*cb)(struct switchdev_obj *obj))
+   switchdev_obj_dump_cb_t *cb)
 {
struct mv88e6xxx_chip *chip = ds->priv;
struct 

[PATCH net-next 0/3] net: dsa: headers cleanup

2017-05-17 Thread Vivien Didelot
The DSA core files share a common private header file. Include the DSA
public header there instead of independently in each core source file.

DSA core and its drivers use switchdev, thus include switchdev.h in the
public DSA header. This allows us to get rid of the forward declaration
and use typedef defined by switchdev.

Vivien Didelot (3):
  net: dsa: include dsa.h only once
  net: dsa: include switchdev.h only once
  net: dsa: use switchdev_obj_dump_cb_t everywhere

 drivers/net/dsa/b53/b53_common.c |  7 +++
 drivers/net/dsa/b53/b53_priv.h   |  4 ++--
 drivers/net/dsa/bcm_sf2.c|  1 -
 drivers/net/dsa/dsa_loop.c   |  3 +--
 drivers/net/dsa/mt7530.c |  3 +--
 drivers/net/dsa/mv88e6xxx/chip.c | 11 +--
 drivers/net/dsa/qca8k.c  |  3 +--
 include/net/dsa.h| 13 -
 net/dsa/dsa.c|  2 +-
 net/dsa/dsa2.c   |  2 +-
 net/dsa/dsa_priv.h   |  1 +
 net/dsa/legacy.c |  2 +-
 net/dsa/slave.c  |  3 +--
 net/dsa/switch.c |  3 ++-
 net/dsa/tag_brcm.c   |  2 +-
 net/dsa/tag_dsa.c|  2 +-
 net/dsa/tag_edsa.c   |  2 +-
 net/dsa/tag_lan9303.c|  2 +-
 net/dsa/tag_mtk.c|  2 +-
 net/dsa/tag_qca.c|  2 +-
 net/dsa/tag_trailer.c|  2 +-
 21 files changed, 31 insertions(+), 41 deletions(-)

-- 
2.13.0



[PATCH net-next 2/3] net: dsa: include switchdev.h only once

2017-05-17 Thread Vivien Didelot
DSA drivers and core use switchdev. Include switchdev.h only once, in
the dsa.h public header, so that inclusion in DSA drivers or forward
declarations of switchdev structures in not necessary anymore.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/b53/b53_common.c | 1 -
 drivers/net/dsa/bcm_sf2.c| 1 -
 drivers/net/dsa/dsa_loop.c   | 1 -
 drivers/net/dsa/mt7530.c | 1 -
 drivers/net/dsa/mv88e6xxx/chip.c | 1 -
 drivers/net/dsa/qca8k.c  | 1 -
 include/net/dsa.h| 7 +--
 net/dsa/slave.c  | 1 -
 8 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 658a12c888a8..fbc3eb17c7a3 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "b53_regs.h"
 #include "b53_priv.h"
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 215d41c1e71f..687a8bae5d73 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include "bcm_sf2.h"
diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index a19e1781e9bb..6afab16d13dd 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include "dsa_loop.h"
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index b070c167e70f..1bcbe15870ed 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mt7530.h"
 
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index d034d8cd7d22..386d878569ed 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -32,7 +32,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mv88e6xxx.h"
 #include "global1.h"
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index 942b9ac7f92a..149f109dbffb 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 118a8bd2fd9a..61984895a9ad 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct tc_action;
 struct phy_device;
@@ -284,12 +285,6 @@ static inline u8 dsa_upstream_port(struct dsa_switch *ds)
return ds->rtable[dst->cpu_dp->ds->index];
 }
 
-struct switchdev_trans;
-struct switchdev_obj;
-struct switchdev_obj_port_fdb;
-struct switchdev_obj_port_mdb;
-struct switchdev_obj_port_vlan;
-
 #define DSA_NOTIFIER_BRIDGE_JOIN   1
 #define DSA_NOTIFIER_BRIDGE_LEAVE  2
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index fb13c5d7d587..91236d602301 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
-- 
2.13.0



Re: nvmet with cxgb4 is broken in v4.12-rc1

2017-05-17 Thread Logan Gunthorpe


On 17/05/17 01:22 PM, Steve Wise wrote:
> Hey Logan, this is with 1.16.43.0 firmware? 

Yes, that's correct.

Logan


Re: [patch net-next v4 00/10] net: sched: introduce multichain support for filters

2017-05-17 Thread David Miller
From: Jiri Pirko 
Date: Wed, 17 May 2017 11:07:53 +0200

> Currently, each classful qdisc holds one chain of filters.
> This chain is traversed and each filter could be matched on, which
> may lead to execution of list of actions. One of such action
> could be "reclassify", which would "reset" the processing of the
> filter chain.
> 
> So this filter chain could be looked at as a flat table.
> Sometimes it is convenient for user to configure a hierarchy
> of tables. Example usecase is encapsulation.
> 
> Hierarchy of tables is a common way how it is done in HW pipelines.
> So it is much more convenient to offload this.
> 
> This patchset contains two major patches:
> 8/10 - This patch introduces the support for having multiple
>chains of filters.
> 10/10 - This patch adds new control action to allow going to specified chain
> 
> The rest of the patches are smaller or bigger depencies of those 2.
> Please see individual patch descriptions for details.

Series applied, thanks Jiri.


RE: nvmet with cxgb4 is broken in v4.12-rc1

2017-05-17 Thread Steve Wise
> Hello,
> 
> Another cxgb4 bug report for you. Testing nvmet over our T62100-LP-CR
> stopped working in v4.12-rc1. We are seeing two oopses: a general
> protection fault in nvmet_rdma_free_rsps and a WQ_MEM_RECLAIM warning.
> These occur after a suspect cxgb4 message. Please see the full dmesg log
> which is attached.
> 
> > [   41.093555] cxgb4 :07:00.4: AE qpid 1034 opcode 0 status 0x1 type 1 
> > len
> 0x0 wrid.hi 0x0 wrid.lo 0x0
> > [   41.093673] nvmet_rdma: received IB QP event: QP access error (3)
> 
> I've bisected to find this commit is the culprit:
> 
> bb58d079: cxgb4: Update IngPad and IngPack values
> 
> A bisect log is also attached. Reverting that commit also fixes the issue.
> 

Hey Logan, this is with 1.16.43.0 firmware? 

Hey Casey/Manoj, is there some other sw or fw change missing for the ingpad 
change?  Or maybe something needed in the firmware config file?

Thanks,

Steve.



Re: [PATCH net-next 0/2] net: dsa: Sort various lists

2017-05-17 Thread David Miller
From: Andrew Lunn 
Date: Tue, 16 May 2017 22:40:06 +0200

> As we gain more DSA drivers and tagging protocols, the lists are
> getting a bit unruly. Do some sorting.

Series applied, thanks Andrew.


Re: [PATCH net 0/2] bnxt_en: DCBX fixes.

2017-05-17 Thread David Miller
From: Michael Chan 
Date: Tue, 16 May 2017 16:39:42 -0400

> 2 bug fixes for the case where the NIC's firmware DCBX agent is
> enabled.  With these fixes, we will return the proper information to
> lldpad.

Series applied, thanks Michael.


Re: [PATCH net] net: fix compile error in skb_orphan_partial()

2017-05-17 Thread David Miller
From: Eric Dumazet 
Date: Tue, 16 May 2017 13:27:53 -0700

> From: Eric Dumazet 
> 
> If CONFIG_INET is not set, net/core/sock.c can not compile :
> 
> net/core/sock.c: In function ‘skb_orphan_partial’:
> net/core/sock.c:1810:2: error: implicit declaration of function
> ‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration]
>   if (skb_is_tcp_pure_ack(skb))
>   ^
> 
> Fix this by always including 
> 
> Fixes: f6ba8d33cfbb ("netem: fix skb_orphan_partial()")
> Signed-off-by: Eric Dumazet 
> Reported-by: Paul Gortmaker 
> Reported-by: Randy Dunlap 
> Reported-by: Stephen Rothwell 

Applied and queued up for -stable, thanks Eric.


Re: [PATCH net-next] ipv6: Prevent overrun when parsing v6 header options

2017-05-17 Thread David Miller
From: Craig Gallek 
Date: Tue, 16 May 2017 14:36:23 -0400

> From: Craig Gallek 
> 
> The KASAN warning repoted below was discovered with a syzkaller
> program.  The reproducer is basically:
>   int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP);
>   send(s, _byte_of_data, 1, MSG_MORE);
>   send(s, _than_mtu_bytes_data, 2000, 0);
> 
> The socket() call sets the nexthdr field of the v6 header to
> NEXTHDR_HOP, the first send call primes the payload with a non zero
> byte of data, and the second send call triggers the fragmentation path.
> 
> The fragmentation code tries to parse the header options in order
> to figure out where to insert the fragment option.  Since nexthdr points
> to an invalid option, the calculation of the size of the network header
> can made to be much larger than the linear section of the skb and data
> is read outside of it.
> 
> This fix makes ip6_find_1stfrag return an error if it detects
> running out-of-bounds.
 ...
> Reported-by: Andrey Konovalov 
> Signed-off-by: Craig Gallek 

Since this is a reasonably serious bug I'm going to apply this
to 'net' and queue it up for -stable.

Thanks.


Re: [PATCH net-next] liquidio: fix insmod failure when multiple NICs are plugged in

2017-05-17 Thread David Miller
From: Felix Manlunas 
Date: Tue, 16 May 2017 11:14:50 -0700

> From: Rick Farrington 
> 
> When multiple liquidio NICs are plugged in, the first insmod of the PF
> driver succeeds.  But after an rmmod, a subsequent insmod fails.  Reason is
> during rmmod, the PF driver resets the Octeon of only one of the NICs; it
> neglects to reset the Octeons of the other NICs.
> 
> Fix the insmod failure by adding the missing Octeon resets at rmmod.  Keep
> a per-NIC refcount that indicates the number of active PFs in a given NIC.
> When the refcount goes to zero, then reset the Octeon of that NIC.
> 
> Signed-off-by: Rick Farrington 
> Signed-off-by: Felix Manlunas 

I'm applying this but I'm really not happy with how this driver handles
probing.

There is no reason to have arbitrary limits to the number of adapters,
everything should be dynamic and allow arbitrary numbers of instances
even if that is not physically possible with current hardware.

Also, the driver should be able to load even if something isn't reset
properly.  A boot loader or some other entity could have programmed
the device or something like a kdump kernel could leave it in an
indeterminate state.

So you must be able to probe the device even if it has not been soft
reset properly.  In fact, the soft reset probably should happen during
probe not during shutdown.


Re: [PATCH net-next] liquidio: fix PF falsely indicating success at setting MAC address of a nonexistent VF

2017-05-17 Thread David Miller
From: Felix Manlunas 
Date: Tue, 16 May 2017 11:28:00 -0700

> In the function assigned to .ndo_set_vf_mac, check the validity of the
> vfidx argument before proceeding to tell the firmware to set the VF MAC
> address.
> 
> Signed-off-by: Felix Manlunas 
> Signed-off-by: Derek Chickles 

Applied.


[PATCH net-next v2] net: fix __skb_try_recv_from_queue to return the old behavior

2017-05-17 Thread Andrei Vagin
This function has to return NULL on a error case, because there is a
separate error variable.

The offset has to be changed only if skb is returned

v2: fix udp code to not use an extra variable

Cc: Paolo Abeni 
Cc: Eric Dumazet 
Cc: David S. Miller 
Fixes: 65101aeca522 ("net/sock: factor out dequeue/peek with offset cod")
Signed-off-by: Andrei Vagin 
---
 net/core/datagram.c | 14 --
 net/ipv4/udp.c  | 12 +++-
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index a4592b4..bc46118 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -170,20 +170,21 @@ struct sk_buff *__skb_try_recv_from_queue(struct sock *sk,
  struct sk_buff **last)
 {
struct sk_buff *skb;
+   int _off = *off;
 
*last = queue->prev;
skb_queue_walk(queue, skb) {
if (flags & MSG_PEEK) {
-   if (*off >= skb->len && (skb->len || *off ||
+   if (_off >= skb->len && (skb->len || _off ||
 skb->peeked)) {
-   *off -= skb->len;
+   _off -= skb->len;
continue;
}
if (!skb->len) {
skb = skb_set_peeked(skb);
if (unlikely(IS_ERR(skb))) {
*err = PTR_ERR(skb);
-   return skb;
+   return NULL;
}
}
*peeked = 1;
@@ -193,6 +194,7 @@ struct sk_buff *__skb_try_recv_from_queue(struct sock *sk,
if (destructor)
destructor(sk, skb);
}
+   *off = _off;
return skb;
}
return NULL;
@@ -253,8 +255,6 @@ struct sk_buff *__skb_try_recv_datagram(struct sock *sk, 
unsigned int flags,
 
*peeked = 0;
do {
-   int _off = *off;
-
/* Again only user level code calls this function, so nothing
 * interrupt level will suddenly eat the receive_queue.
 *
@@ -263,8 +263,10 @@ struct sk_buff *__skb_try_recv_datagram(struct sock *sk, 
unsigned int flags,
 */
spin_lock_irqsave(>lock, cpu_flags);
skb = __skb_try_recv_from_queue(sk, queue, flags, destructor,
-   peeked, &_off, err, last);
+   peeked, off, , last);
spin_unlock_irqrestore(>lock, cpu_flags);
+   if (error)
+   goto no_packet;
if (skb)
return skb;
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 7bd56c9..278e707 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1465,16 +1465,13 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, 
unsigned int flags,
error = -EAGAIN;
*peeked = 0;
do {
-   int _off = *off;
-
spin_lock_bh(>lock);
skb = __skb_try_recv_from_queue(sk, queue, flags,
udp_skb_destructor,
-   peeked, &_off, err,
+   peeked, off, err,
);
if (skb) {
spin_unlock_bh(>lock);
-   *off = _off;
return skb;
}
 
@@ -1488,20 +1485,17 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, 
unsigned int flags,
 * the sk_receive_queue lock if fwd memory scheduling
 * is needed.
 */
-   _off = *off;
spin_lock(_queue->lock);
skb_queue_splice_tail_init(sk_queue, queue);
 
skb = __skb_try_recv_from_queue(sk, queue, flags,
udp_skb_dtor_locked,
-   peeked, &_off, err,
+   peeked, off, err,
);
spin_unlock(_queue->lock);
spin_unlock_bh(>lock);
-   if (skb) {
-   *off = _off;
+   if (skb)
return skb;
-   }
 
 

[PATCH net] cxgb4: update latest firmware version supported

2017-05-17 Thread Ganesh Goudar
Change t4fw_version.h to update latest firmware version
number to 1.16.43.0.

Signed-off-by: Ganesh Goudar 
---
 drivers/net/ethernet/chelsio/cxgb4/t4fw_version.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4fw_version.h 
b/drivers/net/ethernet/chelsio/cxgb4/t4fw_version.h
index fa37644..3549d387 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4fw_version.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4fw_version.h
@@ -37,7 +37,7 @@
 
 #define T4FW_VERSION_MAJOR 0x01
 #define T4FW_VERSION_MINOR 0x10
-#define T4FW_VERSION_MICRO 0x21
+#define T4FW_VERSION_MICRO 0x2B
 #define T4FW_VERSION_BUILD 0x00
 
 #define T4FW_MIN_VERSION_MAJOR 0x01
@@ -46,7 +46,7 @@
 
 #define T5FW_VERSION_MAJOR 0x01
 #define T5FW_VERSION_MINOR 0x10
-#define T5FW_VERSION_MICRO 0x21
+#define T5FW_VERSION_MICRO 0x2B
 #define T5FW_VERSION_BUILD 0x00
 
 #define T5FW_MIN_VERSION_MAJOR 0x00
@@ -55,7 +55,7 @@
 
 #define T6FW_VERSION_MAJOR 0x01
 #define T6FW_VERSION_MINOR 0x10
-#define T6FW_VERSION_MICRO 0x21
+#define T6FW_VERSION_MICRO 0x2B
 #define T6FW_VERSION_BUILD 0x00
 
 #define T6FW_MIN_VERSION_MAJOR 0x00
-- 
2.1.0



Re: 'iw events' stops receiving events after a while on 4.9 + hacks

2017-05-17 Thread Bastian Bittorf
* Johannes Berg  [17.05.2017 20:18]:
> On Wed, 2017-05-17 at 12:08 +0200, Bastian Bittorf wrote:
> > * Ben Greear  [17.05.2017 11:51]:

[...]

> > > kernels, but when testing on 4.9 overnight, I notice that 'iw
> > > events' is not showing any input.  'strace' shows
> > > that it is waiting on recvmsg.  If I start a second 'iw events'
> > > then it will get
> > > wifi events as expected.
> > 
> > me too, also seen on 4.4 - i'am happy for debug ideas.
> 
> I've never seen this.
> 
> Does it happen when it's very long-running? Or when there are lots of
> events?

only a couple of hours. hard to say which is the culprit. here
i run it like:

#!/bin/sh
iw event | while read -r LINE; do
 case "$LINE" in
  *': new station '*) ... ;;
  *': del station '*) ... ;;
 esac
done


The script marks new stations with "touch /tmp/$mac"
and removes this file during 'del station'.

What is interesting: i can recognize, that sometimes and
somehow i have stations in 'iw dev wlanX station dump' which
the script has not seen. When debugging this, the script
does not get any new events. A new started 'iw event' can
see further events without problems.

Hard to say where the error happens. I'am on busybox here.

bye, Bastian


Re: [PATCH net-next] net: dsa: store CPU port pointer in the tree

2017-05-17 Thread David Miller
From: Vivien Didelot 
Date: Tue, 16 May 2017 14:10:33 -0400

> A dsa_switch_tree instance holds a dsa_switch pointer and a port index
> to identify the switch port to which the CPU is attached.
> 
> Now that the DSA layer has a dsa_port structure to hold this data, use
> it to point the switch CPU port.
> 
> This patch simply substitutes s/dst->cpu_switch/dst->cpu_dp->ds/ and
> s/dst->cpu_port/dst->cpu_dp->index/.
> 
> Signed-off-by: Vivien Didelot 

Applied, thanks Vivien.


Re: [RFC V1 1/1] net: cdc_ncm: Reduce memory use when kernel memory low

2017-05-17 Thread David Miller
From: Bjørn Mork 
Date: Tue, 16 May 2017 20:24:10 +0200

> Jim Baxter  writes:
> 
>> The CDC-NCM driver can require large amounts of memory to create
>> skb's and this can be a problem when the memory becomes fragmented.
>>
>> This especially affects embedded systems that have constrained
>> resources but wish to maximise the throughput of CDC-NCM with 16KiB
>> NTB's.
>>
>> The issue is after running for a while the kernel memory can become
>> fragmented and it needs compacting.
>> If the NTB allocation is needed before the memory has been compacted
>> the atomic allocation can fail which can cause increased latency,
>> large re-transmissions or disconnections depending upon the data
>> being transmitted at the time.
>> This situation occurs for less than a second until the kernel has
>> compacted the memory but the failed devices can take a lot longer to
>> recover from the failed TX packets.
>>
>> To ease this temporary situation I modified the CDC-NCM TX path to
>> temporarily switch into a reduced memory mode which allocates an NTB
>> that will fit into a USB_CDC_NCM_NTB_MIN_OUT_SIZE (default 2048 Bytes)
>> sized memory block and only transmit NTB's with a single network frame
>> until the memory situation is resolved.
>> Once the memory is compacted the CDC-NCM data can resume transmitting
>> at the normal tx_max rate once again.
> 
> I must say that I don't like the additional complexity added here.  If
> there are memory issues and you can reduce the buffer size to
> USB_CDC_NCM_NTB_MIN_OUT_SIZE, then why don't you just set a lower tx_max
> buffer size in the first place?
> 
>   echo 2048 > /sys/class/net/wwan0/cdc_ncm/tx_max

When there isn't memory pressure this will hurt performance of
course.

It is a quite common paradigm to back down to 0 order memory requests
when higher order ones fail, so this isn't such a bad change from the
perspective.

However, one negative about it is that when the system is under memory
stress it doesn't help at all to keep attemping high order allocations
when the system hasn't recovered yet.  In fact, this can make it
worse.


Re: [patch net-next 00/12] mlxsw: Preparations for restructuring

2017-05-17 Thread David Miller
From: Jiri Pirko 
Date: Tue, 16 May 2017 19:38:23 +0200

> This patchset doesn't introduce any functional changes and merely meant
> to make the code base more receptive for upcoming restructuring.
> 
> The first six patches mainly shuffle code in order to reduce the scope of
> structs that shouldn't be defined in the main driver header. Most of them
> will be later expanded, so it makes sense to correctly place them now.
> 
> The last patches mostly simplify bridge-related functions, so that they
> could be more easily modified later on.

Looks good, series applied, thanks.


BUG: nvmet with cxgb4 is broken in v4.12-rc1

2017-05-17 Thread Logan Gunthorpe
Hello,

Another cxgb4 bug report for you. Testing nvmet over our T62100-LP-CR
stopped working in v4.12-rc1. We are seeing two oopses: a general
protection fault in nvmet_rdma_free_rsps and a WQ_MEM_RECLAIM warning.
These occur after a suspect cxgb4 message. Please see the full dmesg log
which is attached.

> [   41.093555] cxgb4 :07:00.4: AE qpid 1034 opcode 0 status 0x1 type 1 
> len 0x0 wrid.hi 0x0 wrid.lo 0x0
> [   41.093673] nvmet_rdma: received IB QP event: QP access error (3)

I've bisected to find this commit is the culprit:

bb58d079: cxgb4: Update IngPad and IngPack values

A bisect log is also attached. Reverting that commit also fixes the issue.

Thanks,

Logan
git bisect start
# good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
git bisect good a351e9b9fc24e982ec2f0e76379a49826036da12
# bad: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
git bisect bad 2ea659a9ef488125eb46da6eb571de5eae5c43f6
# bad: [221656e7c4ce342b99c31eca96c1cbb6d1dce45f] Merge tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 221656e7c4ce342b99c31eca96c1cbb6d1dce45f
# bad: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
# bad: [cec381919818a9a0cb85600b3c82404bdd38cf36] Merge tag 'mac80211-next-for-davem-2017-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
git bisect bad cec381919818a9a0cb85600b3c82404bdd38cf36
# bad: [5cd8985a19090f2b0ce8700ae3ec19e06a4fc5e9] net-next: dsa: add Mediatek tag RX/TX handler
git bisect bad 5cd8985a19090f2b0ce8700ae3ec19e06a4fc5e9
# bad: [b4f0a66155564aaf7e98492e027efad9f797c244] net: stmmac: fix dma operation mode config for older versions
git bisect bad b4f0a66155564aaf7e98492e027efad9f797c244
# good: [6689da155bdcd17abfe4d3a8b1e245d9ed4b5f2c] net: bcmgenet: manage dma interrupts in napi code
git bisect good 6689da155bdcd17abfe4d3a8b1e245d9ed4b5f2c
# good: [41e95736b30833710c1e77a2877c2d71133450f7] Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
git bisect good 41e95736b30833710c1e77a2877c2d71133450f7
# good: [cccf6f5cdc96ef6fa02f27ecd8073406a402929a] qed: Make qed_iov_mark_vf_flr() return bool
git bisect good cccf6f5cdc96ef6fa02f27ecd8073406a402929a
# bad: [33e85b8dd69e7f2fbf77f04bfc97ea7c76bccf9b] net: stmmac: Restore DT backwards-compatibility
git bisect bad 33e85b8dd69e7f2fbf77f04bfc97ea7c76bccf9b
# bad: [c7cd4c9bf8df87027e739fe66d0a55951f6875d8] mlxsw: spectrum: fix swapped order of arguments packets and bytes
git bisect bad c7cd4c9bf8df87027e739fe66d0a55951f6875d8
# good: [f22913d0b5560ae621e8a7391e9547d5a6fd8893] ixgb: use new API ethtool_{get|set}_link_ksettings
git bisect good f22913d0b5560ae621e8a7391e9547d5a6fd8893
# good: [7ada7ca562714632d0fc9abb24e27cab67bdbf0d] Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
git bisect good 7ada7ca562714632d0fc9abb24e27cab67bdbf0d
# good: [424fa00ea325b0153c321667f39643f68ae9d1b0] net: dwc-xlgmac: include dcbnl.h
git bisect good 424fa00ea325b0153c321667f39643f68ae9d1b0
# bad: [bb58d07964f2f09e133b46541447c567a7306dc1] cxgb4: Update IngPad and IngPack values
git bisect bad bb58d07964f2f09e133b46541447c567a7306dc1
# good: [3588f29e061cef19ac0092e4f6917717fed5b1d4] net: dwc-xlgmac: add module license
git bisect good 3588f29e061cef19ac0092e4f6917717fed5b1d4
# first bad commit: [bb58d07964f2f09e133b46541447c567a7306dc1] cxgb4: Update IngPad and IngPack values
[0.00] Linux version 4.12.0-rc1.spreadpfn (gunthorp@cgy1-donard) (gcc version 4.9.2 (Debian 4.9.2-10) ) #340 SMP Wed May 17 11:27:02 MDT 2017
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.11.0-next-20170510.spread-pfn+ root=/dev/mapper/cgy1--donard-root ro memmap=2G!14G quiet
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0008dbff] usable
[0.00] BIOS-e820: [mem 0x0008dc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x7dd14fff] usable
[0.00] BIOS-e820: [mem 0x7dd15000-0x7dda3fff] reserved
[0.00] BIOS-e820: [mem 0x7dda4000-0x7deadfff] ACPI data
[0.00] BIOS-e820: [mem 0x7deae000-0x7e0cafff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7e0cb000-0x7f36bfff] reserved
[0.00] BIOS-e820: [mem 0x7f36c000-0x7f7f] 

Re: [PATCH net-next 9/9] nfp: eliminate an if statement in calculation of completed frames

2017-05-17 Thread Jakub Kicinski
On Wed, 17 May 2017 11:07:19 +, David Laight wrote:
> From: Jakub Kicinski
> > Sent: 16 May 2017 01:55
> > Given that our rings are always a power of 2, we can simplify the
> > calculation of number of completed TX descriptors by using masking
> > instead of if statement based on whether the index have wrapped
> > or not.
> > 
> > Signed-off-by: Jakub Kicinski 
> > ---
> >  drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 10 ++
> >  1 file changed, 2 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > index c64514f8ee65..da83e17b8b20 100644
> > --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> > @@ -940,10 +940,7 @@ static void nfp_net_tx_complete(struct nfp_net_tx_ring 
> > *tx_ring)
> > if (qcp_rd_p == tx_ring->qcp_rd_p)
> > return;
> > 
> > -   if (qcp_rd_p > tx_ring->qcp_rd_p)
> > -   todo = qcp_rd_p - tx_ring->qcp_rd_p;
> > -   else
> > -   todo = qcp_rd_p + tx_ring->cnt - tx_ring->qcp_rd_p;
> > +   todo = D_IDX(tx_ring, qcp_rd_p + tx_ring->cnt - tx_ring->qcp_rd_p);  
> 
> I'm not sure you need to add tx_ring->cnt here.
> I bet D_IDX() masks it away.

True, feel free to send a fix, or I will queue up a correction after
other work I have pending.

> > while (todo--) {
> > idx = D_IDX(tx_ring, tx_ring->rd_p++);  
> 
> That '++' looks suspicious.
> I think you need to decide whether you are incrementing pointers into the ring
> or indexes into it.
> Sometimes it is safer to use a non-wrapping index and mask when accessing the 
> entry.
>   entry_ptr = [idx & (RING_SIZE - 1)]
> Ring full is then (read_idx == write_idx + RING_SIZE),
> ring empty (read_idx == write_idx).
> So the index just wrap at (probably)_2^32.

I may be missing the point.  I use a mix of the two, actually, the
software pointers are free running (non-wrapping) but the HW QCP
pointers wrap.  Because HW pointers wrap I always keep one entry on 
the rings empty, see nfp_net_tx_full().


Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.

2017-05-17 Thread David Miller
From: Edward Cree 
Date: Wed, 17 May 2017 18:00:03 +0100

> Did you see the algorithms I posted with two masks?

I'll take a look.


Re: [PATCH v3 net-next 4/7] net: add new control message for incoming HW-timestamped packets

2017-05-17 Thread Richard Cochran
On Wed, May 17, 2017 at 10:11:50AM -0400, Soheil Hassas Yeganeh wrote:
> On Tue, May 16, 2017 at 8:44 AM, Miroslav Lichvar  wrote:
> > +/* SCM_TIMESTAMPING_PKTINFO control message */
> > +struct scm_ts_pktinfo {
> > +   __u32 if_index;
> > +   __u32 pkt_length;
> > +};
> > +
> 
> Given that this structure can change in the future, it might be worth
> considering using TLVs (e.g., netlink attributes), similar to what
> tcp_get_timestamping_opt_stats() does. This would simplify
> adding/removing fields to/from the control message.

[ BTW, please trim your replies. ]

Personally, I dislike TLVs.  Alternatively, you could add some
reserved fields for future use.

Thanks,
Richard


Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.

2017-05-17 Thread Edward Cree
On 17/05/17 17:13, David Miller wrote:
> Both cases are common in real BPF programs.  The offsets really are
> necessary.  It's funny because initially I tried to implement this
> without the auxiliary offset and it simply doesn't work. :-)
> 
> We always have to track when you've seen the offset that cancels out
> the NET_IP_ALIGN.  And as stated it can occur both before and after
> variable offsets have been introduced.
> 
> You have to catch both:
> 
>   ptr += variable;
>   ptr += 14;
> 
> and:
> 
>   ptr += 14;
>   ptr += variable; /* align = 4 */
> 
> And always see at the end that "NET_IP_ALIGN + offsets" will
> be properly 4 byte aligned.

Did you see the algorithms I posted with two masks?  Effectively they
 are tracking the 'quotient' and 'remainder', but in a was that might
 be easier to manipulate (bit-twiddly etc.) than the aux offset.  I
 think they handle both the above cases, in a nicely general way.

-Ed


[PATCH net-next] net: make struct dst_entry::dev first member

2017-05-17 Thread Alexey Dobriyan
struct dst_entry::dev is used most often. Move it so it can be
accessed without imm8 offset on x86_64.

add/remove: 0/0 grow/shrink: 9/239 up/down: 52/-413 (-361)
function old new   delta
dst_rcu_free 126 138 +12
fnhe_flush_routes211 219  +8
rt_set_nexthop   747 754  +7
rt_cache_route85  91  +6
rt6_release  209 215  +6
dst_release  107 111  +4
dst_destroy_rcu   29  33  +4
dn_dst_check_expire  329 333  +4
dn_insert_route  484 485  +1
xfrm_resolve_and_create_bundle  29912990  -1
...
ip_route_me_harder  11631157  -6
__ip_append_data.isra   27302724  -6
ip6_forward 30523045  -7
callforward_do_filter659 651  -8
dst_gc_task  571 549 -22

Signed-off-by: Alexey Dobriyan 
---

 include/net/dst.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -31,9 +31,9 @@
 struct sk_buff;
 
 struct dst_entry {
+   struct net_device   *dev;
struct rcu_head rcu_head;
struct dst_entry*child;
-   struct net_device   *dev;
struct  dst_ops *ops;
unsigned long   _metrics;
unsigned long   expires;


Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.

2017-05-17 Thread David Miller
From: Edward Cree 
Date: Wed, 17 May 2017 15:00:04 +0100

> On 16/05/17 23:53, Alexei Starovoitov wrote:
>> following this line of thinking it feels that it should be possible
>> to get rid of 'aux_off' and 'aux_off_align' and simplify the code.
>> I mean we can always do
>> dst_reg->min_align = min(dst_reg->min_align, src_reg->min_align);
>>
>> and don't use 'off' as part of alignment checks at all.
> Problem with this approach, of course, is that (say) NET_IP_ALIGN +
>  sizeof(ethhdr) = 16 is muchly aligned, whereas if you turn all
>  constants into alignments you think you're only 2-byte aligned.
> I think you have to track exact offsets when you can, and only turn
>  into an alignment when you introduce a variable.
> Of course it can still be fooled by e.g. 2 + (x << 2) + 14, which it
>  will think is only 2-aligned when really it's 4-aligned, but unless
>  you want to start tracking 'bits known to be 1' as well as 'bits
>  known to be 0', I think you just accept that alignment tracking
>  isn't commutative.  The obvious cases (ihl << 2 and so) will work
>  when written the obvious way, unless the compiler does something
>  perverse.
> OTOH the 'track known 1s as well' might work in a nice generic way
>  and cover all bases, I'll have to experiment a bit with that.

Both cases are common in real BPF programs.  The offsets really are
necessary.  It's funny because initially I tried to implement this
without the auxiliary offset and it simply doesn't work. :-)

We always have to track when you've seen the offset that cancels out
the NET_IP_ALIGN.  And as stated it can occur both before and after
variable offsets have been introduced.

You have to catch both:

ptr += variable;
ptr += 14;

and:

ptr += 14;
ptr += variable; /* align = 4 */

And always see at the end that "NET_IP_ALIGN + offsets" will
be properly 4 byte aligned.


Re: [PATCH v4] bridge: netlink: check vlan_default_pvid range

2017-05-17 Thread Sabrina Dubroca
2017-05-17, 09:29:12 +0200, Tobias Jungel wrote:
> Currently it is allowed to set the default pvid of a bridge to a value
> above VLAN_VID_MASK (0xfff). This patch adds a check to br_validate and
> returns -EINVAL in case the pvid is out of bounds.
> 
> Reproduce by calling:
> 
> [root@test ~]# ip l a type bridge
> [root@test ~]# ip l a type dummy
> [root@test ~]# ip l s bridge0 type bridge vlan_filtering 1
> [root@test ~]# ip l s bridge0 type bridge vlan_default_pvid 
> [root@test ~]# ip l s dummy0 master bridge0
> [root@test ~]# bridge vlan
> port  vlan ids
> bridge0 PVID Egress Untagged
> 
> dummy0  PVID Egress Untagged
> 
> Fixes: 0f963b7592ef ("bridge: netlink: add support for default_pvid")
> Acked-by: Nikolay Aleksandrov 
> Signed-off-by: Tobias Jungel 

Acked-by: Sabrina Dubroca 


Thanks,

-- 
Sabrina


Re: [PATCH net-next v2 1/7] skbuff: add stub to help computing crc32c on SCTP packets

2017-05-17 Thread David Miller
From: Davide Caratti 
Date: Tue, 16 May 2017 18:27:45 +0200

> @@ -2243,6 +2243,30 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff 
> *skb, int offset,
>  }
>  EXPORT_SYMBOL(skb_copy_and_csum_bits);
>  
> +static __wsum warn_crc32c_csum_update(const void *buff, int len, __wsum sum)
> +{
> + net_warn_ratelimited(
> + "%s: attempt to compute crc32c without libcrc32c.ko\n",
> + __func__);
> + return 0;
> +}
> +
> +static __wsum warn_crc32c_csum_combine(__wsum csum, __wsum csum2,
> +int offset, int len)
> +{
> + net_warn_ratelimited(
> + "%s: attempt to compute crc32c without libcrc32c.ko\n",
> + __func__);
> + return 0;
> +}
> +
> +const struct skb_checksum_ops *crc32c_csum_stub __read_mostly =
> + &(struct skb_checksum_ops) {
> + .update  = warn_crc32c_csum_update,
> + .combine = warn_crc32c_csum_combine,
> +};
> +EXPORT_SYMBOL(crc32c_csum_stub);

Please, if you are going to do this, declare things in a more
traditional and canonical way:

static const struct skb_checksum_ops default_crc32c_ops = {
.update  = warn_crc32c_csum_update,
.combine = warn_crc32c_csum_combine,
};

const struct skb_checksum_ops *crc32c_csum_stub __read_mostly =
_crc32c_ops;

> +static const struct skb_checksum_ops *crc32c_csum_ops __read_mostly =
> + &(struct skb_checksum_ops) {
> + .update  = sctp_csum_update,
> + .combine = sctp_csum_combine,
> +};

Likewise.


Re: [PATCH] hdlcdrv: fix divide error bug if bitrate is 0

2017-05-17 Thread walter harms


Am 17.05.2017 15:42, schrieb Firo Yang:
> On Wed, May 17, 2017 at 02:59:39PM +0200, walter harms wrote:
>>
>>
>> Am 17.05.2017 14:35, schrieb Firo Yang:
>>> The divisor s->par.bitrate will always be 0 until initialized by
>>> ndo_open() and hdlcdrv_open().
>>>
>>> In order to fix this divide zero error, check whether the netdevice
>>> was opened by ndo_open() before performing divide.
>>>
>>> Reported-by: Dmitry Vyukov 
>>> Signed-off-by: Firo Yang 
>>> ---
>>>  drivers/net/hamradio/hdlcdrv.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c
>>> index 8c3633c..3c783fd 100644
>>> --- a/drivers/net/hamradio/hdlcdrv.c
>>> +++ b/drivers/net/hamradio/hdlcdrv.c
>>> @@ -574,7 +574,7 @@ static int hdlcdrv_ioctl(struct net_device *dev, struct 
>>> ifreq *ifr, int cmd)
>>> break;  
>>>  
>>> case HDLCDRVCTL_CALIBRATE:
>>> -   if(!capable(CAP_SYS_RAWIO))
>>> +   if (!capable(CAP_SYS_RAWIO) || !netif_running(dev))
>>> return -EPERM;
>>> if (bi.data.calibrate > INT_MAX / s->par.bitrate)
>>> return -EINVAL;
>>
>> I would still check for s->par.bitrate > 0 later changes may affect the 
>> setting of it
>> and it is much more obvious.
> 
> I think 0 is not valid value for bitrate, so we should check it in
> other places, like what ser12_open() did:
> 429 if (bc->baud < 300 || bc->baud > 4800) {
> 430 printk(KERN_INFO "baycom_ser_fdx: invalid baudrate "
> 431 "(300...4800)\n");
> 432 return -EINVAL;
> 433 }
> ...
> 440 bc->hdrv.par.bitrate = bc->baud;


I do not want to say you change is not valid but i have learned that it is 
better to
have an obvious check that to rely on hidden knowledge.


> 
>>
>> Also perhaps !netif_running(dev) should better return ENODEV.
> 
> However, the 'dev' truly exists in this circumstance.
> 

yes and i do not feel good with that but "no permission" will lead
any enduser into a search for user rights.



re,
 wh


> Thanks,
> Firo
> 
>>
>>
>> just my 2 cents,
>> re,
>> wh
>>


Re: [PATCH net-next] cxgb4: add new T5 pci device id

2017-05-17 Thread David Miller
From: Ganesh Goudar 
Date: Tue, 16 May 2017 21:39:05 +0530

> Signed-off-by: Ganesh Goudar 

Applied.


Re: [PATCH net-next] cxgb4: reduce resource allocation in kdump kernel

2017-05-17 Thread David Miller
From: Ganesh Goudar 
Date: Tue, 16 May 2017 21:17:42 +0530

> When is_kdump_kernel() is true, reduce memory footprint of
> cxgb4 by using a single "Queue Set".
> 
> Signed-off-by: Ganesh Goudar 

Applied.


  1   2   3   >