Re: [PATCH RFC 1/5] net: phy: sun8i-h3-ephy: Add bindings for Allwinner H3 Ethernet PHY

2016-05-06 Thread Chen-Yu Tsai
Hi,

On Tue, Apr 12, 2016 at 9:38 AM, Chen-Yu Tsai wrote:
> On Tue, Apr 12, 2016 at 3:23 AM, Florian Fainelli wrote:
>> On 04/04/16 09:22, Chen-Yu Tsai wrote:
>>> The Allwinner H3 SoC incorporates an Ethernet PHY. This is enabled and
>>> configured through a memory mapped hardware register.
>>>
>>> This same register also configures the MAC interface mode and TX clock
>>> source. Also covered by the register, but not supported in these bindings,
>>> are TX/RX clock delay chains and inverters, and an RMII module.
>>>
>>> Signed-off-by: Chen-Yu Tsai 
>>> ---
>>>  .../bindings/net/allwinner,sun8i-h3-ephy.txt   | 44 ++
>>>  1 file changed, 44 insertions(+)
>>>  create mode 100644 Documentation/devicetree/bindings/net/allwinner,sun8i-h3-ephy.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/net/allwinner,sun8i-h3-ephy.txt b/Documentation/devicetree/bindings/net/allwinner,sun8i-h3-ephy.txt
>>> new file mode 100644
>>> index ..146f227e6d88
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/net/allwinner,sun8i-h3-ephy.txt
>>> @@ -0,0 +1,44 @@
>>> +* Allwinner H3 E(thernet) PHY
>>> +
>>> +The Allwinner H3 integrates an MII ethernet PHY. As with external PHYs,
>>> +before it can be configured over the MDIO bus and used, certain hardware
>>> +features must be configured, such as the PHY address and LED polarity.
>>
>> Is the internal PHY address really configurable? Not that there is
>> anything wrong with it, this is good.
>
> It is. Things that are configured or provided to a discrete PHY are routed
> to registers in the SoC, things such as PHY address, clocks, resets.
>
>>> +The PHY must also be powered on and brought out of reset.
>>> +
>>> +This is accomplished with regulators and pull-up/downs for external PHYs.
>>> +For this internal PHY, a hardware register is programmed.
>>> +
>>> +The same hardware register also contains clock and interface controls
>>> +for the MAC. This is also present in earlier SoCs, and is covered by
>>> +"allwinner,sun7i-a20-gmac-clk". The controls in the H3 are slightly
>>> +different due to the inclusion of what appears to be an RMII-MII
>>> +bridge.
>>> +
>>> +Required properties:
>>> +- compatible: should be "allwinner,sun8i-h3-ephy"
>>> +- reg: address and length of the register set for the device
>>> +- clocks: A phandle to the reference clock for this device
>>> +- resets: A phandle to the reset control for this device
>>> +
>>> +Ethernet PHY related properties:
>>> +- allwinner,ephy-addr: the MDIO bus address the PHY should respond to.
>>> +If this is not set, the external PHY is used, and
>>> +everything else in this section is ignored.
>>
>> So we are going to put the same value here, and in the actual Ethernet
>> PHY device tree node that should be hanging off the EMAC/MDIO bus
>> controller, this is confusing and error prone.
>
> Yes, that would be an issue when writing the DTS.
>>
>>> +- allwinner,leds-active-low: LEDs are active low. Without this, LEDs are
>>> +  active high.
>>> +
>>> +Ethernet MAC clock related properties:
>>> +- #clock-cells: should be 0
>>> +- clock-output-names: "mac_tx"
>>> +
>>> +Example:
>>> +
>>> +ethernet-phy@01c00030 {
>>> + compatible = "allwinner,sun8i-h3-ephy";
>>> + reg = <0x01c00030 0x4>;
>>
>> Looking at this register space this looks more like an internal PHY SHIM
>> that needs to be configured before the internal PHY can be accessed,
>> not a proper Ethernet PHY per se, see replies in patch 2.
>>
>> Should not this block be a second cell associated with the Ethernet MAC
>> block? One or the other are not going to be very useful without
>> knowledge of each other.
>
> True. However the lower half of the same register also controls the
> MAC interface mode and TX clock source and delays. For this we had a clock
> driver that was used in conjunction with stmmac on earlier SoCs. I was
> hoping to keep that model with the newer EMAC. At the time it was
> argued that what seemed like a clock should be handled by a clock
> driver, instead of just a "syscon". If this is reaching too far to
> handle this new use case, I will happily just provide patches to merge
> this into the MAC.

Maxime, Hans, any thoughts?

It seems like it'd be easier to just fold this into the EMAC driver.
The register is not part of the clock controller in these new SoCs,
so it's nicer than what we had in A20/A31. It's also not just a clock
control, but a bunch of various controls.

ChenYu

> I would like to know how to deal with things like a PHY requiring
> some sort of shim driver, be it an internal one, or an external mfd
> chip that happens to have an Ethernet PHY included? How do we tie
> this into the PHY node under the MDIO bus?


[PATCH net-next] tcp: refactor struct tcp_skb_cb

2016-05-06 Thread Lawrence Brakmo
Refactor tcp_skb_cb to create two overlapping areas to store
state for incoming or outgoing skbs based on comments by
Neal Cardwell to tcp_nv patch:

   AFAICT this patch would not require an increase in the size of
   sk_buff cb[] if it were to take advantage of the fact that the
   tcp_skb_cb header.h4 and header.h6 fields are only used in the packet
   reception code path, and this in_flight field is only used on the
   transmit side.

Signed-off-by: Lawrence Brakmo 
Acked-by: Yuchung Cheng 
---
 include/net/tcp.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index dc588c3..c9ab561 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -765,11 +765,16 @@ struct tcp_skb_cb {
 			unused:6;
 	__u32		ack_seq;	/* Sequence number ACK'd	*/
 	union {
-		struct inet_skb_parm	h4;
+		struct {
+			/* There is space for up to 20 bytes */
+		} tx;   /* only used for outgoing skbs */
+		union {
+			struct inet_skb_parm	h4;
 #if IS_ENABLED(CONFIG_IPV6)
-		struct inet6_skb_parm	h6;
+			struct inet6_skb_parm	h6;
 #endif
-	} header;	/* For incoming frames		*/
+		} header;	/* For incoming skbs */
+	};
 };
 
 #define TCP_SKB_CB(__skb)  ((struct tcp_skb_cb *)&((__skb)->cb[0]))
-- 
2.8.0.rc2
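
The overlay described in this patch can be sketched in a few lines of standalone C. This is only an illustration, not the kernel definition: the struct names, the field layout, and the `in_flight` member (taken from Neal's comment about the tcp_nv patch) are assumptions, and the 48-byte limit mirrors the size of `cb[]` in `struct sk_buff`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CB_SIZE 48 /* assumed to match sizeof(skb->cb) */

/* Hypothetical stand-ins for inet_skb_parm and inet6_skb_parm. */
struct demo_parm4 { int iif; unsigned char opt[12]; unsigned short flags; };
struct demo_parm6 { int iif; unsigned char opts[16]; };

/* Simplified control block: the tx and header areas share storage. */
struct demo_skb_cb {
	uint32_t seq;
	uint32_t end_seq;
	uint32_t ack_seq;
	union {
		struct {
			uint32_t in_flight;	/* hypothetical tx-only state */
		} tx;				/* only used for outgoing skbs */
		union {
			struct demo_parm4 h4;
			struct demo_parm6 h6;
		} header;			/* only used for incoming skbs */
	};
};

/* Adding the tx area must not grow the control block. */
static_assert(sizeof(struct demo_skb_cb) <= DEMO_CB_SIZE,
	      "control block does not fit in cb[]");
/* Both areas start at the same offset, so they overlap. */
static_assert(offsetof(struct demo_skb_cb, tx) ==
	      offsetof(struct demo_skb_cb, header),
	      "tx and header do not overlap");

size_t demo_cb_size(void)
{
	return sizeof(struct demo_skb_cb);
}
```

Because an skb is either on the receive path or the transmit path at any given time, only one of the two areas is live, which is what makes sharing the bytes safe.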



[PATCH net-next] tcp: replace cnt & rtt with struct in pkts_acked()

2016-05-06 Thread Lawrence Brakmo
Replace 2 arguments (cnt and rtt) in the congestion control modules'
pkts_acked() function with a struct. This will allow adding more
information without having to modify existing congestion control
modules (tcp_nv in particular needs bytes in flight when packet
was sent).

As proposed by Neal Cardwell in his comments to the tcp_nv patch.

Signed-off-by: Lawrence Brakmo 
Acked-by: Yuchung Cheng 
---
 include/net/tcp.h   |  7 ++-
 net/ipv4/tcp_bic.c  |  6 +++---
 net/ipv4/tcp_cdg.c  | 14 +++---
 net/ipv4/tcp_cubic.c|  6 +++---
 net/ipv4/tcp_htcp.c | 10 +-
 net/ipv4/tcp_illinois.c | 20 ++--
 net/ipv4/tcp_input.c|  7 +--
 net/ipv4/tcp_lp.c   |  6 +++---
 net/ipv4/tcp_vegas.c|  6 +++---
 net/ipv4/tcp_vegas.h|  2 +-
 net/ipv4/tcp_veno.c |  7 ---
 net/ipv4/tcp_westwood.c |  7 ---
 net/ipv4/tcp_yeah.c |  7 ---
 13 files changed, 58 insertions(+), 47 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 24ec804..dc588c3 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -849,6 +849,11 @@ enum tcp_ca_ack_event_flags {
 
 union tcp_cc_info;
 
+struct ack_sample {
+   u32 pkts_acked;
+   s32 rtt_us;
+};
+
 struct tcp_congestion_ops {
struct list_headlist;
u32 key;
@@ -872,7 +877,7 @@ struct tcp_congestion_ops {
/* new value of cwnd after loss (optional) */
u32  (*undo_cwnd)(struct sock *sk);
/* hook for packet ack accounting (optional) */
-   void (*pkts_acked)(struct sock *sk, u32 num_acked, s32 rtt_us);
+   void (*pkts_acked)(struct sock *sk, const struct ack_sample *sample);
/* get info for inet_diag (optional) */
size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
   union tcp_cc_info *info);
diff --git a/net/ipv4/tcp_bic.c b/net/ipv4/tcp_bic.c
index fd1405d..f469f1b 100644
--- a/net/ipv4/tcp_bic.c
+++ b/net/ipv4/tcp_bic.c
@@ -197,15 +197,15 @@ static void bictcp_state(struct sock *sk, u8 new_state)
 /* Track delayed acknowledgment ratio using sliding window
  * ratio = (15*ratio + sample) / 16
  */
-static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt)
+static void bictcp_acked(struct sock *sk, const struct ack_sample *sample)
 {
const struct inet_connection_sock *icsk = inet_csk(sk);
 
if (icsk->icsk_ca_state == TCP_CA_Open) {
struct bictcp *ca = inet_csk_ca(sk);
 
-   cnt -= ca->delayed_ack >> ACK_RATIO_SHIFT;
-   ca->delayed_ack += cnt;
+   ca->delayed_ack += sample->pkts_acked - 
+   (ca->delayed_ack >> ACK_RATIO_SHIFT);
}
 }
 
diff --git a/net/ipv4/tcp_cdg.c b/net/ipv4/tcp_cdg.c
index ccce8a5..03725b2 100644
--- a/net/ipv4/tcp_cdg.c
+++ b/net/ipv4/tcp_cdg.c
@@ -294,12 +294,12 @@ static void tcp_cdg_cong_avoid(struct sock *sk, u32 ack, u32 acked)
ca->shadow_wnd = max(ca->shadow_wnd, ca->shadow_wnd + incr);
 }
 
-static void tcp_cdg_acked(struct sock *sk, u32 num_acked, s32 rtt_us)
+static void tcp_cdg_acked(struct sock *sk, const struct ack_sample *sample)
 {
struct cdg *ca = inet_csk_ca(sk);
struct tcp_sock *tp = tcp_sk(sk);
 
-   if (rtt_us <= 0)
+   if (sample->rtt_us <= 0)
return;
 
/* A heuristic for filtering delayed ACKs, adapted from:
@@ -307,20 +307,20 @@ static void tcp_cdg_acked(struct sock *sk, u32 num_acked, s32 rtt_us)
 * delay and rate based TCP mechanisms." TR 100219A. CAIA, 2010.
 */
if (tp->sacked_out == 0) {
-   if (num_acked == 1 && ca->delack) {
+   if (sample->pkts_acked == 1 && ca->delack) {
/* A delayed ACK is only used for the minimum if it is
 * provenly lower than an existing non-zero minimum.
 */
-   ca->rtt.min = min(ca->rtt.min, rtt_us);
+   ca->rtt.min = min(ca->rtt.min, sample->rtt_us);
ca->delack--;
return;
-   } else if (num_acked > 1 && ca->delack < 5) {
+   } else if (sample->pkts_acked > 1 && ca->delack < 5) {
ca->delack++;
}
}
 
-   ca->rtt.min = min_not_zero(ca->rtt.min, rtt_us);
-   ca->rtt.max = max(ca->rtt.max, rtt_us);
+   ca->rtt.min = min_not_zero(ca->rtt.min, sample->rtt_us);
+   ca->rtt.max = max(ca->rtt.max, sample->rtt_us);
 }
 
 static u32 tcp_cdg_ssthresh(struct sock *sk)
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 0ce946e..c99230e 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -437,21 +437,21 @@ static void hystart_update(struct sock *sk, u32 delay)
 /* Track delayed acknowledgment ratio using sliding window
  * ratio = (15*ratio + sample) / 16
  */
-static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
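
The tcp_bic.c hunk above rewrites the delayed-ACK update because the count now arrives through a const pointer and can no longer be modified in place. A minimal userspace sketch of why the two forms are equivalent follows; the helper names are hypothetical, and `ACK_RATIO_SHIFT` is set to 4 to match the "(15*ratio + sample) / 16" comment.

```c
#include <assert.h>
#include <stdint.h>

#define ACK_RATIO_SHIFT 4

/* Same two fields as the struct added to include/net/tcp.h. */
struct ack_sample {
	uint32_t pkts_acked;
	int32_t  rtt_us;
};

/* Old form: adjust the by-value count, then accumulate it. */
uint32_t delayed_ack_old(uint32_t delayed_ack, uint32_t cnt)
{
	cnt -= delayed_ack >> ACK_RATIO_SHIFT;
	delayed_ack += cnt;
	return delayed_ack;
}

/* New form: read the count through the const sample in one expression. */
uint32_t delayed_ack_new(uint32_t delayed_ack, const struct ack_sample *sample)
{
	delayed_ack += sample->pkts_acked - (delayed_ack >> ACK_RATIO_SHIFT);
	return delayed_ack;
}
```

Both forms rely on unsigned wraparound when the subtracted term exceeds the count, so the result is the same even when the intermediate value underflows.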

Re: [PATCH v3 net-next 00/11] ipv6: Enable GUEoIPv6 and more fixes for v6 tunneling

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 7:11 PM, Tom Herbert wrote:
> On Fri, May 6, 2016 at 7:03 PM, Alexander Duyck wrote:
>> On Fri, May 6, 2016 at 6:57 PM, Tom Herbert wrote:
>>> On Fri, May 6, 2016 at 6:09 PM, Alexander Duyck wrote:
>>>> On Fri, May 6, 2016 at 3:11 PM, Tom Herbert wrote:
>>>>> This patch set:
>>>>>   - Fixes GRE6 to process translate flags correctly from configuration
>>>>>   - Adds support for GSO and GRO for ip6ip6 and ip4ip6
>>>>>   - Add support for FOU and GUE in IPv6
>>>>>   - Support GRE, ip6ip6 and ip4ip6 over FOU/GUE
>>>>>   - Fixes ip6_input to deal with UDP encapsulations
>>>>>   - Some other minor fixes
>>>>>
>>>>> v2:
>>>>>   - Removed a check of GSO types in MPLS
>>>>>   - Define GSO type SKB_GSO_IPXIP6 and SKB_GSO_IPXIP4 (based on input
>>>>> from Alexander)
>>>>>   - Don't define GSO types specifically for IP6IP6 and IP4IP6, above
>>>>> fix makes that unnecessary
>>>>>   - Don't bother clearing encapsulation flag in UDP tunnel segment
>>>>> (another item suggested by Alexander).
>>>>>
>>>>> v3:
>>>>>   - Address some minor comments from Alexander
>>>>>
>>>>> Tested:
>>>>>    Tested a variety of cases, but not the full matrix (which is quite
>>>>>    large now). Most of the obvious cases (e.g. GRE) work fine. Still
>>>>>    some issues probably with GSO/GRO being effective in all cases.
>>>>>
>>>>> - IPv4/GRE/GUE/IPv6 with RCO
>>>>>   1 TCP_STREAM
>>>>> 6616 Mbps
>>>>>   200 TCP_RR
>>>>> 1244043 tps
>>>>> 141/243/446 90/95/99% latencies
>>>>> 86.61% CPU utilization
>>>>> - IPv6/GRE/GUE/IPv6 with RCO
>>>>>   1 TCP_STREAM
>>>>> 6940 Mbps
>>>>>   200 TCP_RR
>>>>> 1270903 tps
>>>>> 138/236/440 90/95/99% latencies
>>>>> 87.51% CPU utilization
>>>>>
>>>>>  - IP6IP6
>>>>>   1 TCP_STREAM
>>>>> 2576 Mbps
>>>>>   200 TCP_RR
>>>>> 498981 tps
>>>>> 388/498/631 90/95/99% latencies
>>>>> 19.75% CPU utilization (1 CPU saturated)
>>>>>
>>>>>  - IP6IP6/GUE/IPv6 with RCO
>>>>>   1 TCP_STREAM
>>>>> 1854 Mbps
>>>>>   200 TCP_RR
>>>>> 1233818 tps
>>>>> 143/244/451 90/95/99% latencies
>>>>> 87.57% CPU utilization
>>>>>
>>>>>  - IP4IP6
>>>>>   1 TCP_STREAM
>>>>>   200 TCP_RR
>>>>> 763774 tps
>>>>> 250/318/466 90/95/99% latencies
>>>>> 35.25% CPU utilization (1 CPU saturated)
>>>>>
>>>>>  - GRE with keyid
>>>>>   200 TCP_RR
>>>>> 744173 tps
>>>>> 258/332/461 90/95/99% latencies
>>>>> 34.59% CPU utilization (1 CPU saturated)
>>>>
>>>> So I tried testing your patch set and it looks like I cannot get GRE
>>>> working for any netperf test.  If I pop the patches off it is even
>>>> worse since it looks like patch 3 fixes some tunnel flags issues, but
>>>> still doesn't resolve all the issues introduced with b05229f44228
>>>> ("gre6: Cleanup GREv6 transmit path, call common GRE functions").
>>>> Reverting the entire patch seems to resolve the issues, but I will try
>>>> to pick it apart tonight to see if I can find the other issues that
>>>> weren't addressed in this patch series.
>>>>
>>>
>>> Can you give details about configuration, test you're running, and HW?
>>
>> The issue looks like it may be specific to ip6gretap.  I'm running the
>> test over an i40e adapter, but it shouldn't make much difference.  I'm
>> thinking it may have something to do with the MTU configuration as
>> that is one of the things I am noticing has changed between the
>> working and the broken version of the code.
>>
> I'm not seeing any issue with configuring:
>
> ip link add name tun8 type ip6gretap remote
> 2401:db00:20:911a:face:0:27:0 local 2401:db00:20:911a:face:0:25:0 ttl
> 225
>
> MTU issues would not surprise me with IPv6 though. This is part of the
> area of code that seems drastically different than what IPv4 is doing.

I am also using a key.

ip link add $name type ip6gretap key $net \
local fec0::1 remote $addr6 ttl 225 dev $PF0

Does the device you are using support any kind of checksum offload for
inner headers on GRE tunnels?  It looks like if I turn off checksums
and correct the MTU I can then send traffic without issues.  I'd say
that the Tx cleanup probably introduced 3 regressions.  The first one
you addressed in patch 3 which fixes the flags.  The second being the
fact that the MTU is wrong, and the third being something that
apparently broke checksum and maybe segmentation offload for
ip6gretap.
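
As a rough illustration of the MTU point, the payload MTU of a keyed ip6gretap device can be estimated from the fixed header sizes. This sketch assumes a plain IPv6 outer header with no extension headers (for example, no tunnel encapsulation limit option) and uses hypothetical helper and constant names; it is not the calculation the kernel driver performs.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	DEMO_IPV6_HDR_LEN = 40,	/* fixed outer IPv6 header */
	DEMO_GRE_BASE_LEN = 4,	/* GRE flags + protocol type */
	DEMO_GRE_KEY_LEN  = 4,	/* optional GRE key field */
	DEMO_ETH_HDR_LEN  = 14,	/* inner Ethernet header carried by a tap device */
};

/* Estimate the inner MTU left after encapsulation overhead is removed. */
int demo_ip6gretap_mtu(int link_mtu, bool has_key)
{
	int overhead = DEMO_IPV6_HDR_LEN + DEMO_GRE_BASE_LEN + DEMO_ETH_HDR_LEN;

	if (has_key)
		overhead += DEMO_GRE_KEY_LEN;
	return link_mtu - overhead;
}
```

With a 1500-byte underlay this gives 1442 bytes without a key and 1438 with one, so a device MTU above those values would be a hint that the overhead is being computed incorrectly.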

Really I think the transmit path cleanup should have probably been
broken down into a set of patches rather than slamming it in all in
one block.  I can spend some time next week trying to sort it out if
you don't have any hardware that supports GRE segmentation or checksum
offload.  If worse comes to worse I will just try 

Re: [PATCH v3 net-next 01/11] gso: Remove arbitrary checks for unsupported GSO

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 3:11 PM, Tom Herbert  wrote:
> diff --git a/net/mpls/mpls_gso.c b/net/mpls/mpls_gso.c
> index bbcf604..6de1e13 100644
> --- a/net/mpls/mpls_gso.c
> +++ b/net/mpls/mpls_gso.c
> @@ -26,15 +26,6 @@ static struct sk_buff *mpls_gso_segment(struct sk_buff *skb,
> netdev_features_t mpls_features;
> __be16 mpls_protocol;
>
> -   if (unlikely(skb_shinfo(skb)->gso_type &
> -   ~(SKB_GSO_TCPV4 |
> - SKB_GSO_TCPV6 |
> - SKB_GSO_UDP |
> - SKB_GSO_DODGY |
> - SKB_GSO_TCP_FIXEDID |
> - SKB_GSO_TCP_ECN)))
> -   goto out;
> -
> /* Setup inner SKB. */
> mpls_protocol = skb->protocol;
> skb->protocol = skb->inner_protocol;

Actually I just noticed a build warning for this patch.  I guess the
label out isn't used anywhere else so you should probably drop it for
the mpls_gso_segment function.

- Alex
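
For context, the check removed in this patch is an allow-list mask test: any GSO type bit outside the list causes the segment function to bail out. A small sketch with made-up flag values (the real SKB_GSO_* constants are defined in the kernel headers and differ from these) shows why such a list rejects every newly introduced type until it is updated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
enum {
	DEMO_GSO_TCPV4  = 1 << 0,
	DEMO_GSO_UDP    = 1 << 1,
	DEMO_GSO_TCPV6  = 1 << 2,
	DEMO_GSO_IPXIP6 = 1 << 3,	/* stands in for a newly added type */
};

#define DEMO_GSO_ALLOWED (DEMO_GSO_TCPV4 | DEMO_GSO_TCPV6 | DEMO_GSO_UDP)

/* Returns true when gso_type carries any bit outside the allow-list. */
bool demo_gso_rejected(unsigned int gso_type)
{
	return (gso_type & ~(unsigned int)DEMO_GSO_ALLOWED) != 0;
}
```

A packet marked with both a listed type and the new type is rejected, even though nothing in the handler itself depends on the new bit.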


Re: [PATCH v3 net-next 00/11] ipv6: Enable GUEoIPv6 and more fixes for v6 tunneling

2016-05-06 Thread Tom Herbert
On Fri, May 6, 2016 at 7:03 PM, Alexander Duyck wrote:
> On Fri, May 6, 2016 at 6:57 PM, Tom Herbert wrote:
>> On Fri, May 6, 2016 at 6:09 PM, Alexander Duyck wrote:
>>> On Fri, May 6, 2016 at 3:11 PM, Tom Herbert wrote:
>>>> This patch set:
>>>>   - Fixes GRE6 to process translate flags correctly from configuration
>>>>   - Adds support for GSO and GRO for ip6ip6 and ip4ip6
>>>>   - Add support for FOU and GUE in IPv6
>>>>   - Support GRE, ip6ip6 and ip4ip6 over FOU/GUE
>>>>   - Fixes ip6_input to deal with UDP encapsulations
>>>>   - Some other minor fixes
>>>>
>>>> v2:
>>>>   - Removed a check of GSO types in MPLS
>>>>   - Define GSO type SKB_GSO_IPXIP6 and SKB_GSO_IPXIP4 (based on input
>>>> from Alexander)
>>>>   - Don't define GSO types specifically for IP6IP6 and IP4IP6, above
>>>> fix makes that unnecessary
>>>>   - Don't bother clearing encapsulation flag in UDP tunnel segment
>>>> (another item suggested by Alexander).
>>>>
>>>> v3:
>>>>   - Address some minor comments from Alexander
>>>>
>>>> Tested:
>>>>    Tested a variety of cases, but not the full matrix (which is quite
>>>>    large now). Most of the obvious cases (e.g. GRE) work fine. Still
>>>>    some issues probably with GSO/GRO being effective in all cases.
>>>>
>>>> - IPv4/GRE/GUE/IPv6 with RCO
>>>>   1 TCP_STREAM
>>>> 6616 Mbps
>>>>   200 TCP_RR
>>>> 1244043 tps
>>>> 141/243/446 90/95/99% latencies
>>>> 86.61% CPU utilization
>>>> - IPv6/GRE/GUE/IPv6 with RCO
>>>>   1 TCP_STREAM
>>>> 6940 Mbps
>>>>   200 TCP_RR
>>>> 1270903 tps
>>>> 138/236/440 90/95/99% latencies
>>>> 87.51% CPU utilization
>>>>
>>>>  - IP6IP6
>>>>   1 TCP_STREAM
>>>> 2576 Mbps
>>>>   200 TCP_RR
>>>> 498981 tps
>>>> 388/498/631 90/95/99% latencies
>>>> 19.75% CPU utilization (1 CPU saturated)
>>>>
>>>>  - IP6IP6/GUE/IPv6 with RCO
>>>>   1 TCP_STREAM
>>>> 1854 Mbps
>>>>   200 TCP_RR
>>>> 1233818 tps
>>>> 143/244/451 90/95/99% latencies
>>>> 87.57% CPU utilization
>>>>
>>>>  - IP4IP6
>>>>   1 TCP_STREAM
>>>>   200 TCP_RR
>>>> 763774 tps
>>>> 250/318/466 90/95/99% latencies
>>>> 35.25% CPU utilization (1 CPU saturated)
>>>>
>>>>  - GRE with keyid
>>>>   200 TCP_RR
>>>> 744173 tps
>>>> 258/332/461 90/95/99% latencies
>>>> 34.59% CPU utilization (1 CPU saturated)
>>>
>>> So I tried testing your patch set and it looks like I cannot get GRE
>>> working for any netperf test.  If I pop the patches off it is even
>>> worse since it looks like patch 3 fixes some tunnel flags issues, but
>>> still doesn't resolve all the issues introduced with b05229f44228
>>> ("gre6: Cleanup GREv6 transmit path, call common GRE functions").
>>> Reverting the entire patch seems to resolve the issues, but I will try
>>> to pick it apart tonight to see if I can find the other issues that
>>> weren't addressed in this patch series.
>>>
>>
>> Can you give details about configuration, test you're running, and HW?
>
> The issue looks like it may be specific to ip6gretap.  I'm running the
> test over an i40e adapter, but it shouldn't make much difference.  I'm
> thinking it may have something to do with the MTU configuration as
> that is one of the things I am noticing has changed between the
> working and the broken version of the code.
>
I'm not seeing any issue with configuring:

ip link add name tun8 type ip6gretap remote
2401:db00:20:911a:face:0:27:0 local 2401:db00:20:911a:face:0:25:0 ttl
225

MTU issues would not surprise me with IPv6 though. This is part of the
area of code that seems drastically different than what IPv4 is doing.

Tom


> - Alex


Re: [PATCH v3 net-next 00/11] ipv6: Enable GUEoIPv6 and more fixes for v6 tunneling

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 6:57 PM, Tom Herbert wrote:
> On Fri, May 6, 2016 at 6:09 PM, Alexander Duyck wrote:
>> On Fri, May 6, 2016 at 3:11 PM, Tom Herbert wrote:
>>> This patch set:
>>>   - Fixes GRE6 to process translate flags correctly from configuration
>>>   - Adds support for GSO and GRO for ip6ip6 and ip4ip6
>>>   - Add support for FOU and GUE in IPv6
>>>   - Support GRE, ip6ip6 and ip4ip6 over FOU/GUE
>>>   - Fixes ip6_input to deal with UDP encapsulations
>>>   - Some other minor fixes
>>>
>>> v2:
>>>   - Removed a check of GSO types in MPLS
>>>   - Define GSO type SKB_GSO_IPXIP6 and SKB_GSO_IPXIP4 (based on input
>>> from Alexander)
>>>   - Don't define GSO types specifically for IP6IP6 and IP4IP6, above
>>> fix makes that unnecessary
>>>   - Don't bother clearing encapsulation flag in UDP tunnel segment
>>> (another item suggested by Alexander).
>>>
>>> v3:
>>>   - Address some minor comments from Alexander
>>>
>>> Tested:
>>>    Tested a variety of cases, but not the full matrix (which is quite
>>>    large now). Most of the obvious cases (e.g. GRE) work fine. Still
>>>some issues probably with GSO/GRO being effective in all cases.
>>>
>>> - IPv4/GRE/GUE/IPv6 with RCO
>>>   1 TCP_STREAM
>>> 6616 Mbps
>>>   200 TCP_RR
>>> 1244043 tps
>>> 141/243/446 90/95/99% latencies
>>> 86.61% CPU utilization
>>> - IPv6/GRE/GUE/IPv6 with RCO
>>>   1 TCP_STREAM
>>> 6940 Mbps
>>>   200 TCP_RR
>>> 1270903 tps
>>> 138/236/440 90/95/99% latencies
>>> 87.51% CPU utilization
>>>
>>>  - IP6IP6
>>>   1 TCP_STREAM
>>> 2576 Mbps
>>>   200 TCP_RR
>>> 498981 tps
>>> 388/498/631 90/95/99% latencies
>>> 19.75% CPU utilization (1 CPU saturated)
>>>
>>>  - IP6IP6/GUE/IPv6 with RCO
>>>   1 TCP_STREAM
>>> 1854 Mbps
>>>   200 TCP_RR
>>> 1233818 tps
>>> 143/244/451 90/95/99% latencies
>>> 87.57% CPU utilization
>>>
>>>  - IP4IP6
>>>   1 TCP_STREAM
>>>   200 TCP_RR
>>> 763774 tps
>>> 250/318/466 90/95/99% latencies
>>> 35.25% CPU utilization (1 CPU saturated)
>>>
>>>  - GRE with keyid
>>>   200 TCP_RR
>>> 744173 tps
>>> 258/332/461 90/95/99% latencies
>>> 34.59% CPU utilization (1 CPU saturated)
>>
>> So I tried testing your patch set and it looks like I cannot get GRE
>> working for any netperf test.  If I pop the patches off it is even
>> worse since it looks like patch 3 fixes some tunnel flags issues, but
>> still doesn't resolve all the issues introduced with b05229f44228
>> ("gre6: Cleanup GREv6 transmit path, call common GRE functions").
>> Reverting the entire patch seems to resolve the issues, but I will try
>> to pick it apart tonight to see if I can find the other issues that
>> weren't addressed in this patch series.
>>
>
> Can you give details about configuration, test you're running, and HW?

The issue looks like it may be specific to ip6gretap.  I'm running the
test over an i40e adapter, but it shouldn't make much difference.  I'm
thinking it may have something to do with the MTU configuration as
that is one of the things I am noticing has changed between the
working and the broken version of the code.

- Alex


Re: [PATCH v3 net-next 00/11] ipv6: Enable GUEoIPv6 and more fixes for v6 tunneling

2016-05-06 Thread Tom Herbert
On Fri, May 6, 2016 at 6:09 PM, Alexander Duyck wrote:
> On Fri, May 6, 2016 at 3:11 PM, Tom Herbert wrote:
>> This patch set:
>>   - Fixes GRE6 to process translate flags correctly from configuration
>>   - Adds support for GSO and GRO for ip6ip6 and ip4ip6
>>   - Add support for FOU and GUE in IPv6
>>   - Support GRE, ip6ip6 and ip4ip6 over FOU/GUE
>>   - Fixes ip6_input to deal with UDP encapsulations
>>   - Some other minor fixes
>>
>> v2:
>>   - Removed a check of GSO types in MPLS
>>   - Define GSO type SKB_GSO_IPXIP6 and SKB_GSO_IPXIP4 (based on input
>> from Alexander)
>>   - Don't define GSO types specifically for IP6IP6 and IP4IP6, above
>> fix makes that unnecessary
>>   - Don't bother clearing encapsulation flag in UDP tunnel segment
>> (another item suggested by Alexander).
>>
>> v3:
>>   - Address some minor comments from Alexander
>>
>> Tested:
>>    Tested a variety of cases, but not the full matrix (which is quite
>>    large now). Most of the obvious cases (e.g. GRE) work fine. Still
>>some issues probably with GSO/GRO being effective in all cases.
>>
>> - IPv4/GRE/GUE/IPv6 with RCO
>>   1 TCP_STREAM
>> 6616 Mbps
>>   200 TCP_RR
>> 1244043 tps
>> 141/243/446 90/95/99% latencies
>> 86.61% CPU utilization
>> - IPv6/GRE/GUE/IPv6 with RCO
>>   1 TCP_STREAM
>> 6940 Mbps
>>   200 TCP_RR
>> 1270903 tps
>> 138/236/440 90/95/99% latencies
>> 87.51% CPU utilization
>>
>>  - IP6IP6
>>   1 TCP_STREAM
>> 2576 Mbps
>>   200 TCP_RR
>> 498981 tps
>> 388/498/631 90/95/99% latencies
>> 19.75% CPU utilization (1 CPU saturated)
>>
>>  - IP6IP6/GUE/IPv6 with RCO
>>   1 TCP_STREAM
>> 1854 Mbps
>>   200 TCP_RR
>> 1233818 tps
>> 143/244/451 90/95/99% latencies
>> 87.57% CPU utilization
>>
>>  - IP4IP6
>>   1 TCP_STREAM
>>   200 TCP_RR
>> 763774 tps
>> 250/318/466 90/95/99% latencies
>> 35.25% CPU utilization (1 CPU saturated)
>>
>>  - GRE with keyid
>>   200 TCP_RR
>> 744173 tps
>> 258/332/461 90/95/99% latencies
>> 34.59% CPU utilization (1 CPU saturated)
>
> So I tried testing your patch set and it looks like I cannot get GRE
> working for any netperf test.  If I pop the patches off it is even
> worse since it looks like patch 3 fixes some tunnel flags issues, but
> still doesn't resolve all the issues introduced with b05229f44228
> ("gre6: Cleanup GREv6 transmit path, call common GRE functions").
> Reverting the entire patch seems to resolve the issues, but I will try
> to pick it apart tonight to see if I can find the other issues that
> weren't addressed in this patch series.
>

Can you give details about configuration, test you're running, and HW?

> - Alex


[PATCH net-next 2/2] net: original ingress device index in PKTINFO

2016-05-06 Thread David Ahern
Applications such as OSPF and BFD need the original ingress device not
the VRF device; the latter can be derived from the former. To that end
add the skb_iif to inet_skb_parm and set it in ipv4 code after clearing
the skb control buffer similar to IPv6. From there the pktinfo can just
pull it from cb with the PKTINFO_SKB_CB cast.

The previous patch moving the skb->dev change to L3 means nothing else
is needed for IPv6; it just works.

Signed-off-by: David Ahern 
---
 include/net/ip.h   | 1 +
 net/ipv4/ip_input.c| 1 +
 net/ipv4/ip_sockglue.c | 7 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 247ac82e9cf2..37165fba3741 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -36,6 +36,7 @@
 struct sock;
 
 struct inet_skb_parm {
+   int iif;
struct ip_options   opt;    /* Compiled IP options */
unsigned char   flags;
 
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 37375eedeef9..4b351af3e67b 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -478,6 +478,7 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 
/* Remove any debris in the socket control block */
memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+   IPCB(skb)->iif = skb->skb_iif;
 
/* Must drop socket now because of tproxy. */
skb_orphan(skb);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index bdb222c0c6a2..5805762d7fc7 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -1193,7 +1193,12 @@ void ipv4_pktinfo_prepare(const struct sock *sk, struct sk_buff *skb)
   ipv6_sk_rxinfo(sk);
 
if (prepare && skb_rtable(skb)) {
-   pktinfo->ipi_ifindex = inet_iif(skb);
+   /* skb->cb is overloaded: prior to this point it is IP{6}CB
+* which has interface index (iif) as the first member of the
+* underlying inet{6}_skb_parm struct. This code then overlays
+* PKTINFO_SKB_CB and in_pktinfo also has iif as the first
+* element so the iif is picked up from the prior IPCB
+*/
pktinfo->ipi_spec_dst.s_addr = fib_compute_spec_dst(skb);
} else {
pktinfo->ipi_ifindex = 0;
-- 
2.1.4
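
The comment added in ipv4_pktinfo_prepare() relies on both views of the control buffer having the interface index as their first member. A minimal userspace sketch of that idea follows. All `demo_` names and layouts are assumptions made for illustration, and a union is used here to keep the example well-defined standard C, whereas the kernel casts `skb->cb` directly via its IPCB and PKTINFO_SKB_CB macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for inet_skb_parm with iif as the first member. */
struct demo_inet_skb_parm {
	int iif;
	unsigned char opt[16];
	unsigned char flags;
};

/* Same member order as in_pktinfo: interface index first. */
struct demo_in_pktinfo {
	int ipi_ifindex;
	unsigned int ipi_spec_dst;
	unsigned int ipi_addr;
};

/* Two views of the same control buffer bytes. */
union demo_cb {
	struct demo_inet_skb_parm ip;
	struct demo_in_pktinfo pktinfo;
	char raw[48];
};

struct demo_skb {
	int skb_iif;		/* original ingress device index */
	union demo_cb cb;
};

static_assert(offsetof(struct demo_inet_skb_parm, iif) ==
	      offsetof(struct demo_in_pktinfo, ipi_ifindex),
	      "iif must sit at the same offset in both views");

/* Receive path: clear the control block, then record the ingress index. */
void demo_ip_rcv(struct demo_skb *skb)
{
	memset(&skb->cb.ip, 0, sizeof(skb->cb.ip));
	skb->cb.ip.iif = skb->skb_iif;
}

/* PKTINFO path: fill the other fields and leave the index untouched. */
int demo_pktinfo_ifindex(struct demo_skb *skb, unsigned int spec_dst)
{
	skb->cb.pktinfo.ipi_spec_dst = spec_dst;
	return skb->cb.pktinfo.ipi_ifindex;
}
```

The static assertion captures the layout requirement that the patch depends on: if either struct gained a member ahead of the index, the overlay would silently return the wrong value.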



[PATCH net-next v2 1/2] net: l3mdev: Add hook in ip and ipv6

2016-05-06 Thread David Ahern
Currently the VRF driver uses the rx_handler to switch the skb device
to the VRF device. Switching the dev prior to the ip / ipv6 layer
means the VRF driver has to duplicate IP/IPv6 processing which adds
overhead and makes features such as retaining the ingress device index
more complicated than necessary.

This patch moves the hook to the L3 layer just after the first NF_HOOK
for PRE_ROUTING. This location makes exposing the original ingress device
trivial (next patch) and allows adding other NF_HOOKs to the VRF driver
in the future.

dev_queue_xmit_nit is exported so that the VRF driver can cycle the skb
with the switched device through the packet taps to maintain current
behavior (tcpdump can be used on either the vrf device or the enslaved
devices).

Signed-off-by: David Ahern 
---
v2
- add skb_l3mdev_slave helper for inet6_iif and tcp_v6_iif rather
  than open coding the if case. Added benefit that the change compiles
  out if CONFIG_NET_L3_MASTER_DEV is not enabled.

 drivers/net/vrf.c | 188 ++
 include/linux/ipv6.h  |  17 -
 include/linux/netdevice.h |   2 +
 include/net/l3mdev.h  |  43 +++
 include/net/tcp.h |   4 +-
 net/core/dev.c|   3 +-
 net/ipv4/ip_input.c   |   7 ++
 net/ipv6/ip6_input.c  |   7 ++
 8 files changed, 170 insertions(+), 101 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 4b2461ae5d3b..0e2a58506a35 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -42,9 +42,6 @@
 #define DRV_NAME   "vrf"
 #define DRV_VERSION"1.0"
 
-#define vrf_master_get_rcu(dev) \
-   ((struct net_device *)rcu_dereference(dev->rx_handler_data))
-
 struct net_vrf {
struct rtable   *rth;
struct rt6_info *rt6;
@@ -60,90 +57,12 @@ struct pcpu_dstats {
struct u64_stats_sync   syncp;
 };
 
-/* neighbor handling is done with actual device; do not want
- * to flip skb->dev for those ndisc packets. This really fails
- * for multiple next protocols (e.g., NEXTHDR_HOP). But it is
- * a start.
- */
-#if IS_ENABLED(CONFIG_IPV6)
-static bool check_ipv6_frame(const struct sk_buff *skb)
-{
-   const struct ipv6hdr *ipv6h;
-   struct ipv6hdr _ipv6h;
-   bool rc = true;
-
-   ipv6h = skb_header_pointer(skb, 0, sizeof(_ipv6h), &_ipv6h);
-   if (!ipv6h)
-   goto out;
-
-   if (ipv6h->nexthdr == NEXTHDR_ICMP) {
-   const struct icmp6hdr *icmph;
-   struct icmp6hdr _icmph;
-
-   icmph = skb_header_pointer(skb, sizeof(_ipv6h),
-  sizeof(_icmph), &_icmph);
-   if (!icmph)
-   goto out;
-
-   switch (icmph->icmp6_type) {
-   case NDISC_ROUTER_SOLICITATION:
-   case NDISC_ROUTER_ADVERTISEMENT:
-   case NDISC_NEIGHBOUR_SOLICITATION:
-   case NDISC_NEIGHBOUR_ADVERTISEMENT:
-   case NDISC_REDIRECT:
-   rc = false;
-   break;
-   }
-   }
-
-out:
-   return rc;
-}
-#else
-static bool check_ipv6_frame(const struct sk_buff *skb)
-{
-   return false;
-}
-#endif
-
-static bool is_ip_rx_frame(struct sk_buff *skb)
-{
-   switch (skb->protocol) {
-   case htons(ETH_P_IP):
-   return true;
-   case htons(ETH_P_IPV6):
-   return check_ipv6_frame(skb);
-   }
-   return false;
-}
-
 static void vrf_tx_error(struct net_device *vrf_dev, struct sk_buff *skb)
 {
vrf_dev->stats.tx_errors++;
kfree_skb(skb);
 }
 
-/* note: already called with rcu_read_lock */
-static rx_handler_result_t vrf_handle_frame(struct sk_buff **pskb)
-{
-   struct sk_buff *skb = *pskb;
-
-   if (is_ip_rx_frame(skb)) {
-   struct net_device *dev = vrf_master_get_rcu(skb->dev);
-   struct pcpu_dstats *dstats = this_cpu_ptr(dev->dstats);
-
-   u64_stats_update_begin(&dstats->syncp);
-   dstats->rx_pkts++;
-   dstats->rx_bytes += skb->len;
-   u64_stats_update_end(&dstats->syncp);
-
-   skb->dev = dev;
-
-   return RX_HANDLER_ANOTHER;
-   }
-   return RX_HANDLER_PASS;
-}
-
 static struct rtnl_link_stats64 *vrf_get_stats64(struct net_device *dev,
                                                 struct rtnl_link_stats64 *stats)
 {
@@ -506,28 +425,14 @@ static int do_vrf_add_slave(struct net_device *dev, struct net_device *port_dev)
 {
int ret;
 
-   /* register the packet handler for slave ports */
-   ret = netdev_rx_handler_register(port_dev, vrf_handle_frame, dev);
-   if (ret) {
-   netdev_err(port_dev,
-  "Device %s failed to register rx_handler\n",
-  port_dev->name);
-   goto out_fail;
-   }
-
ret = netdev_master_upper_dev_link(port_dev, dev, 

[PATCH net-next v2 0/2] net: vrf: Fixup PKTINFO to return enslaved device index

2016-05-06 Thread David Ahern
Applications such as OSPF and BFD need the original ingress device not
the VRF device; the latter can be derived from the former. To that end
move the packet intercept from an rx handler that is invoked by
__netif_receive_skb_core to the ipv4 and ipv6 receive processing.

IPv6 already saves the skb_iif to the control buffer in ipv6_rcv. Since
the skb->dev has not been switched the cb has the enslaved device. Make
the same happen for IPv4 by adding the skb_iif to inet_skb_parm and set
it in ipv4 code after clearing the skb control buffer similar to IPv6.
From there the pktinfo can just pull it from cb with the PKTINFO_SKB_CB
cast.

David Ahern (2):
  net: l3mdev: Add hook in ip and ipv6
  net: original ingress device index in PKTINFO

 drivers/net/vrf.c | 188 ++
 include/linux/ipv6.h  |  17 -
 include/linux/netdevice.h |   2 +
 include/net/ip.h  |   1 +
 include/net/l3mdev.h  |  43 +++
 include/net/tcp.h |   4 +-
 net/core/dev.c|   3 +-
 net/ipv4/ip_input.c   |   8 ++
 net/ipv4/ip_sockglue.c|   7 +-
 net/ipv6/ip6_input.c  |   7 ++
 10 files changed, 178 insertions(+), 102 deletions(-)

-- 
2.1.4
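
For readers less familiar with the user-space side of this change, the following is a minimal sketch of how an application (for example a routing daemon) might read the ingress interface index from IP_PKTINFO. It is not part of the patch set. The function names ingress_ifindex() and recv_ingress_ifindex() are made up for illustration, and the socket is assumed to be an IPv4 datagram socket with IP_PKTINFO already enabled via setsockopt().

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Walk the control messages of a received datagram and return the
 * ingress interface index carried by IP_PKTINFO, or -1 if absent.
 * (Illustrative helper, not a kernel or libc API.)
 */
int ingress_ifindex(struct msghdr *msg)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo pi;

			/* copy out to avoid unaligned access into the cmsg buffer */
			memcpy(&pi, CMSG_DATA(cmsg), sizeof(pi));
			return pi.ipi_ifindex;
		}
	}
	return -1;
}

/* Receive one datagram on fd and report which interface it arrived on.
 * Assumes IP_PKTINFO was enabled on fd beforehand.
 */
int recv_ingress_ifindex(int fd, void *buf, size_t len)
{
	union {
		char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
		struct cmsghdr align;	/* forces suitable alignment */
	} ctl;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};

	if (recvmsg(fd, &msg, 0) < 0)
		return -1;
	return ingress_ifindex(&msg);
}
```

With the behaviour described in the cover letter, the index returned here would be that of the enslaved device rather than the VRF device; the application can still derive the VRF from it.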



[PATCH v2 net-next] ifb: support more features

2016-05-06 Thread Eric Dumazet
From: Eric Dumazet 

When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
GRO packets are not properly processed.

Segmentation should not be forced, since ifb is already adding
quite a performance hit.

Signed-off-by: Eric Dumazet 
---
 drivers/net/ifb.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index cc56fac3c3f8..66c0eeafcb5d 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -196,6 +196,7 @@ static const struct net_device_ops ifb_netdev_ops = {
 
 #define IFB_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST | \
  NETIF_F_TSO_ECN | NETIF_F_TSO | NETIF_F_TSO6  | \
+ NETIF_F_GSO_ENCAP_ALL | \
  NETIF_F_HIGHDMA | NETIF_F_HW_VLAN_CTAG_TX | \
  NETIF_F_HW_VLAN_STAG_TX)
 
@@ -224,6 +225,8 @@ static void ifb_setup(struct net_device *dev)
dev->tx_queue_len = TX_Q_LIMIT;
 
dev->features |= IFB_FEATURES;
+   dev->hw_features |= dev->features;
+   dev->hw_enc_features |= dev->features;
dev->vlan_features |= IFB_FEATURES & ~(NETIF_F_HW_VLAN_CTAG_TX |
   NETIF_F_HW_VLAN_STAG_TX);
 




Re: [PATCH v3 net-next 00/11] ipv6: Enable GUEoIPv6 and more fixes for v6 tunneling

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 3:11 PM, Tom Herbert wrote:
> This patch set:
>   - Fixes GRE6 to process translate flags correctly from configuration
>   - Adds support for GSO and GRO for ip6ip6 and ip4ip6
>   - Add support for FOU and GUE in IPv6
>   - Support GRE, ip6ip6 and ip4ip6 over FOU/GUE
>   - Fixes ip6_input to deal with UDP encapsulations
>   - Some other minor fixes
>
> v2:
>   - Removed a check of GSO types in MPLS
>   - Define GSO type SKB_GSO_IPXIP6 and SKB_GSO_IPXIP4 (based on input
> from Alexander)
>   - Don't define GSO types specifically for IP6IP6 and IP4IP6, above
> fix makes that unnecessary
>   - Don't bother clearing encapsulation flag in UDP tunnel segment
> (another item suggested by Alexander).
>
> v3:
>   - Address some minor comments from Alexander
>
> Tested:
>Tested a variety of cases, but not the full matrix (which is quite
>large now). Most of the obvious cases (e.g. GRE) work fine. Still
>some issues probably with GSO/GRO being effective in all cases.
>
> - IPv4/GRE/GUE/IPv6 with RCO
>   1 TCP_STREAM
> 6616 Mbps
>   200 TCP_RR
> 1244043 tps
> 141/243/446 90/95/99% latencies
> 86.61% CPU utilization
> - IPv6/GRE/GUE/IPv6 with RCO
>   1 TCP_STREAM
> 6940 Mbps
>   200 TCP_RR
> 1270903 tps
> 138/236/440 90/95/99% latencies
> 87.51% CPU utilization
>
>  - IP6IP6
>   1 TCP_STREAM
> 2576 Mbps
>   200 TCP_RR
> 498981 tps
> 388/498/631 90/95/99% latencies
> 19.75% CPU utilization (1 CPU saturated)
>
>  - IP6IP6/GUE/IPv6 with RCO
>   1 TCP_STREAM
> 1854 Mbps
>   200 TCP_RR
> 1233818 tps
> 143/244/451 90/95/99% latencies
> 87.57% CPU utilization
>
>  - IP4IP6
>   1 TCP_STREAM
>   200 TCP_RR
> 763774 tps
> 250/318/466 90/95/99% latencies
> 35.25% CPU utilization (1 CPU saturated)
>
>  - GRE with keyid
>   200 TCP_RR
> 744173 tps
> 258/332/461 90/95/99% latencies
> 34.59% CPU utilization (1 CPU saturated)

So I tried testing your patch set and it looks like I cannot get GRE
working for any netperf test.  If I pop the patches off it is even
worse since it looks like patch 3 fixes some tunnel flags issues, but
still doesn't resolve all the issues introduced with b05229f44228
("gre6: Cleanup GREv6 transmit path, call common GRE functions").
Reverting the entire patch seems to resolve the issues, but I will try
to pick it apart tonight to see if I can find the other issues that
weren't addressed in this patch series.

- Alex


Re: [PATCH net-next 0/2] net: vrf: Fixup PKTINFO to return enslaved device index

2016-05-06 Thread David Ahern

On 5/5/16 1:37 PM, David Ahern wrote:

Applications such as OSPF and BFD need the original ingress device not
the VRF device; the latter can be derived from the former. To that end
move the packet intercept from an rx handler that is invoked by
__netif_receive_skb_core to the ipv4 and ipv6 receive processing.

IPv6 already saves the skb_iif to the control buffer in ipv6_rcv. Since
the skb->dev has not been switched the cb has the enslaved device. Make
the same happen for IPv4 by adding the skb_iif to inet_skb_parm and set
it in ipv4 code after clearing the skb control buffer similar to IPv6.
From there the pktinfo can just pull it from cb with the PKTINFO_SKB_CB
cast.

David Ahern (2):
   net: l3mdev: Add hook in ip and ipv6
   net: original ingress device index in PKTINFO

  drivers/net/vrf.c | 186 ++
  include/linux/ipv6.h  |   3 +-
  include/linux/netdevice.h |   2 +
  include/net/ip.h  |   1 +
  include/net/l3mdev.h  |  43 +++
  include/net/tcp.h |   3 +-
  net/core/dev.c|   3 +-
  net/ipv4/ip_input.c   |   8 ++
  net/ipv4/ip_sockglue.c|   7 +-
  net/ipv6/ip6_input.c  |   7 ++
  10 files changed, 161 insertions(+), 102 deletions(-)



Dave: please ignore. I have an update that better encapsulates the 
inet6_iif and tcp iif change. Will send v2 shortly.


Re: [PATCH net-next] ifb: support more features

2016-05-06 Thread Eric Dumazet
On Fri, 2016-05-06 at 16:41 -0700, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
> GRO packets are not properly processed.
> 
> Segmentation should not be forced, as ifb is already adding
> quite a performance hit.

Please ignore, wrong version.

Will send a V2

Thanks.




Re: [PATCH net-next] net: make sch_handle_ingress() drop monitor ready

2016-05-06 Thread Alexei Starovoitov
On Fri, May 06, 2016 at 03:55:50PM -0700, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> TC_ACT_STOLEN is used when ingress traffic is mirred/redirected
> to say ifb.
> 
> Packet is not dropped, but consumed.
> 
> Only TC_ACT_SHOT is a clear indication something went wrong.
> 
> Signed-off-by: Eric Dumazet 
> Cc: Jamal Hadi Salim 

Acked-by: Alexei Starovoitov 



Re: [patch net-next] mlxsw: spectrum: Fix ordering in mlxsw_sp_fini

2016-05-06 Thread David Miller
From: Jiri Pirko 
Date: Fri,  6 May 2016 22:20:59 +0200

> From: Jiri Pirko 
> 
> Fixes: 0f433fa0ec ("mlxsw: spectrum_buffers: Implement shared buffer 
> configuration")
> Signed-off-by: Jiri Pirko 

Applied.


Re: [Intel-wired-lan] [PATCH] e1000e: prevent division by zero if TIMINCA is zero

2016-05-06 Thread Rustad, Mark D

Denys Vlasenko  wrote:


Users report that under VMWare, er32(TIMINCA) returns zero.
This causes division by zero at init time as follows:

 ==>incvalue = er32(TIMINCA) & E1000_TIMINCA_INCVALUE_MASK;
for (i = 0; i < E1000_MAX_82574_SYSTIM_REREADS; i++) {
/* latch SYSTIMH on read of SYSTIML */
systim_next = (cycle_t)er32(SYSTIML);
systim_next |= (cycle_t)er32(SYSTIMH) << 32;

time_delta = systim_next - systim;
temp = time_delta;
 >  rem = do_div(temp, incvalue);

This change makes kernel survive this, and users report that
NIC does work after this change.

Since on real hardware incvalue is never zero, this should not affect
real hardware use case.

Signed-off-by: Denys Vlasenko 
CC: Jeff Kirsher 
CC: "Ruinskiy, Dima" 
CC: intel-wired-...@lists.osuosl.org
CC: netdev@vger.kernel.org
CC: LKML 
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 269087c..0626935 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -4315,7 +4315,8 @@ static cycle_t e1000e_cyclecounter_read(const struct cyclecounter *cc)
time_delta = systim_next - systim;
temp = time_delta;
-   rem = do_div(temp, incvalue);
+   /* VMWare users have seen incvalue of zero, don't div / 0 */
+   rem = incvalue ? do_div(temp, incvalue) : (time_delta != 0);

systim = systim_next;



I seem to recall that this was rejected before because it really is
VMWare's bug and, if they fix it, any existing VMs that use this will just
work. Changing the driver will only fix it for VMs that install a new
driver. I don't object to doing it, it just seems like not the most
effective place to address the issue.


--
Mark Rustad, Networking Division, Intel Corporation




Re: [PATCH 1/2] net: phy: add ethtool_phy_{get|set}_link_ksettings

2016-05-06 Thread Florian Fainelli
On 06/05/16 16:18, Philippe Reynes wrote:
> The {get|set}_link_ksettings callbacks are often defined
> in nearly the same way. There are mainly two differences
> between those callbacks:
> - the name of the netdev private structure
> - the name of the struct phydev in the private structure
>
> We add two defines ethtool_phy_{get|set}_link_ksettings
> to avoid writing almost the same function several times.

This looks fine in principle, but then there is a whole ton of code that
could become like that in the kernel, I do not have any strong opinion
either way...
-- 
Florian


[PATCH net-next] ifb: support more features

2016-05-06 Thread Eric Dumazet
From: Eric Dumazet 

When using ifb+netem on ingress on SIT/IPIP/GRE traffic,
GRO packets are not properly processed.

Segmentation should not be forced, as ifb is already adding
quite a performance hit.

Signed-off-by: Eric Dumazet 
---
 drivers/net/ifb.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index cc56fac3c3f8..0cb5d8cbe679 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -197,7 +197,7 @@ static const struct net_device_ops ifb_netdev_ops = {
 #define IFB_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG  | NETIF_F_FRAGLIST | \
  NETIF_F_TSO_ECN | NETIF_F_TSO | NETIF_F_TSO6  | \
  NETIF_F_HIGHDMA | NETIF_F_HW_VLAN_CTAG_TX | \
- NETIF_F_HW_VLAN_STAG_TX)
+ NETIF_F_GSO_ENCAP_ALL | NETIF_F_HW_VLAN_STAG_TX)
 
 static void ifb_dev_free(struct net_device *dev)
 {
@@ -224,6 +224,7 @@ static void ifb_setup(struct net_device *dev)
dev->tx_queue_len = TX_Q_LIMIT;
 
dev->features |= IFB_FEATURES;
+   dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
dev->vlan_features |= IFB_FEATURES & ~(NETIF_F_HW_VLAN_CTAG_TX |
   NETIF_F_HW_VLAN_STAG_TX);
 




[PATCH 2/2] net: ethernet: fec: use ethtool_phy_{get|set}_link_ksettings

2016-05-06 Thread Philippe Reynes
Use the generic ethtool_phy_{get|set}_link_ksettings to
generate the callback for {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/freescale/fec_main.c |   25 ++---
 1 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c 
b/drivers/net/ethernet/freescale/fec_main.c
index bfa10c3..00339d0 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2058,29 +2058,8 @@ static void fec_enet_mii_remove(struct fec_enet_private 
*fep)
}
 }
 
-static int fec_enet_get_link_ksettings(struct net_device *ndev,
-  struct ethtool_link_ksettings *cmd)
-{
-   struct fec_enet_private *fep = netdev_priv(ndev);
-   struct phy_device *phydev = fep->phy_dev;
-
-   if (!phydev)
-   return -ENODEV;
-
-   return phy_ethtool_ksettings_get(phydev, cmd);
-}
-
-static int fec_enet_set_link_ksettings(struct net_device *ndev,
-  const struct ethtool_link_ksettings *cmd)
-{
-   struct fec_enet_private *fep = netdev_priv(ndev);
-   struct phy_device *phydev = fep->phy_dev;
-
-   if (!phydev)
-   return -ENODEV;
-
-   return phy_ethtool_ksettings_set(phydev, cmd);
-}
+ethtool_phy_get_link_ksettings(fec_enet, fec_enet_private, phy_dev);
+ethtool_phy_set_link_ksettings(fec_enet, fec_enet_private, phy_dev);
 
 static void fec_enet_get_drvinfo(struct net_device *ndev,
 struct ethtool_drvinfo *info)
-- 
1.7.4.4



[PATCH 1/2] net: phy: add ethtool_phy_{get|set}_link_ksettings

2016-05-06 Thread Philippe Reynes
The {get|set}_link_ksettings callbacks are often defined
in nearly the same way. There are mainly two differences
between those callbacks:
- the name of the netdev private structure
- the name of the struct phydev in the private structure

We add two defines ethtool_phy_{get|set}_link_ksettings
to avoid writing almost the same function several times.

Signed-off-by: Philippe Reynes 
---
 include/linux/phy.h |   46 ++
 1 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index be3f83b..e4a79fa 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -830,6 +830,52 @@ int phy_ethtool_set_wol(struct phy_device *phydev, struct 
ethtool_wolinfo *wol);
 void phy_ethtool_get_wol(struct phy_device *phydev,
 struct ethtool_wolinfo *wol);
 
+/**
+ * ethtool_phy_get_link_ksettings() - Helper macro for get_link_ksettings
+ * @name: name of the driver
+ * @private: name of the private structure in the net device
+ * @phy: name of the phydev variable in the private structure
+ *
+ * Helper macro for the callback get_link_ksettings which
+ * simply call phy_ethtool_ksettings_get.
+ */
+#define ethtool_phy_get_link_ksettings(name, private, phy) \
+   static int  \
+   name##_get_link_ksettings(struct net_device *ndev,  \
+ struct ethtool_link_ksettings *cmd)   \
+   {   \
+   struct private *priv = netdev_priv(ndev);   \
+   struct phy_device *phydev = priv->phy;  \
+   \
+   if (!phydev)\
+   return -ENODEV; \
+   \
+   return phy_ethtool_ksettings_get(phydev, cmd);  \
+   }
+
+/**
+ * ethtool_phy_set_link_ksettings() - Helper macro for set_link_ksettings
+ * @name: name of the driver
+ * @private: name of the private structure in the net device
+ * @phy: name of the phydev variable in the private structure
+ *
+ * Helper macro for the callback set_link_ksettings which
+ * simply call phy_ethtool_ksettings_set.
+ */
+#define ethtool_phy_set_link_ksettings(name, private, phy) \
+   static int  \
+   name##_set_link_ksettings(struct net_device *ndev,  \
+ const struct ethtool_link_ksettings *cmd) \
+   {   \
+   struct private *priv = netdev_priv(ndev);   \
+   struct phy_device *phydev = priv->phy;  \
+   \
+   if (!phydev)\
+   return -ENODEV; \
+   \
+   return phy_ethtool_ksettings_set(phydev, cmd);  \
+   }
+
 int __init mdio_bus_init(void);
 void mdio_bus_exit(void);
 
-- 
1.7.4.4
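
The token-pasting technique used by these macros can be tried outside the kernel. The sketch below is a simplified user-space analogue, not the proposed kernel code: struct net_device, struct phy_device, struct link_settings, netdev_priv() and phy_get_settings() are cut-down, hypothetical stand-ins, and DEFINE_GET_LINK_SETTINGS is an illustrative name.

```c
#include <errno.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-ins for the kernel types. */
struct phy_device { int speed; };
struct net_device { void *priv; };
struct link_settings { int speed; };

static void *netdev_priv(struct net_device *ndev)
{
	return ndev->priv;
}

/* Stand-in for phy_ethtool_ksettings_get(): copy settings out of the PHY. */
static int phy_get_settings(struct phy_device *phydev,
			    struct link_settings *cmd)
{
	cmd->speed = phydev->speed;
	return 0;
}

/* Generates <name>_get_link_settings() for a driver whose private
 * structure is "struct <private>" and whose PHY pointer member is <phy>.
 */
#define DEFINE_GET_LINK_SETTINGS(name, private, phy)			\
	static int name##_get_link_settings(struct net_device *ndev,	\
					    struct link_settings *cmd)	\
	{								\
		struct private *priv = netdev_priv(ndev);		\
		struct phy_device *phydev = priv->phy;			\
									\
		if (!phydev)						\
			return -ENODEV;					\
									\
		return phy_get_settings(phydev, cmd);			\
	}

/* Example driver: private struct "demo_private", PHY member "phy_dev". */
struct demo_private { struct phy_device *phy_dev; };

DEFINE_GET_LINK_SETTINGS(demo, demo_private, phy_dev)
```

The expansion above defines demo_get_link_settings(). One design point worth noting: the sketch invokes the macro without a trailing semicolon, since the expansion is a complete function definition and an extra ';' at file scope is flagged by pedantic compilers.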



[PATCH net-next 3/7] Driver: Vmxnet3: Allow variable length Transmit Data ring buffer

2016-05-06 Thread Shrikrishna Khare
Signed-off-by: Sriram Rangarajan 
Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/vmxnet3_defs.h| 12 +++-
 drivers/net/vmxnet3/vmxnet3_drv.c | 55 ++-
 drivers/net/vmxnet3/vmxnet3_ethtool.c |  9 +++---
 drivers/net/vmxnet3/vmxnet3_int.h |  7 -
 4 files changed, 64 insertions(+), 19 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_defs.h 
b/drivers/net/vmxnet3/vmxnet3_defs.h
index a26a69d..701d989 100644
--- a/drivers/net/vmxnet3/vmxnet3_defs.h
+++ b/drivers/net/vmxnet3/vmxnet3_defs.h
@@ -92,6 +92,7 @@ enum {
VMXNET3_CMD_GET_DEV_EXTRA_INFO,
VMXNET3_CMD_GET_CONF_INTR,
VMXNET3_CMD_GET_RESERVED1,
+   VMXNET3_CMD_GET_TXDATA_DESC_SIZE
 };
 
 /*
@@ -377,6 +378,10 @@ union Vmxnet3_GenericDesc {
 #define VMXNET3_RING_SIZE_ALIGN 32
 #define VMXNET3_RING_SIZE_MASK  (VMXNET3_RING_SIZE_ALIGN - 1)
 
+/* Tx Data Ring buffer size must be a multiple of 64 */
+#define VMXNET3_TXDATA_DESC_SIZE_ALIGN 64
+#define VMXNET3_TXDATA_DESC_SIZE_MASK  (VMXNET3_TXDATA_DESC_SIZE_ALIGN - 1)
+
 /* Max ring size */
 #define VMXNET3_TX_RING_MAX_SIZE   4096
 #define VMXNET3_TC_RING_MAX_SIZE   4096
@@ -384,6 +389,9 @@ union Vmxnet3_GenericDesc {
 #define VMXNET3_RX_RING2_MAX_SIZE  4096
 #define VMXNET3_RC_RING_MAX_SIZE   8192
 
+#define VMXNET3_TXDATA_DESC_MIN_SIZE 128
+#define VMXNET3_TXDATA_DESC_MAX_SIZE 2048
+
 /* a list of reasons for queue stop */
 
 enum {
@@ -470,7 +478,9 @@ struct Vmxnet3_TxQueueConf {
__le32  compRingSize; /* # of comp desc */
__le32  ddLen;/* size of driver data */
u8  intrIdx;
-   u8  _pad[7];
+   u8  _pad1[1];
+   __le16  txDataRingDescSize;
+   u8  _pad2[4];
 };
 
 
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c 
b/drivers/net/vmxnet3/vmxnet3_drv.c
index 53434ed..58632b1 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -435,8 +435,8 @@ vmxnet3_tq_destroy(struct vmxnet3_tx_queue *tq,
tq->tx_ring.base = NULL;
}
if (tq->data_ring.base) {
-   dma_free_coherent(&adapter->pdev->dev, tq->data_ring.size *
- sizeof(struct Vmxnet3_TxDataDesc),
+   dma_free_coherent(&adapter->pdev->dev,
+ tq->data_ring.size * tq->txdata_desc_size,
  tq->data_ring.base, tq->data_ring.basePA);
tq->data_ring.base = NULL;
}
@@ -478,8 +478,8 @@ vmxnet3_tq_init(struct vmxnet3_tx_queue *tq,
tq->tx_ring.next2fill = tq->tx_ring.next2comp = 0;
tq->tx_ring.gen = VMXNET3_INIT_GEN;
 
-   memset(tq->data_ring.base, 0, tq->data_ring.size *
-  sizeof(struct Vmxnet3_TxDataDesc));
+   memset(tq->data_ring.base, 0,
+  tq->data_ring.size * tq->txdata_desc_size);
 
/* reset the tx comp ring contents to 0 and reset comp ring states */
memset(tq->comp_ring.base, 0, tq->comp_ring.size *
@@ -514,10 +514,10 @@ vmxnet3_tq_create(struct vmxnet3_tx_queue *tq,
}
 
tq->data_ring.base = dma_alloc_coherent(&adapter->pdev->dev,
-   tq->data_ring.size * sizeof(struct Vmxnet3_TxDataDesc),
+   tq->data_ring.size * tq->txdata_desc_size,
&tq->data_ring.basePA, GFP_KERNEL);
if (!tq->data_ring.base) {
-   netdev_err(adapter->netdev, "failed to allocate data ring\n");
+   netdev_err(adapter->netdev, "failed to allocate tx data ring\n");
goto err;
}
 
@@ -689,7 +689,7 @@ vmxnet3_map_pkt(struct sk_buff *skb, struct vmxnet3_tx_ctx 
*ctx,
if (ctx->copy_size) {
ctx->sop_txd->txd.addr = cpu_to_le64(tq->data_ring.basePA +
tq->tx_ring.next2fill *
-   sizeof(struct Vmxnet3_TxDataDesc));
+   tq->txdata_desc_size);
ctx->sop_txd->dword[2] = cpu_to_le32(dw2 | ctx->copy_size);
ctx->sop_txd->dword[3] = 0;
 
@@ -873,8 +873,9 @@ vmxnet3_parse_hdr(struct sk_buff *skb, struct 
vmxnet3_tx_queue *tq,
ctx->eth_ip_hdr_size = 0;
ctx->l4_hdr_size = 0;
/* copy as much as allowed */
-   ctx->copy_size = min((unsigned int)VMXNET3_HDR_COPY_SIZE
-, skb_headlen(skb));
+   ctx->copy_size = min_t(unsigned int,
+  tq->txdata_desc_size,
+  skb_headlen(skb));
}
 
if (skb->len <= VMXNET3_HDR_COPY_SIZE)
@@ -885,7 +886,7 @@ vmxnet3_parse_hdr(struct sk_buff *skb, struct 
vmxnet3_tx_queue *tq,
goto err;
}
 
-   if 
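
The size constraints introduced by this patch (a multiple of 64, between 128 and 2048 bytes) and the per-descriptor address arithmetic can be sketched in user space as follows. The constants are copied from the patch with the VMXNET3_ prefix dropped; the helper names txdata_desc_size_valid() and txdata_desc_addr() are hypothetical and do not appear in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the vmxnet3_defs.h hunk of this patch. */
#define TXDATA_DESC_SIZE_ALIGN	64
#define TXDATA_DESC_SIZE_MASK	(TXDATA_DESC_SIZE_ALIGN - 1)
#define TXDATA_DESC_MIN_SIZE	128
#define TXDATA_DESC_MAX_SIZE	2048

/* A descriptor size is acceptable when it lies within [MIN, MAX]
 * and is a multiple of the alignment.
 */
bool txdata_desc_size_valid(uint32_t size)
{
	return size >= TXDATA_DESC_MIN_SIZE &&
	       size <= TXDATA_DESC_MAX_SIZE &&
	       (size & TXDATA_DESC_SIZE_MASK) == 0;
}

/* Address of the idx-th data-ring slot once the slot size is variable,
 * analogous to basePA + next2fill * txdata_desc_size in vmxnet3_map_pkt().
 */
uint64_t txdata_desc_addr(uint64_t base_pa, uint32_t idx, uint32_t desc_size)
{
	return base_pa + (uint64_t)idx * desc_size;
}
```

The mask test works because the alignment is a power of two. How the driver reacts to an invalid size reported by the device is not visible in the truncated diff above, so the sketch only classifies sizes.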

[PATCH net-next 4/7] Driver: Vmxnet3: Add Receive Data Ring support

2016-05-06 Thread Shrikrishna Khare
Receive Data Ring buffer length is configurable via ethtool -G ethX rx-mini

Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/vmxnet3_defs.h|  14 +++-
 drivers/net/vmxnet3/vmxnet3_drv.c | 153 +++---
 drivers/net/vmxnet3/vmxnet3_ethtool.c |  48 ---
 drivers/net/vmxnet3/vmxnet3_int.h |  23 -
 4 files changed, 193 insertions(+), 45 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_defs.h 
b/drivers/net/vmxnet3/vmxnet3_defs.h
index 701d989..f3b31c2 100644
--- a/drivers/net/vmxnet3/vmxnet3_defs.h
+++ b/drivers/net/vmxnet3/vmxnet3_defs.h
@@ -174,6 +174,8 @@ struct Vmxnet3_TxDataDesc {
u8  data[VMXNET3_HDR_COPY_SIZE];
 };
 
+typedef u8 Vmxnet3_RxDataDesc;
+
 #define VMXNET3_TCD_GEN_SHIFT  31
 #define VMXNET3_TCD_GEN_SIZE   1
 #define VMXNET3_TCD_TXIDX_SHIFT0
@@ -382,6 +384,10 @@ union Vmxnet3_GenericDesc {
 #define VMXNET3_TXDATA_DESC_SIZE_ALIGN 64
 #define VMXNET3_TXDATA_DESC_SIZE_MASK  (VMXNET3_TXDATA_DESC_SIZE_ALIGN - 1)
 
+/* Rx Data Ring buffer size must be a multiple of 64 */
+#define VMXNET3_RXDATA_DESC_SIZE_ALIGN 64
+#define VMXNET3_RXDATA_DESC_SIZE_MASK  (VMXNET3_RXDATA_DESC_SIZE_ALIGN - 1)
+
 /* Max ring size */
 #define VMXNET3_TX_RING_MAX_SIZE   4096
 #define VMXNET3_TC_RING_MAX_SIZE   4096
@@ -392,6 +398,8 @@ union Vmxnet3_GenericDesc {
 #define VMXNET3_TXDATA_DESC_MIN_SIZE 128
 #define VMXNET3_TXDATA_DESC_MAX_SIZE 2048
 
+#define VMXNET3_RXDATA_DESC_MAX_SIZE 2048
+
 /* a list of reasons for queue stop */
 
 enum {
@@ -488,12 +496,14 @@ struct Vmxnet3_RxQueueConf {
__le64  rxRingBasePA[2];
__le64  compRingBasePA;
__le64  ddPA;/* driver data */
-   __le64  reserved;
+   __le64  rxDataRingBasePA;
__le32  rxRingSize[2];   /* # of rx desc */
__le32  compRingSize;/* # of rx comp desc */
__le32  ddLen;   /* size of driver data */
u8  intrIdx;
-   u8  _pad[7];
+   u8  _pad1[1];
+   __le16  rxDataRingDescSize;  /* size of rx data ring buffer */
+   u8  _pad2[4];
 };
 
 
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c 
b/drivers/net/vmxnet3/vmxnet3_drv.c
index 58632b1..a6dc7c7 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1284,9 +1284,10 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
 */
break;
}
-   BUG_ON(rcd->rqID != rq->qid && rcd->rqID != rq->qid2);
+   BUG_ON(rcd->rqID != rq->qid && rcd->rqID != rq->qid2 &&
+  rcd->rqID != rq->dataRingQid);
idx = rcd->rxdIdx;
-   ring_idx = rcd->rqID < adapter->num_rx_queues ? 0 : 1;
+   ring_idx = VMXNET3_GET_RING_IDX(adapter, rcd->rqID);
ring = rq->rx_ring + ring_idx;
vmxnet3_getRxDesc(rxd, &rq->rx_ring[ring_idx].base[idx].rxd,
  &rxCmdDesc);
@@ -1301,8 +1302,12 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
}
 
if (rcd->sop) { /* first buf of the pkt */
+   bool rxDataRingUsed;
+   u16 len;
+
BUG_ON(rxd->btype != VMXNET3_RXD_BTYPE_HEAD ||
-  rcd->rqID != rq->qid);
+  (rcd->rqID != rq->qid &&
+   rcd->rqID != rq->dataRingQid));
 
BUG_ON(rbi->buf_type != VMXNET3_RX_BUF_SKB);
BUG_ON(ctx->skb != NULL || rbi->skb == NULL);
@@ -1318,8 +1323,12 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
 
skip_page_frags = false;
ctx->skb = rbi->skb;
+
+   rxDataRingUsed =
+   VMXNET3_RX_DATA_RING(adapter, rcd->rqID);
+   len = rxDataRingUsed ? rcd->len : rbi->len;
new_skb = netdev_alloc_skb_ip_align(adapter->netdev,
-   rbi->len);
+   len);
if (new_skb == NULL) {
/* Skb allocation failed, do not handover this
 * skb to stack. Reuse it. Drop the existing pkt
@@ -1330,25 +1339,48 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
skip_page_frags = true;
goto rcd_done;
}
-   new_dma_addr = dma_map_single(&adapter->pdev->dev,
- new_skb->data, rbi->len,
- PCI_DMA_FROMDEVICE);
-   if (dma_mapping_error(&adapter->pdev->dev,
-

[PATCH net-next 0/7] Driver: Vmxnet3: Version 3

2016-05-06 Thread Shrikrishna Khare
This patchset upgrades Vmxnet3 to Version 3.


Shrikrishna Khare (7):
  Driver: Vmxnet3: Prepare for version 3 changes
  Driver: Vmxnet3: Introduce generic command interface to configure the
device
  Driver: Vmxnet3: Allow variable length Transmit Data ring buffer
  Driver: Vmxnet3: Add Receive Data Ring support
  Driver: Vmxnet3: Add support for get_coalesce, set_coalesce ethtool
operations
  Driver: Vmxnet3: Introduce command to register memory region
  Driver: Vmxnet3: Update to Version 3

 drivers/net/vmxnet3/Makefile  |   4 +-
 drivers/net/vmxnet3/upt1_defs.h   |   4 +-
 drivers/net/vmxnet3/vmxnet3_defs.h| 106 -
 drivers/net/vmxnet3/vmxnet3_drv.c | 278 +++---
 drivers/net/vmxnet3/vmxnet3_ethtool.c | 192 +--
 drivers/net/vmxnet3/vmxnet3_int.h |  50 +-
 6 files changed, 551 insertions(+), 83 deletions(-)

-- 
1.9.1



[PATCH net-next 2/7] Driver: Vmxnet3: Introduce generic command interface to configure the device

2016-05-06 Thread Shrikrishna Khare
Signed-off-by: Guolin Yang 
Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/vmxnet3_defs.h | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_defs.h 
b/drivers/net/vmxnet3/vmxnet3_defs.h
index 8345e0c..a26a69d 100644
--- a/drivers/net/vmxnet3/vmxnet3_defs.h
+++ b/drivers/net/vmxnet3/vmxnet3_defs.h
@@ -79,6 +79,7 @@ enum {
VMXNET3_CMD_RESERVED1,
VMXNET3_CMD_LOAD_PLUGIN,
VMXNET3_CMD_RESERVED2,
+   VMXNET3_CMD_RESERVED3,
 
VMXNET3_CMD_FIRST_GET = 0xF00D,
VMXNET3_CMD_GET_QUEUE_STATUS = VMXNET3_CMD_FIRST_GET,
@@ -612,6 +613,18 @@ struct Vmxnet3_RxQueueDesc {
u8__pad[88]; /* 128 aligned */
 };
 
+struct Vmxnet3_SetPolling {
+   u8  enablePolling;
+};
+
+/* If the command data <= 16 bytes, use the shared memory directly.
+ * otherwise, use variable length configuration descriptor.
+ */
+union Vmxnet3_CmdInfo {
+   struct Vmxnet3_VariableLenConfDesc  varConf;
+   struct Vmxnet3_SetPolling   setPolling;
+   __le64  data[2];
+};
 
 struct Vmxnet3_DSDevRead {
/* read-only region for device, read by dev in response to a SET cmd */
@@ -630,7 +643,14 @@ struct Vmxnet3_DriverShared {
__le32  pad;
struct Vmxnet3_DSDevReaddevRead;
__le32  ecr;
-   __le32  reserved[5];
+   __le32  reserved;
+   union {
+   __le32  reserved1[4];
+   union Vmxnet3_CmdInfo   cmdInfo; /* only valid in the context of
+ * executing the relevant
+ * command
+ */
+   } cu;
 };
 
 
-- 
1.9.1
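
The comment in the patch says that command data of up to 16 bytes travels inline in the shared area and anything larger is described by a variable-length configuration descriptor. A user-space sketch of that layout follows. The struct and function names are hypothetical, endianness conversion is omitted, and the three-field descriptor layout (version, length, physical address) is an assumption inferred from the confVer/confLen/confPA fields used elsewhere in this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the variable-length configuration descriptor. */
struct var_len_conf_desc {
	uint32_t conf_ver;
	uint32_t conf_len;
	uint64_t conf_pa;
};

struct set_polling {
	uint8_t enable_polling;
};

/* All command payloads overlay the same 16-byte inline area. */
union cmd_info {
	struct var_len_conf_desc var_conf;
	struct set_polling set_polling;
	uint64_t data[2];
};

_Static_assert(sizeof(union cmd_info) == 16,
	       "inline command area is expected to be 16 bytes");

/* True when a payload of len bytes can be placed inline. */
bool cmd_fits_inline(size_t len)
{
	return len <= sizeof(((union cmd_info *)0)->data);
}

/* For larger payloads, describe an external buffer instead. */
void cmd_describe_buffer(union cmd_info *ci, uint64_t pa, uint32_t len)
{
	ci->var_conf.conf_ver = 1;
	ci->var_conf.conf_len = len;
	ci->var_conf.conf_pa = pa;
}
```

Because the union members alias each other, the contents are only meaningful for the command being executed, which matches the "only valid in the context of executing the relevant command" remark in the patch.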



[PATCH net-next 5/7] Driver: Vmxnet3: Add support for get_coalesce, set_coalesce ethtool operations

2016-05-06 Thread Shrikrishna Khare
Signed-off-by: Keyong Sun 
Signed-off-by: Manoj Tammali 
Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/vmxnet3_defs.h|  32 +++-
 drivers/net/vmxnet3/vmxnet3_drv.c |  47 
 drivers/net/vmxnet3/vmxnet3_ethtool.c | 135 ++
 drivers/net/vmxnet3/vmxnet3_int.h |   5 ++
 4 files changed, 218 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_defs.h 
b/drivers/net/vmxnet3/vmxnet3_defs.h
index f3b31c2..a0fdaac 100644
--- a/drivers/net/vmxnet3/vmxnet3_defs.h
+++ b/drivers/net/vmxnet3/vmxnet3_defs.h
@@ -80,6 +80,7 @@ enum {
VMXNET3_CMD_LOAD_PLUGIN,
VMXNET3_CMD_RESERVED2,
VMXNET3_CMD_RESERVED3,
+   VMXNET3_CMD_SET_COALESCE,
 
VMXNET3_CMD_FIRST_GET = 0xF00D,
VMXNET3_CMD_GET_QUEUE_STATUS = VMXNET3_CMD_FIRST_GET,
@@ -637,6 +638,36 @@ struct Vmxnet3_SetPolling {
u8  enablePolling;
 };
 
+#define VMXNET3_COAL_STATIC_MAX_DEPTH  128
+#define VMXNET3_COAL_RBC_MIN_RATE  100
+#define VMXNET3_COAL_RBC_MAX_RATE  100000
+
+enum Vmxnet3_CoalesceMode {
+   VMXNET3_COALESCE_DEFAULT= 0,
+   VMXNET3_COALESCE_DISABLED   = 1,
+   VMXNET3_COALESCE_ADAPT  = 2,
+   VMXNET3_COALESCE_STATIC = 3,
+   VMXNET3_COALESCE_RBC= 4
+};
+
+struct Vmxnet3_CoalesceRbc {
+   u32 rbc_rate;
+};
+
+struct Vmxnet3_CoalesceStatic {
+   u32 tx_depth;
+   u32 tx_comp_depth;
+   u32 rx_depth;
+};
+
+struct Vmxnet3_CoalesceScheme {
+   enum Vmxnet3_CoalesceMode   coalMode;
+   union {
+   struct Vmxnet3_CoalesceRbc  coalRbc;
+   struct Vmxnet3_CoalesceStatic   coalStatic;
+   } coalPara;
+};
+
 /* If the command data <= 16 bytes, use the shared memory directly.
  * otherwise, use variable length configuration descriptor.
  */
@@ -673,7 +704,6 @@ struct Vmxnet3_DriverShared {
} cu;
 };
 
-
 #define VMXNET3_ECR_RQERR   (1 << 0)
 #define VMXNET3_ECR_TQERR   (1 << 1)
 #define VMXNET3_ECR_LINK(1 << 2)
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c 
b/drivers/net/vmxnet3/vmxnet3_drv.c
index a6dc7c7..fe1c6ad 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -2491,6 +2491,26 @@ vmxnet3_setup_driver_shared(struct vmxnet3_adapter 
*adapter)
/* the rest are already zeroed */
 }
 
+static void
+vmxnet3_init_coalesce(struct vmxnet3_adapter *adapter)
+{
+   struct Vmxnet3_DriverShared *shared = adapter->shared;
+   union Vmxnet3_CmdInfo *cmdInfo = &shared->cu.cmdInfo;
+   unsigned long flags;
+
+   if (!VMXNET3_VERSION_GE_3(adapter) ||
+   adapter->coal_conf->coalMode == VMXNET3_COALESCE_DEFAULT) {
+   return;
+   }
+
+   spin_lock_irqsave(&adapter->cmd_lock, flags);
+   cmdInfo->varConf.confVer = 1;
+   cmdInfo->varConf.confLen = cpu_to_le32(sizeof(*adapter->coal_conf));
+   cmdInfo->varConf.confPA  = cpu_to_le64(adapter->coal_conf_pa);
+   VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD,
+  VMXNET3_CMD_SET_COALESCE);
+   spin_unlock_irqrestore(&adapter->cmd_lock, flags);
+}
 
 int
 vmxnet3_activate_dev(struct vmxnet3_adapter *adapter)
@@ -2540,6 +2560,8 @@ vmxnet3_activate_dev(struct vmxnet3_adapter *adapter)
goto activate_err;
}
 
+   vmxnet3_init_coalesce(adapter);
+
for (i = 0; i < adapter->num_rx_queues; i++) {
VMXNET3_WRITE_BAR0_REG(adapter,
VMXNET3_REG_RXPROD + i * VMXNET3_REG_ALIGN,
@@ -3345,6 +3367,21 @@ vmxnet3_probe_device(struct pci_dev *pdev,
goto err_ver;
}
 
+   if (VMXNET3_VERSION_GE_3(adapter)) {
+   adapter->coal_conf =
+   dma_alloc_coherent(&adapter->pdev->dev,
+  sizeof(struct Vmxnet3_CoalesceScheme)
+  ,
+  &adapter->coal_conf_pa,
+  GFP_KERNEL);
+   if (!adapter->coal_conf) {
+   err = -ENOMEM;
+   goto err_ver;
+   }
+   memset(adapter->coal_conf, 0, sizeof(*adapter->coal_conf));
+   adapter->coal_conf->coalMode = VMXNET3_COALESCE_DEFAULT;
+   }
+
SET_NETDEV_DEV(netdev, &pdev->dev);
vmxnet3_declare_features(adapter, dma64);
 
@@ -3407,6 +3444,11 @@ vmxnet3_probe_device(struct pci_dev *pdev,
return 0;
 
 err_register:
+   if (VMXNET3_VERSION_GE_3(adapter)) {
+   dma_free_coherent(&adapter->pdev->dev,
+ sizeof(struct Vmxnet3_CoalesceScheme),
+ 

[PATCH net-next 7/7] Driver: Vmxnet3: Update to Version 3

2016-05-06 Thread Shrikrishna Khare
Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/vmxnet3_drv.c | 7 ++-
 drivers/net/vmxnet3/vmxnet3_int.h | 4 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c 
b/drivers/net/vmxnet3/vmxnet3_drv.c
index fe1c6ad..5f98fb2 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -3339,7 +3339,12 @@ vmxnet3_probe_device(struct pci_dev *pdev,
goto err_alloc_pci;
 
ver = VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_VRRS);
-   if (ver & (1 << VMXNET3_REV_2)) {
+   if (ver & (1 << VMXNET3_REV_3)) {
+   VMXNET3_WRITE_BAR1_REG(adapter,
+  VMXNET3_REG_VRRS,
+  1 << VMXNET3_REV_3);
+   adapter->version = VMXNET3_REV_3 + 1;
+   } else if (ver & (1 << VMXNET3_REV_2)) {
VMXNET3_WRITE_BAR1_REG(adapter,
   VMXNET3_REG_VRRS,
   1 << VMXNET3_REV_2);
diff --git a/drivers/net/vmxnet3/vmxnet3_int.h 
b/drivers/net/vmxnet3/vmxnet3_int.h
index 8cd1851..cc67837 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -69,10 +69,10 @@
 /*
  * Version numbers
  */
-#define VMXNET3_DRIVER_VERSION_STRING   "1.4.7.0-k"
+#define VMXNET3_DRIVER_VERSION_STRING   "1.4.8.0-k"
 
 /* a 32-bit int, each byte encode a verion number in VMXNET3_DRIVER_VERSION */
-#define VMXNET3_DRIVER_VERSION_NUM  0x01040700
+#define VMXNET3_DRIVER_VERSION_NUM  0x01040800
 
 #if defined(CONFIG_PCI_MSI)
/* RSS only makes sense if MSI-X is supported. */
-- 
1.9.1
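
The version selection in the probe hunk above follows a "highest supported revision wins" pattern: the device reports a bitmap of revisions, and the driver acknowledges the newest one it understands. A user-space sketch follows. The bit positions REV_1 = 0, REV_2 = 1 and REV_3 = 2 are assumptions (the enum definition is not shown in this message), and pick_version() is a hypothetical helper rather than driver code.

```c
#include <stdint.h>

/* Assumed bit positions of the revisions in the VRRS bitmap. */
enum { REV_1 = 0, REV_2 = 1, REV_3 = 2 };

/* Pick the highest revision present in the bitmap read from the device.
 * On success, *ack holds the single-bit value to write back and the
 * return value is the 1-based version number (REV + 1, as in the patch).
 * Returns -1 and clears *ack if no known revision is offered.
 */
int pick_version(uint32_t vrrs, uint32_t *ack)
{
	int rev;

	for (rev = REV_3; rev >= REV_1; rev--) {
		if (vrrs & (1u << rev)) {
			*ack = 1u << rev;
			return rev + 1;
		}
	}
	*ack = 0;
	return -1;
}
```

A loop is used here instead of the if/else chain from the patch only to keep the sketch short; the selection order is the same, newest first.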



[PATCH net-next 1/7] Driver: Vmxnet3: Prepare for version 3 changes

2016-05-06 Thread Shrikrishna Khare
Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/Makefile  |  4 ++--
 drivers/net/vmxnet3/upt1_defs.h   |  4 ++--
 drivers/net/vmxnet3/vmxnet3_defs.h|  9 ++---
 drivers/net/vmxnet3/vmxnet3_drv.c | 22 +-
 drivers/net/vmxnet3/vmxnet3_ethtool.c |  4 ++--
 drivers/net/vmxnet3/vmxnet3_int.h | 13 +++--
 6 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/net/vmxnet3/Makefile b/drivers/net/vmxnet3/Makefile
index 880f509..8cdbb63 100644
--- a/drivers/net/vmxnet3/Makefile
+++ b/drivers/net/vmxnet3/Makefile
@@ -2,7 +2,7 @@
 #
 # Linux driver for VMware's vmxnet3 ethernet NIC.
 #
-# Copyright (C) 2007-2009, VMware, Inc. All Rights Reserved.
+# Copyright (C) 2007-2016, VMware, Inc. All Rights Reserved.
 #
 # This program is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by the
@@ -21,7 +21,7 @@
 # The full GNU General Public License is included in this distribution in
 # the file called "COPYING".
 #
-# Maintained by: Shreyas Bhatewara 
+# Maintained by: pv-driv...@vmware.com
 #
 #
 

diff --git a/drivers/net/vmxnet3/upt1_defs.h b/drivers/net/vmxnet3/upt1_defs.h
index 969c751..db9f1fd 100644
--- a/drivers/net/vmxnet3/upt1_defs.h
+++ b/drivers/net/vmxnet3/upt1_defs.h
@@ -1,7 +1,7 @@
 /*
  * Linux driver for VMware's vmxnet3 ethernet NIC.
  *
- * Copyright (C) 2008-2009, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2008-2016, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -20,7 +20,7 @@
  * The full GNU General Public License is included in this distribution in
  * the file called "COPYING".
  *
- * Maintained by: Shreyas Bhatewara 
+ * Maintained by: pv-driv...@vmware.com
  *
  */
 
diff --git a/drivers/net/vmxnet3/vmxnet3_defs.h 
b/drivers/net/vmxnet3/vmxnet3_defs.h
index 72ba8ae..8345e0c 100644
--- a/drivers/net/vmxnet3/vmxnet3_defs.h
+++ b/drivers/net/vmxnet3/vmxnet3_defs.h
@@ -1,7 +1,7 @@
 /*
  * Linux driver for VMware's vmxnet3 ethernet NIC.
  *
- * Copyright (C) 2008-2015, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2008-2016, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -20,7 +20,7 @@
  * The full GNU General Public License is included in this distribution in
  * the file called "COPYING".
  *
- * Maintained by: Shreyas Bhatewara 
+ * Maintained by: pv-driv...@vmware.com
  *
  */
 
@@ -76,7 +76,9 @@ enum {
VMXNET3_CMD_UPDATE_IML,
VMXNET3_CMD_UPDATE_PMCFG,
VMXNET3_CMD_UPDATE_FEATURE,
+   VMXNET3_CMD_RESERVED1,
VMXNET3_CMD_LOAD_PLUGIN,
+   VMXNET3_CMD_RESERVED2,
 
VMXNET3_CMD_FIRST_GET = 0xF00D,
VMXNET3_CMD_GET_QUEUE_STATUS = VMXNET3_CMD_FIRST_GET,
@@ -87,7 +89,8 @@ enum {
VMXNET3_CMD_GET_DID_LO,
VMXNET3_CMD_GET_DID_HI,
VMXNET3_CMD_GET_DEV_EXTRA_INFO,
-   VMXNET3_CMD_GET_CONF_INTR
+   VMXNET3_CMD_GET_CONF_INTR,
+   VMXNET3_CMD_GET_RESERVED1,
 };
 
 /*
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c 
b/drivers/net/vmxnet3/vmxnet3_drv.c
index db8022a..53434ed 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1,7 +1,7 @@
 /*
  * Linux driver for VMware's vmxnet3 ethernet NIC.
  *
- * Copyright (C) 2008-2009, VMware, Inc. All Rights Reserved.
+ * Copyright (C) 2008-2016, VMware, Inc. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -20,7 +20,7 @@
  * The full GNU General Public License is included in this distribution in
  * the file called "COPYING".
  *
- * Maintained by: Shreyas Bhatewara 
+ * Maintained by: pv-driv...@vmware.com
  *
  */
 
@@ -1363,7 +1363,7 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
rbi->dma_addr = new_dma_addr;
rxd->addr = cpu_to_le64(rbi->dma_addr);
rxd->len = rbi->len;
-   if (adapter->version == 2 &&
+   if (VMXNET3_VERSION_GE_2(adapter) &&
rcd->type == VMXNET3_CDTYPE_RXCOMP_LRO) {
struct Vmxnet3_RxCompDescExt *rcdlro;
rcdlro = (struct Vmxnet3_RxCompDescExt *)rcd;
@@ -3200,12 +3200,16 @@ vmxnet3_probe_device(struct pci_dev *pdev,
goto err_alloc_pci;
 
ver = VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_VRRS);
-   if (ver & 2) {
-   

[PATCH net-next 6/7] Driver: Vmxnet3: Introduce command to register memory region

2016-05-06 Thread Shrikrishna Khare
Signed-off-by: Guolin Yang 
Signed-off-by: Shrikrishna Khare 
---
 drivers/net/vmxnet3/vmxnet3_defs.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/vmxnet3/vmxnet3_defs.h 
b/drivers/net/vmxnet3/vmxnet3_defs.h
index a0fdaac..a336b25 100644
--- a/drivers/net/vmxnet3/vmxnet3_defs.h
+++ b/drivers/net/vmxnet3/vmxnet3_defs.h
@@ -81,6 +81,7 @@ enum {
VMXNET3_CMD_RESERVED2,
VMXNET3_CMD_RESERVED3,
VMXNET3_CMD_SET_COALESCE,
+   VMXNET3_CMD_REGISTER_MEMREGS,
 
VMXNET3_CMD_FIRST_GET = 0xF00D,
VMXNET3_CMD_GET_QUEUE_STATUS = VMXNET3_CMD_FIRST_GET,
@@ -668,6 +669,22 @@ struct Vmxnet3_CoalesceScheme {
} coalPara;
 };
 
+struct Vmxnet3_MemoryRegion {
+   __le64  startPA;
+   __le32  length;
+   __le16  txQueueBits;
+   __le16  rxQueueBits;
+};
+
+#define MAX_MEMORY_REGION_PER_QUEUE 16
+#define MAX_MEMORY_REGION_PER_DEVICE 256
+
+struct Vmxnet3_MemRegs {
+   __le16  numRegs;
+   __le16  pad[3];
+   struct Vmxnet3_MemoryRegion memRegs[1];
+};
+
 /* If the command data <= 16 bytes, use the shared memory directly.
  * otherwise, use variable length configuration descriptor.
  */
-- 
1.9.1



Re: [RFC PATCH v3 00/19] CALIPSO Implementation

2016-05-06 Thread Paul Moore
On Wed, Feb 17, 2016 at 8:21 AM, Huw Davies  wrote:
> This patch series implements RFC 5570 - Common Architecture Label IPv6
> Security Option (CALIPSO).  Its goal is to set MLS sensitivity labels
> on IPv6 packets using a hop-by-hop option.  CALIPSO is very similar to
> its IPv4 cousin CIPSO and much of this series is based on that code.
>
> If anybody actually wants to play with this, then you'll need some patches
> to netlabel-tools that are currently available on the 'calipso' branch at:
> https://github.com/hdmdavies/netlabel_tools.git .  The protocol has changed
> very slightly from the v2 patches, so please update to the latest.
>
> This patch series is based off v4.5-rc4.
>
> Thanks to Paul Moore, Hannes Frederic Sowa, Casey Schaufler and Julia
> Lawall for their comments so far.
>
> Changes between v3 and v2:
>
> * Change CALIPSO_MAP_PASS to 2.
> * Move calipso_init() before ipv6_sysctl_register().
> * Rewrite calipso_tlv_len() to check for overflows.
> * Simplify calipso_opt_find().
> * Simplify calipso_opt_insert().
> * Use calipso_tlv_len() in calipso_sock_getattr().
> * Simplify calipso_skbuff_setattr().
> * Return early from netlbl_domhsh_search_def() when a match is found.
> * Remove nested if from netlbl_domhsh_validate().
> * Don't return from netlbl_domhsh_add() with lock held.
> * Various style changes.
> * Rebased to v4.5-rc4.
>
> Changes between v2 and v1:
>
> * Simplify ipv6_renew_options_kern() to use set_fs(KERNEL_DS).
>   Thanks to Hannes Frederic Sowa for suggesting this.
> * Use the parent socket to account for the listener socket
>   option's memory usage.  Again, thanks to Hannes for this.
> * Added netlbl_cfg_calipso_* functions for SMACK.
> * Rebased to v4.4-rc8.
>
> Huw Davies (19):
>   netlabel: Mark rcu pointers with __rcu.
>   netlabel: Add an address family to domain hash entries.
>   netlabel: Initial support for the CALIPSO netlink protocol.
>   netlabel: Add support for querying a CALIPSO DOI.
>   netlabel: Add support for enumerating the CALIPSO DOI list.
>   netlabel: Add support for creating a CALIPSO protocol domain mapping.
>   netlabel: Add support for removing a CALIPSO DOI.
>   ipv6: Add ipv6_renew_options_kern() that accepts a kernel mem pointer.
>   netlabel: Move bitmap manipulation functions to the NetLabel core.
>   calipso: Set the calipso socket label to match the secattr.
>   netlabel: Prevent setsockopt() from changing the hop-by-hop option.
>   ipv6: Allow request socks to contain IPv6 options.
>   calipso: Allow request sockets to be relabelled by the lsm.
>   ipv6: constify the skb pointer of ipv6_find_tlv().
>   calipso: Allow the lsm to label the skbuff directly.
>   netlabel: Pass a family parameter to netlbl_skbuff_err().
>   calipso: Add validation of CALIPSO option.
>   calipso: Add a label cache.
>   netlabel: Implement CALIPSO config functions for SMACK.

Other than the small nit I just mentioned with respect to CALIPSO
option validation code, this all looks reasonable to me.  I know you
are in the process of doing interop with Solaris TX folks; assuming
that goes well, I think we are ready to throw this into linux-next with
an eye towards v4.8.  For the past several weeks I've been including
this patchset in my Fedora COPR test kernels (located below), others
are welcome to try it out if they like.

DaveM, what is your take on this?  I'm happy to get this into
linux-next via the SELinux tree, but it does touch some core network
code (the IPv6 option handling, much in the same way as CIPSO touched
the IPv4 option handling) and I don't want to push this if you aren't
okay with it.  Of course if you want to take this via your tree that's
fine with me too.

-- 
paul moore
www.paul-moore.com


[PATCH 2/2] net: Use ns_capable_noaudit() when determining net sysctl permissions

2016-05-06 Thread Tyler Hicks
The capability check should not be audited since it is only being used
to determine the inode permissions. A failed check does not indicate a
violation of security policy but, when an LSM is enabled, a denial audit
message was being generated.

The denial audit message caused confusion for some application authors
because root-running Go applications always triggered the denial. To
prevent this confusion, the capability check in net_ctl_permissions() is
switched to the noaudit variant.

BugLink: https://launchpad.net/bugs/1465724

Signed-off-by: Tyler Hicks 
---
 net/sysctl_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index ed98c1f..46a71c7 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -46,7 +46,7 @@ static int net_ctl_permissions(struct ctl_table_header *head,
kgid_t root_gid = make_kgid(net->user_ns, 0);
 
/* Allow network administrator to have same access as root. */
-   if (ns_capable(net->user_ns, CAP_NET_ADMIN) ||
+   if (ns_capable_noaudit(net->user_ns, CAP_NET_ADMIN) ||
uid_eq(root_uid, current_euid())) {
int mode = (table->mode >> 6) & 7;
return (mode << 6) | (mode << 3) | mode;
-- 
2.7.4



[PATCH 1/2] kernel: Add noaudit variant of ns_capable()

2016-05-06 Thread Tyler Hicks
When checking the current cred for a capability in a specific user
namespace, it isn't always desirable to have the LSMs audit the check.
This patch adds a noaudit variant of ns_capable() for when those
situations arise.

The common logic between ns_capable() and the new ns_capable_noaudit()
is moved into a single, shared function to keep duplicated code to a
minimum and ease maintainability.

Signed-off-by: Tyler Hicks 
---
 include/linux/capability.h |  5 +
 kernel/capability.c        | 46 --
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/include/linux/capability.h b/include/linux/capability.h
index 00690ff..5f3c63d 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -206,6 +206,7 @@ extern bool has_ns_capability_noaudit(struct task_struct *t,
  struct user_namespace *ns, int cap);
 extern bool capable(int cap);
 extern bool ns_capable(struct user_namespace *ns, int cap);
+extern bool ns_capable_noaudit(struct user_namespace *ns, int cap);
 #else
 static inline bool has_capability(struct task_struct *t, int cap)
 {
@@ -233,6 +234,10 @@ static inline bool ns_capable(struct user_namespace *ns, 
int cap)
 {
return true;
 }
+static inline bool ns_capable_noaudit(struct user_namespace *ns, int cap)
+{
+   return true;
+}
 #endif /* CONFIG_MULTIUSER */
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace 
*ns, int cap);
diff --git a/kernel/capability.c b/kernel/capability.c
index 45432b5..00411c8 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -361,6 +361,24 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
return has_ns_capability_noaudit(t, &init_user_ns, cap);
 }
 
+static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
+{
+   int capable;
+
+   if (unlikely(!cap_valid(cap))) {
+   pr_crit("capable() called with invalid cap=%u\n", cap);
+   BUG();
+   }
+
+   capable = audit ? security_capable(current_cred(), ns, cap) :
+ security_capable_noaudit(current_cred(), ns, cap);
+   if (capable == 0) {
+   current->flags |= PF_SUPERPRIV;
+   return true;
+   }
+   return false;
+}
+
 /**
  * ns_capable - Determine if the current task has a superior capability in 
effect
  * @ns:  The usernamespace we want the capability in
@@ -374,19 +392,27 @@ bool has_capability_noaudit(struct task_struct *t, int 
cap)
  */
 bool ns_capable(struct user_namespace *ns, int cap)
 {
-   if (unlikely(!cap_valid(cap))) {
-   pr_crit("capable() called with invalid cap=%u\n", cap);
-   BUG();
-   }
-
-   if (security_capable(current_cred(), ns, cap) == 0) {
-   current->flags |= PF_SUPERPRIV;
-   return true;
-   }
-   return false;
+   return ns_capable_common(ns, cap, true);
 }
 EXPORT_SYMBOL(ns_capable);
 
+/**
+ * ns_capable_noaudit - Determine if the current task has a superior capability
+ * (unaudited) in effect
+ * @ns:  The usernamespace we want the capability in
+ * @cap: The capability to be tested for
+ *
+ * Return true if the current task has the given superior capability currently
+ * available for use, false if not.
+ *
+ * This sets PF_SUPERPRIV on the task if the capability is available on the
+ * assumption that it's about to be used.
+ */
+bool ns_capable_noaudit(struct user_namespace *ns, int cap)
+{
+   return ns_capable_common(ns, cap, false);
+}
+EXPORT_SYMBOL(ns_capable_noaudit);
 
 /**
  * capable - Determine if the current task has a superior capability in effect
-- 
2.7.4



[PATCH 0/2] Quiet noisy LSM denial when accessing net sysctl

2016-05-06 Thread Tyler Hicks
This pair of patches does away with what I believe is a useless denial
audit message when a privileged process initially accesses a net sysctl.

The bug was first discovered when running Go applications under AppArmor
confinement. It can be triggered like so:

  $ echo "profile test { file, }" | sudo apparmor_parser -rq

Once the profile is loaded, invoke Go as root under confinement:

  $ sudo aa-exec -p test -- go version
  go version go1.6.1 linux/amd64

Here's the denial:

  audit: type=1400 audit(1462575436.832:29): apparmor="DENIED" 
operation="capable" profile="test" pid=1157 comm="go" capability=12  
capname="net_admin"

The reproducer in minimal form is:

  $ sudo aa-exec -p test -- cat /proc/sys/net/core/somaxconn
  128

The denial:

  audit: type=1400 audit(1462575670.000:29): apparmor="DENIED" 
operation="capable" profile="test" pid=1161 comm="cat" capability=12  
capname="net_admin"

Thanks!

Tyler



Re: [RFC PATCH v3 17/19] calipso: Add validation of CALIPSO option.

2016-05-06 Thread Paul Moore
On Wed, Feb 17, 2016 at 8:22 AM, Huw Davies  wrote:
> We check lengths, checksum and the DOI.  We leave checking of the
> level and categories for the socket layer.
>
> Signed-off-by: Huw Davies 
> ---
>  include/net/calipso.h |  6 ++
>  net/ipv6/calipso.c| 42 ++
>  net/ipv6/exthdrs.c| 27 +++
>  3 files changed, 75 insertions(+)
>
> diff --git a/include/net/calipso.h b/include/net/calipso.h
> index 38dbb47..85404e2 100644
> --- a/include/net/calipso.h
> +++ b/include/net/calipso.h
> @@ -65,6 +65,7 @@ struct calipso_doi {
>  #ifdef CONFIG_NETLABEL
>  int __init calipso_init(void);
>  void calipso_exit(void);
> +bool calipso_validate(const struct sk_buff *skb, const unsigned char 
> *option);
>  #else
>  static inline int __init calipso_init(void)
>  {
> @@ -74,6 +75,11 @@ static inline int __init calipso_init(void)
>  static inline void calipso_exit(void)
>  {
>  }
> +static inline bool calipso_validate(const struct sk_buff *skb,
> +   const unsigned char *option)
> +{
> +   return true;
> +}
>  #endif /* CONFIG_NETLABEL */
>
>  #endif /* _CALIPSO_H */
> diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c
> index fa371a8..b8bcf9f 100644
> --- a/net/ipv6/calipso.c
> +++ b/net/ipv6/calipso.c
> @@ -321,6 +321,48 @@ doi_walk_return:
>  }
>
>  /**
> + * calipso_validate - Validate a CALIPSO option
> + * @skb: the packet
> + * @option: the start of the option
> + *
> + * Description:
> + * This routine is called to validate a CALIPSO option.
> + * If the option is valid then a zero value is returned.  If the
> + * option is invalid then a non-zero value is returned and
> + * representing the offset to the offending portion of the option.
> + *
> + * The caller should have already checked that the length of the
> + * option (including the TLV header) is >= 10 and that the catmap
> + * length is consistent with the option length.
> + *
> + * We leave checks on the level and categories to the socket layer.
> + */
> +bool calipso_validate(const struct sk_buff *skb, const unsigned char *option)
> +{
> +   struct calipso_doi *doi_def;
> +   int ret_val;
> +   u16 crc, len = option[1] + 2;
> +   static const u8 zero[2];
> +
> +   /* The original CRC runs over the option including the TLV header
> +* with the CRC-16 field (at offset 8) zeroed out. */
> +   crc = crc_ccitt(0xffff, option, 8);
> +   crc = crc_ccitt(crc, zero, sizeof(zero));
> +   if (len > 10)
> +   crc = crc_ccitt(crc, option + 10, len - 10);
> +   crc = ~crc;

I should have caught this in the v2 patchset when I mentioned it with
respect to the CRC generation, but why not simply do 'crc =
~crc_ccitt(...);'?

Also, while I'm looking at this, why not do the CRC verification in
ipv6_hop_calipso()?  The only thing we should need to do here is the
DOI lookup/verification so that we still work correctly when
CONFIG_NETLABEL=n; all the core protocol stuff, e.g. length and
checksum validation, should be done in the core stack functions, e.g.
ipv6_hop_calipso().
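
To make the checksum part concrete, here is a minimal userspace sketch of
the computation being discussed. It is not the kernel code: the helper
names (crc_ccitt_update, calipso_opt_crc, calipso_opt_crc_ok) are made up
for illustration, and the bitwise loop is assumed to compute the same
function as the kernel's table-driven crc_ccitt() (reflected polynomial
0x8408). The caller is assumed to have checked that the option is at
least 10 bytes long, as in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-CCITT update, LSB first, reflected polynomial 0x8408.
 * Assumption: equivalent to the kernel's table-driven crc_ccitt(). */
static uint16_t crc_ccitt_update(uint16_t crc, const uint8_t *buf, size_t len)
{
	while (len--) {
		crc ^= *buf++;
		for (int i = 0; i < 8; i++)
			crc = (crc & 1) ? (crc >> 1) ^ 0x8408 : crc >> 1;
	}
	return crc;
}

/* CRC over the whole option (TLV header included), with the two CRC
 * bytes at offset 8 treated as zero.  The result is complemented on
 * return rather than in a separate statement. */
static uint16_t calipso_opt_crc(const uint8_t *opt, size_t len)
{
	static const uint8_t zero[2];
	uint16_t crc = crc_ccitt_update(0xffff, opt, 8);

	crc = crc_ccitt_update(crc, zero, sizeof(zero));
	if (len > 10)
		crc = crc_ccitt_update(crc, opt + 10, len - 10);
	return (uint16_t)~crc;
}

/* Compare against the stored CRC, low byte first at offset 8. */
static int calipso_opt_crc_ok(const uint8_t *opt)
{
	size_t len = (size_t)opt[1] + 2;	/* data length + TLV header */
	uint16_t crc = calipso_opt_crc(opt, len);

	return opt[8] == (crc & 0xff) && opt[9] == ((crc >> 8) & 0xff);
}
```

A check like calipso_opt_crc_ok() depends only on the option bytes, which
is why it could sit next to the length checks; only the DOI lookup needs
the NetLabel side.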

> +   if (option[8] != (crc & 0xff) || option[9] != ((crc >> 8) & 0xff))
> +   return false;
> +
> +   rcu_read_lock();
> +   doi_def = calipso_doi_search(get_unaligned_be32(option + 2));
> +   ret_val = !!doi_def;
> +   rcu_read_unlock();
> +
> +   return ret_val;
> +}
> +
> +/**
>   * calipso_map_cat_hton - Perform a category mapping from host to network
>   * @doi_def: the DOI definition
>   * @secattr: the security attributes
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index d5fd3e7..0f69cab 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #if IS_ENABLED(CONFIG_IPV6_MIP6)
>  #include 
>  #endif
> @@ -603,6 +604,28 @@ drop:
> return false;
>  }
>
> +/* CALIPSO RFC 5570 */
> +
> +static bool ipv6_hop_calipso(struct sk_buff *skb, int optoff)
> +{
> +   const unsigned char *nh = skb_network_header(skb);
> +
> +   if (nh[optoff + 1] < 8)
> +   goto drop;
> +
> +   if (nh[optoff + 6] * 4 + 8 > nh[optoff + 1])
> +   goto drop;
> +
> +   if (!calipso_validate(skb, nh + optoff))
> +   goto drop;
> +
> +   return true;
> +
> +drop:
> +   kfree_skb(skb);
> +   return false;
> +}
> +
>  static const struct tlvtype_proc tlvprochopopt_lst[] = {
> {
> .type   = IPV6_TLV_ROUTERALERT,
> @@ -612,6 +635,10 @@ static const struct tlvtype_proc tlvprochopopt_lst[] = {
> .type   = IPV6_TLV_JUMBO,
> .func   = ipv6_hop_jumbo,
> },
> +   {
> +   .type   = IPV6_TLV_CALIPSO,
> +   .func   = ipv6_hop_calipso,
> +   },
> { -1, }
>  };
>
> --
> 2.7.0
>

[PATCH net-next] net: make sch_handle_ingress() drop monitor ready

2016-05-06 Thread Eric Dumazet
From: Eric Dumazet 

TC_ACT_STOLEN is used when ingress traffic is mirred/redirected
to, say, ifb.

Packet is not dropped, but consumed.

Only TC_ACT_SHOT is a clear indication something went wrong.
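
As an illustration only, the distinction can be modelled in userspace
like this. The enum, struct and function names are hypothetical, and the
verdict numbers are assumed to mirror the TC_ACT_* values; the point is
just that SHOT is accounted as a drop while STOLEN/QUEUED are accounted
as a normal consume.

```c
#include <stdbool.h>

/* Assumed to mirror the TC_ACT_* numbering; illustrative names only. */
enum verdict {
	ACT_OK     = 0,
	ACT_SHOT   = 2,
	ACT_STOLEN = 4,
	ACT_QUEUED = 5,
};

struct free_stats {
	unsigned long dropped;	/* would be kfree_skb(): seen by drop monitor */
	unsigned long consumed;	/* would be consume_skb(): normal end of life */
};

/* Returns true when the packet is freed by the ingress hook. */
static bool account_free(enum verdict v, struct free_stats *st)
{
	switch (v) {
	case ACT_SHOT:
		st->dropped++;
		return true;
	case ACT_STOLEN:
	case ACT_QUEUED:
		st->consumed++;
		return true;
	default:
		return false;	/* packet continues up the stack */
	}
}
```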

Signed-off-by: Eric Dumazet 
Cc: Jamal Hadi Salim 
---
 net/core/dev.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e98ba63fe2800c10d4d407f12d11ca8975dc1e8e..c7490339315cc0046a2da67c00a95f079e602588 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3956,9 +3956,11 @@ sch_handle_ingress(struct sk_buff *skb, struct 
packet_type **pt_prev, int *ret,
break;
case TC_ACT_SHOT:
qdisc_qstats_cpu_drop(cl->q);
+   kfree_skb(skb);
+   return NULL;
case TC_ACT_STOLEN:
case TC_ACT_QUEUED:
-   kfree_skb(skb);
+   consume_skb(skb);
return NULL;
case TC_ACT_REDIRECT:
/* skb_mac_header check was done by cls/act_bpf, so




Re: [PATCH v3 net-next 02/11] net: define gso types for IPx over IPv4 and IPv6

2016-05-06 Thread Tom Herbert
On Fri, May 6, 2016 at 3:48 PM, Alexander Duyck
 wrote:
> On 05/06/2016 03:11 PM, Tom Herbert wrote:
>>
>> This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
>> SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
>> NETIF_F_GSO_IPXIP6. These are used to describe an IP in IP
>> tunnel and what the outer protocol is. The inner protocol
>> can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
>> SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
>> are removed (these are both instances of SKB_GSO_IPXIP4).
>> SKB_GSO_IPXIP6 will be used when support for GSO with IP
>> encapsulation over IPv6 is added.
>>
>> Signed-off-by: Tom Herbert 
>> ---
>>   drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  5 ++---
>>   drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 ++--
>>   drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +--
>>   drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  3 +--
>>   drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  3 +--
>>   drivers/net/ethernet/intel/i40evf/i40evf_main.c   |  3 +--
>>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  3 +--
>>   drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
>>   include/linux/netdev_features.h   | 12 ++--
>>   include/linux/netdevice.h |  4 ++--
>>   include/linux/skbuff.h|  4 ++--
>>   net/core/ethtool.c|  4 ++--
>>   net/ipv4/af_inet.c|  2 +-
>>   net/ipv4/ipip.c   |  2 +-
>>   net/ipv6/ip6_offload.c|  4 ++--
>>   net/ipv6/sit.c|  4 ++--
>>   net/netfilter/ipvs/ip_vs_xmit.c   | 17 +++--
>>   17 files changed, 35 insertions(+), 45 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> index d465bd7..0a5b770 100644
>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> @@ -13259,12 +13259,11 @@ static int bnx2x_init_dev(struct bnx2x *bp,
>> struct pci_dev *pdev,
>> NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
>> if (!chip_is_e1x) {
>> dev->hw_features |= NETIF_F_GSO_GRE |
>> NETIF_F_GSO_UDP_TUNNEL |
>> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT;
>> +   NETIF_F_GSO_IPXIP4;
>> dev->hw_enc_features =
>> NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
>> NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
>> -   NETIF_F_GSO_IPIP |
>> -   NETIF_F_GSO_SIT |
>> +   NETIF_F_GSO_IPXIP4 |
>> NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL;
>> }
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> index fd85b6d..e449228 100644
>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> @@ -6218,7 +6218,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const
>> struct pci_device_id *ent)
>> dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
>> NETIF_F_SG |
>>NETIF_F_TSO | NETIF_F_TSO6 |
>>NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
>> -  NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
>> +  NETIF_F_GSO_IPXIP4 |
>>NETIF_F_GSO_UDP_TUNNEL_CSUM |
>> NETIF_F_GSO_GRE_CSUM |
>>NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
>>NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO;
>> @@ -6228,7 +6228,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const
>> struct pci_device_id *ent)
>> NETIF_F_TSO | NETIF_F_TSO6 |
>> NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
>> NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM
>> |
>> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
>> +   NETIF_F_GSO_IPXIP4;
>> NETIF_F_GSO_PARTIAL;
>> dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
>> NETIF_F_GSO_GRE_CSUM;
>
>
> You should try building the bnxt driver at least if you are making changes
> to it.  This bit is broken.  It should be an '|' and not a ';' here.
>
> The other bits seem to be correct.  I'll try testing this over the next few
> hours to see if I can find any other issues.

Huh, that's odd. I'm pretty sure I built drivers/net with allyesconfig
with this.

Thanks for catching it.

>
> - Alex


Re: [PATCH v3 net-next 02/11] net: define gso types for IPx over IPv4 and IPv6

2016-05-06 Thread Alexander Duyck

On 05/06/2016 03:11 PM, Tom Herbert wrote:

This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to describe an IP in IP
tunnel and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.

Signed-off-by: Tom Herbert 
---
  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  5 ++---
  drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 ++--
  drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +--
  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  3 +--
  drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  3 +--
  drivers/net/ethernet/intel/i40evf/i40evf_main.c   |  3 +--
  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  3 +--
  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
  include/linux/netdev_features.h   | 12 ++--
  include/linux/netdevice.h |  4 ++--
  include/linux/skbuff.h|  4 ++--
  net/core/ethtool.c|  4 ++--
  net/ipv4/af_inet.c|  2 +-
  net/ipv4/ipip.c   |  2 +-
  net/ipv6/ip6_offload.c|  4 ++--
  net/ipv6/sit.c|  4 ++--
  net/netfilter/ipvs/ip_vs_xmit.c   | 17 +++--
  17 files changed, 35 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index d465bd7..0a5b770 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -13259,12 +13259,11 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct 
pci_dev *pdev,
NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
if (!chip_is_e1x) {
dev->hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |
-   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT;
+   NETIF_F_GSO_IPXIP4;
dev->hw_enc_features =
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
-   NETIF_F_GSO_IPIP |
-   NETIF_F_GSO_SIT |
+   NETIF_F_GSO_IPXIP4 |
NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL;
}

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fd85b6d..e449228 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6218,7 +6218,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
   NETIF_F_TSO | NETIF_F_TSO6 |
   NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
-  NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
+  NETIF_F_GSO_IPXIP4 |
   NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
   NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
   NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO;
@@ -6228,7 +6228,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
NETIF_F_TSO | NETIF_F_TSO6 |
NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
-   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
+   NETIF_F_GSO_IPXIP4;
NETIF_F_GSO_PARTIAL;
dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
NETIF_F_GSO_GRE_CSUM;


You should try building the bnxt driver at least if you are making 
changes to it.  This bit is broken.  It should be an '|' and not a ';' here.


The other bits seem to be correct.  I'll try testing this over the next 
few hours to see if I can find any other issues.


- Alex


Re: [PATCH v3 net-next 01/11] gso: Remove arbitrary checks for unsupported GSO

2016-05-06 Thread Alexander Duyck

On 05/06/2016 03:11 PM, Tom Herbert wrote:

In several gso_segment functions there are checks of gso_type against
a seemingly arbitrary list of SKB_GSO_* flags. This seems like an
attempt to identify unsupported GSO types, but since the stack is
the one that set these GSO types in the first place this seems
unnecessary to do. If a combination isn't valid in the first
place the stack should not allow setting it.

This is a code simplification, especially for adding new GSO types.

Signed-off-by: Tom Herbert 
---
  net/ipv4/af_inet.c     | 18 --
  net/ipv4/gre_offload.c | 14 --
  net/ipv4/tcp_offload.c | 19 ---
  net/ipv4/udp_offload.c | 10 --
  net/ipv6/ip6_offload.c | 18 --
  net/ipv6/udp_offload.c | 13 -
  net/mpls/mpls_gso.c    |  9 -
  7 files changed, 101 deletions(-)



This is a nice clean-up since tunnels have essentially made things like 
trying to filter TSOV4 or TSOV6 pointless for the network layer.


Acked-by: Alexander Duyck 



Re: [PATCH net-next 15/21] net: dsa: mv88e6xxx: factorize VLAN Ethertype

2016-05-06 Thread Andrew Lunn
On Fri, May 06, 2016 at 05:57:17PM -0400, Vivien Didelot wrote:
> The 6131 switch models have a Core Tag Type register. Add a
> MV88E6XXX_FLAG_CORE_TAG_TYPE flag and set the VLAN Ethertype to 0x8100
> in the shared setup code if it is present.

Do you have any idea what the core tag is?

   Andrew


Re: [patch net 0/2] mlxsw: Couple of fixes

2016-05-06 Thread David Miller
From: Jiri Pirko 
Date: Fri,  6 May 2016 22:18:38 +0200

> From: Jiri Pirko 
> 
> Ido Schimmel (2):
>   mlxsw: spectrum: Fix rollback order in LAG join failure
>   mlxsw: spectrum: Add missing rollback in flood configuration

Applied.


Re: [PATCH net-next 13/21] net: dsa: mv88e6xxx: factorize global setup

2016-05-06 Thread Andrew Lunn
>  
> -int mv88e6xxx_setup_common(struct mv88e6xxx_priv_state *ps)
> +static int _mv88e6xxx_setup_global(struct mv88e6xxx_priv_state *ps)

Another _ prefix.

Andrew


Re: [PATCH net-next 12/21] net: dsa: mv88e6xxx: factorize switch reset

2016-05-06 Thread Andrew Lunn
> diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
> index dbbda4e..b744ed5 100644
> --- a/drivers/net/dsa/mv88e6xxx.c
> +++ b/drivers/net/dsa/mv88e6xxx.c
> @@ -2551,6 +2551,68 @@ restore_page_0:
>   return ret;
>  }
>  
> +static int _mv88e6xxx_switch_reset(struct mv88e6xxx_priv_state *ps)
> +{

You missed removing the _ prefix.

Thanks for removing the others. With so much less noise, it is easier
to review the patches.

   Andrew


Re: [PATCH net v3 1/2] udp_tunnel: Remove redundant udp_tunnel_gro_complete().

2016-05-06 Thread David Miller
From: Jarno Rajahalme 
Date: Tue,  3 May 2016 16:10:20 -0700

> The setting of the UDP tunnel GSO type is already performed by
> udp[46]_gro_complete().
> 
> Signed-off-by: Jarno Rajahalme 

Applied.


Re: [PATCH net v3 2/2] udp_offload: Set encapsulation before inner completes.

2016-05-06 Thread David Miller
From: Jarno Rajahalme 
Date: Tue,  3 May 2016 16:10:21 -0700

> UDP tunnel segmentation code relies on the inner offsets being set for
> a UDP tunnel GSO packet, but the inner *_complete() functions will
> set the inner offsets only if 'encapsulation' is set before calling
> them.  Currently, udp_gro_complete() sets 'encapsulation' only after
> the inner *_complete() functions are done.  This leaves the inner
> offsets with invalid values after udp_gro_complete() returns, which
> in turn will make it impossible to properly segment the packet in case
> it needs to be forwarded, which would be visible to the user either as
> invalid packets being sent or as packet loss.
> 
> This patch fixes this by setting skb's 'encapsulation' in
> udp_gro_complete() before calling into the inner complete functions,
> and by making each possible UDP tunnel gro_complete() callback set the
> inner_mac_header to the beginning of the tunnel payload.
> 
> Signed-off-by: Jarno Rajahalme 
> ---
> v3: Added setting inner_mac_header from all possible callbacks to cover
> cases where there is no inner mac header.

Applied.


Re: [PATCH net-next 06/21] net: dsa: mv88e6xxx: factorize MAC address setting

2016-05-06 Thread Andrew Lunn
> @@ -378,6 +385,7 @@ enum mv88e6xxx_cap {
>  #define MV88E6XXX_FLAG_EEPROM        BIT(MV88E6XXX_CAP_EEPROM)
>  #define MV88E6XXX_FLAG_PPU           BIT(MV88E6XXX_CAP_PPU)
>  #define MV88E6XXX_FLAG_SMI_PHY       BIT(MV88E6XXX_CAP_SMI_PHY)
> +#define MV88E6XXX_FLAG_SWITCH_MAC    BIT(MV88E6XXX_CAP_SWITCH_MAC_WOL_WOF)
>  #define MV88E6XXX_FLAG_TEMP          BIT(MV88E6XXX_CAP_TEMP)
>  #define MV88E6XXX_FLAG_TEMP_LIMIT    BIT(MV88E6XXX_CAP_TEMP_LIMIT)

There is a general pattern here that the flag has a name derived from
the capability. Except you dropped the WOL_WOF here. It would probably
be better to not have WOL_WOF at all.
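
To illustrate the naming pattern, here is a minimal sketch. It assumes a
shortened, hypothetical capability list (the EXAMPLE_* names and
example_has() are not from the driver); the point is only that each flag
name is the capability name with CAP swapped for FLAG, so dropping the
WOL_WOF suffix from the capability keeps both names in step.

```c
/* Hypothetical, shortened capability list; only the naming rule matters. */
enum example_cap {
	EXAMPLE_CAP_EEPROM,
	EXAMPLE_CAP_PPU,
	EXAMPLE_CAP_SWITCH_MAC,	/* no WOL_WOF suffix, so the flag matches */
};

#define EXAMPLE_BIT(n)			(1UL << (n))

/* Each flag name is the capability name with CAP replaced by FLAG. */
#define EXAMPLE_FLAG_EEPROM		EXAMPLE_BIT(EXAMPLE_CAP_EEPROM)
#define EXAMPLE_FLAG_PPU		EXAMPLE_BIT(EXAMPLE_CAP_PPU)
#define EXAMPLE_FLAG_SWITCH_MAC		EXAMPLE_BIT(EXAMPLE_CAP_SWITCH_MAC)

/* Returns nonzero when every bit of 'flag' is set in 'flags'. */
static int example_has(unsigned long flags, unsigned long flag)
{
	return (flags & flag) == flag;
}
```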

   Andrew


Re: [PATCHv2 bluetooth-next 04/10] ndisc: add addr_len parameter to ndisc_opt_addr_space

2016-05-06 Thread Hannes Frederic Sowa
On 04.05.2016 14:30, Alexander Aring wrote:
> Hi,
> 
> On 05/02/2016 09:37 PM, Hannes Frederic Sowa wrote:
>> On 20.04.2016 10:19, Alexander Aring wrote:
>>> This patch makes the address length an argument of the
>>> ndisc_opt_addr_space function. This is necessary to handle addresses
>>> which don't use dev->addr_len as address length.
>>
>> Would it make sense for patch 4, 5 and 6 to add the operation to ndisc_ops?
>>
> 
> not sure if I understand this question right,
> 
> We now have the ndisc_ops, where we can change the send/recv of
> NS/NA, and also is_useropt (to add 6CO in RA as a userspace option field).
> 
> In case of 802.15.4 we have two MAC addresses with different length:
> 
>  - extended address - 8 bytes => EUI64
>  - short address - 2 bytes
> 
> Now [0] describes how to make the source/target address option for
> NS/NA/RS/RA/... to deal with both addresses.
> 
> The short address is a special case in 802.15.4 and not always
> available. If available we add both addresses as option field in
> NS/NA (RS/RA will follow in future, but currently NS/NA only).
> 
> At this point the understanding of [0] differs in 6LoWPAN
> implementations.
> 
> Some people handle it like:
> 
> Handle the short/extended address as an XOR case on the 6LoWPAN
> interface. The interface has as its MAC address either the extended or
> the short address (if available), depending on the setting.
> 
> Then dev->addr_len is 8 XOR 2.
> 
> Other people (including me) handle it like:
> 
> Handle the short/extended address in case of OR, but never short address
> alone. The interface can be accessed by extended address or short
> address and each neighbour stores both information.
> 
> The case "short address never alone" means that the extended address is
> always available and MUST be there.
> 
> Furthermore, depending on the L3 addressing, it could be useful to be
> able to decide whether to use the short OR the extended address as the
> L2 address, to get better compression.
> 
> ---
> 
> I implemented it as the OR case, so we add both addresses when the short
> address is available. We also drop NS/NA when only the short address is
> given; in theory we could also react to this and store a "dummy" 0x00..00
> address for the extended address then.
> 
> Not sure how it needs to be handled correctly; for now I implemented it
> how I understand it.
> 
> In the OR case, we need to add two option fields for the
> address, extended and short. This is why I make the calculation work
> with different address lengths, so we can use 8 or 2 and not
> dev->addr_len, which always stores the 802.15.4 EUI64 address length.
> 
> And the answer would be, no it makes no sense because we need to call
> these functions with 8 (dev->addr_len) and 2 (if short addr is
> available).

I had to understand the usage in patch 9. It seems you are right: the
decision cannot be made based on the protocol alone but only based on the
context, so we need to pass in different lengths depending on the context.
Thanks for your explanation.
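
For reference, a minimal sketch of why the length has to be a parameter.
It assumes the usual neighbour discovery option layout (one type octet,
one length octet, then the address, padded to a multiple of 8 octets)
and ignores any device-specific padding; ex_opt_addr_space() is a
hypothetical stand-in, not the kernel helper.

```c
#include <stddef.h>

#define EX_OPT_HDR_LEN	2	/* type octet + length octet */

/*
 * Space taken by one link-layer address option for an address of
 * 'addr_len' octets, rounded up to the 8-octet option granularity.
 */
static size_t ex_opt_addr_space(size_t addr_len)
{
	return (EX_OPT_HDR_LEN + addr_len + 7) & ~(size_t)7;
}
```

With this, the 8-byte extended address and the 2-byte short address need
different amounts of space, so a single dev->addr_len cannot describe
both options of the same message.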

I would still suggest not using net_device as an argument, but just the
type and length, to keep the API cleaner; this is not a strong opinion, though.

Thanks,
Hannes



Re: [PATCH net-next 04/21] net: dsa: mv88e6xxx: factorize EEPROM access

2016-05-06 Thread Andrew Lunn
> @@ -2596,6 +2797,9 @@ int mv88e6xxx_setup_common(struct mv88e6xxx_priv_state 
> *ps)
>  
>   INIT_WORK(&ps->bridge_work, mv88e6xxx_bridge_work);
>  
> + if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_EEPROM))
> + mutex_init(&ps->eeprom_mutex);
> +

This seems to be the only place MV88E6XXX_FLAG_EEPROM is checked.  It
seems like mv88e6xxx_get_eeprom() and mv88e6xxx_set_eeprom()
should also check this.
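
Something along these lines, as a rough sketch only. The ex_* names and
the choice of -EOPNOTSUPP are assumptions for illustration, not taken
from the driver:

```c
#include <errno.h>

#define EX_FLAG_EEPROM	(1UL << 0)	/* hypothetical capability flag */

struct ex_switch {
	unsigned long flags;		/* capability flags of this chip */
};

static int ex_has(const struct ex_switch *sw, unsigned long flag)
{
	return (sw->flags & flag) == flag;
}

/* Refuse EEPROM access up front on chips without the capability. */
static int ex_get_eeprom(const struct ex_switch *sw)
{
	if (!ex_has(sw, EX_FLAG_EEPROM))
		return -EOPNOTSUPP;

	/* The mutex-protected EEPROM read would go here. */
	return 0;
}
```

That way the mutex is never taken on a chip where it was not
initialised.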

   Andrew


[PATCH v3 net-next 01/11] gso: Remove arbitrary checks for unsupported GSO

2016-05-06 Thread Tom Herbert
In several gso_segment functions there are checks of gso_type against
a seemingly arbitrary list of SKB_GSO_* flags. This seems like an
attempt to identify unsupported GSO types, but since the stack is
the one that set these GSO types in the first place, this seems
unnecessary. If a combination isn't valid in the first
place, the stack should not allow setting it.

This is a code simplification, especially when adding new GSO types.
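
As a rough illustration of the pattern being removed (the EX_* bit
values and ex_type_rejected() are made up for this sketch, not the
kernel's): each segment function carried its own allow-list, so a new
GSO type was rejected until every list had been updated.

```c
#define EX_GSO_TCPV4	(1u << 0)
#define EX_GSO_UDP	(1u << 1)
#define EX_GSO_NEW	(1u << 2)	/* a type added after the list was written */

/* Nonzero when 'gso_type' has any bit outside the per-function allow-list. */
static int ex_type_rejected(unsigned int gso_type, unsigned int allowed)
{
	return (gso_type & ~allowed) != 0;
}
```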

Signed-off-by: Tom Herbert 
---
 net/ipv4/af_inet.c | 18 --
 net/ipv4/gre_offload.c | 14 --
 net/ipv4/tcp_offload.c | 19 ---
 net/ipv4/udp_offload.c | 10 --
 net/ipv6/ip6_offload.c | 18 --
 net/ipv6/udp_offload.c | 13 -
 net/mpls/mpls_gso.c|  9 -
 7 files changed, 101 deletions(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 2e6e65f..7f08d45 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1205,24 +1205,6 @@ static struct sk_buff *inet_gso_segment(struct sk_buff 
*skb,
int ihl;
int id;
 
-   if (unlikely(skb_shinfo(skb)->gso_type &
-~(SKB_GSO_TCPV4 |
-  SKB_GSO_UDP |
-  SKB_GSO_DODGY |
-  SKB_GSO_TCP_ECN |
-  SKB_GSO_GRE |
-  SKB_GSO_GRE_CSUM |
-  SKB_GSO_IPIP |
-  SKB_GSO_SIT |
-  SKB_GSO_TCPV6 |
-  SKB_GSO_UDP_TUNNEL |
-  SKB_GSO_UDP_TUNNEL_CSUM |
-  SKB_GSO_TCP_FIXEDID |
-  SKB_GSO_TUNNEL_REMCSUM |
-  SKB_GSO_PARTIAL |
-  0)))
-   goto out;
-
skb_reset_network_header(skb);
nhoff = skb_network_header(skb) - skb_mac_header(skb);
if (unlikely(!pskb_may_pull(skb, sizeof(*iph))))
diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
index e88190a..ecd1e09 100644
--- a/net/ipv4/gre_offload.c
+++ b/net/ipv4/gre_offload.c
@@ -26,20 +26,6 @@ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
int gre_offset, outer_hlen;
bool need_csum, ufo;
 
-   if (unlikely(skb_shinfo(skb)->gso_type &
-   ~(SKB_GSO_TCPV4 |
- SKB_GSO_TCPV6 |
- SKB_GSO_UDP |
- SKB_GSO_DODGY |
- SKB_GSO_TCP_ECN |
- SKB_GSO_TCP_FIXEDID |
- SKB_GSO_GRE |
- SKB_GSO_GRE_CSUM |
- SKB_GSO_IPIP |
- SKB_GSO_SIT |
- SKB_GSO_PARTIAL)))
-   goto out;
-
if (!skb->encapsulation)
goto out;
 
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index 02737b6..5c59649 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -83,25 +83,6 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
 
if (skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) {
/* Packet is from an untrusted source, reset gso_segs. */
-   int type = skb_shinfo(skb)->gso_type;
-
-   if (unlikely(type &
-~(SKB_GSO_TCPV4 |
-  SKB_GSO_DODGY |
-  SKB_GSO_TCP_ECN |
-  SKB_GSO_TCP_FIXEDID |
-  SKB_GSO_TCPV6 |
-  SKB_GSO_GRE |
-  SKB_GSO_GRE_CSUM |
-  SKB_GSO_IPIP |
-  SKB_GSO_SIT |
-  SKB_GSO_UDP_TUNNEL |
-  SKB_GSO_UDP_TUNNEL_CSUM |
-  SKB_GSO_TUNNEL_REMCSUM |
-  0) ||
-!(type & (SKB_GSO_TCPV4 |
-  SKB_GSO_TCPV6))))
-   goto out;
 
skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(skb->len, mss);
 
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 097060de..b556ef6 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -209,16 +209,6 @@ static struct sk_buff *udp4_ufo_fragment(struct sk_buff 
*skb,
 
if (skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) {
/* Packet is from an untrusted source, reset gso_segs. */
-   int type = skb_shinfo(skb)->gso_type;
-
-   if (unlikely(type & ~(SKB_GSO_UDP | SKB_GSO_DODGY |
- SKB_GSO_UDP_TUNNEL |
- SKB_GSO_UDP_TUNNEL_CSUM |
- SKB_GSO_TUNNEL_REMCSUM |
- SKB_GSO_IPIP |
- 

[PATCH v3 net-next 03/11] gre6: Fix flag translations

2016-05-06 Thread Tom Herbert
GRE for IPv6 does not properly translate GRE flags to tunnel
flags and vice versa. This patch fixes that.
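
A minimal sketch of what such a translation does, assuming illustrative
host-order bit values (the kernel's constants are big-endian and the
ex_* helpers here are hypothetical): the wire-format GRE flag bits and
the internal tunnel flag bits live at different positions, so copying
one field into the other unchanged sets the wrong bits.

```c
#include <stdint.h>

/* Illustrative host-order values, chosen for this sketch. */
#define EX_GRE_CSUM	0x8000
#define EX_GRE_KEY	0x2000
#define EX_GRE_SEQ	0x1000

#define EX_TUNNEL_CSUM	0x01
#define EX_TUNNEL_KEY	0x04
#define EX_TUNNEL_SEQ	0x08

/* Wire-format GRE flags -> internal tunnel flags. */
static uint16_t ex_gre_to_tnl(uint16_t gre)
{
	uint16_t tnl = 0;

	if (gre & EX_GRE_CSUM)
		tnl |= EX_TUNNEL_CSUM;
	if (gre & EX_GRE_KEY)
		tnl |= EX_TUNNEL_KEY;
	if (gre & EX_GRE_SEQ)
		tnl |= EX_TUNNEL_SEQ;
	return tnl;
}

/* Internal tunnel flags -> wire-format GRE flags. */
static uint16_t ex_tnl_to_gre(uint16_t tnl)
{
	uint16_t gre = 0;

	if (tnl & EX_TUNNEL_CSUM)
		gre |= EX_GRE_CSUM;
	if (tnl & EX_TUNNEL_KEY)
		gre |= EX_GRE_KEY;
	if (tnl & EX_TUNNEL_SEQ)
		gre |= EX_GRE_SEQ;
	return gre;
}
```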

Signed-off-by: Tom Herbert 
---
 net/ipv6/ip6_gre.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 47b671a..70a1f72 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -799,8 +799,8 @@ static void ip6gre_tnl_parm_from_user(struct __ip6_tnl_parm 
*p,
p->link = u->link;
p->i_key = u->i_key;
p->o_key = u->o_key;
-   p->i_flags = u->i_flags;
-   p->o_flags = u->o_flags;
+   p->i_flags = gre_flags_to_tnl_flags(u->i_flags);
+   p->o_flags = gre_flags_to_tnl_flags(u->o_flags);
memcpy(p->name, u->name, sizeof(u->name));
 }
 
@@ -817,8 +817,8 @@ static void ip6gre_tnl_parm_to_user(struct ip6_tnl_parm2 *u,
u->link = p->link;
u->i_key = p->i_key;
u->o_key = p->o_key;
-   u->i_flags = p->i_flags;
-   u->o_flags = p->o_flags;
+   u->i_flags = gre_tnl_flags_to_gre_flags(p->i_flags);
+   u->o_flags = gre_tnl_flags_to_gre_flags(p->o_flags);
memcpy(u->name, p->name, sizeof(u->name));
 }
 
@@ -1217,10 +1217,12 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
parms->link = nla_get_u32(data[IFLA_GRE_LINK]);
 
if (data[IFLA_GRE_IFLAGS])
-   parms->i_flags = nla_get_be16(data[IFLA_GRE_IFLAGS]);
+   parms->i_flags = gre_flags_to_tnl_flags(
+   nla_get_be16(data[IFLA_GRE_IFLAGS]));
 
if (data[IFLA_GRE_OFLAGS])
-   parms->o_flags = nla_get_be16(data[IFLA_GRE_OFLAGS]);
+   parms->o_flags = gre_flags_to_tnl_flags(
+   nla_get_be16(data[IFLA_GRE_OFLAGS]));
 
if (data[IFLA_GRE_IKEY])
parms->i_key = nla_get_be32(data[IFLA_GRE_IKEY]);
@@ -1412,8 +1414,10 @@ static int ip6gre_fill_info(struct sk_buff *skb, const 
struct net_device *dev)
struct __ip6_tnl_parm *p = &t->parms;
 
if (nla_put_u32(skb, IFLA_GRE_LINK, p->link) ||
-   nla_put_be16(skb, IFLA_GRE_IFLAGS, p->i_flags) ||
-   nla_put_be16(skb, IFLA_GRE_OFLAGS, p->o_flags) ||
+   nla_put_be16(skb, IFLA_GRE_IFLAGS,
+gre_tnl_flags_to_gre_flags(p->i_flags)) ||
+   nla_put_be16(skb, IFLA_GRE_OFLAGS,
+gre_tnl_flags_to_gre_flags(p->o_flags)) ||
nla_put_be32(skb, IFLA_GRE_IKEY, p->i_key) ||
nla_put_be32(skb, IFLA_GRE_OKEY, p->o_key) ||
nla_put_in6_addr(skb, IFLA_GRE_LOCAL, &p->laddr) ||
-- 
2.8.0.rc2



[PATCH v3 net-next 05/11] fou: Split out {fou,gue}_build_header

2016-05-06 Thread Tom Herbert
Create __fou_build_header and __gue_build_header. These implement the
protocol generic parts of building the fou and gue header.
fou_build_header and gue_build_header implement the IPv4 specific
functions and call the __*_build_header functions.
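
The shape of the split, as a simplified sketch. The ex_* names are
hypothetical and the flow hash is passed in as a plain number rather
than computed from the skb; only the division of work is meant to
match: the generic helper picks the source port, and the
family-specific wrapper uses it to fill in its UDP header.

```c
#include <stdint.h>

struct ex_encap {
	uint16_t sport;		/* 0 means "derive from the flow hash" */
};

/* Protocol-generic part: choose the UDP source port. */
static int ex_build_common(const struct ex_encap *e, uint16_t flow_port,
			   uint16_t *sport)
{
	*sport = e->sport ? e->sport : flow_port;
	return 0;
}

/* IPv4-specific wrapper: calls the generic part, then builds UDP. */
static int ex_build_v4(const struct ex_encap *e, uint16_t flow_port,
		       uint16_t *udp_source)
{
	uint16_t sport;
	int err;

	err = ex_build_common(e, flow_port, &sport);
	if (err)
		return err;

	*udp_source = sport;	/* stand-in for filling the UDP header */
	return 0;
}
```

An IPv6 wrapper can then reuse ex_build_common() unchanged and differ
only in how it fills in the UDP header and checksum.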

Signed-off-by: Tom Herbert 
---
 include/net/fou.h |  8 
 net/ipv4/fou.c| 47 +--
 2 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/include/net/fou.h b/include/net/fou.h
index 19b8a0c..7d2fda2 100644
--- a/include/net/fou.h
+++ b/include/net/fou.h
@@ -11,9 +11,9 @@
 size_t fou_encap_hlen(struct ip_tunnel_encap *e);
 static size_t gue_encap_hlen(struct ip_tunnel_encap *e);
 
-int fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
-u8 *protocol, struct flowi4 *fl4);
-int gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
-u8 *protocol, struct flowi4 *fl4);
+int __fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+  u8 *protocol, __be16 *sport, int type);
+int __gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+  u8 *protocol, __be16 *sport, int type);
 
 #endif
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index a8b5cbf..971c8c6 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -778,6 +778,22 @@ static void fou_build_udp(struct sk_buff *skb, struct 
ip_tunnel_encap *e,
*protocol = IPPROTO_UDP;
 }
 
+int __fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+  u8 *protocol, __be16 *sport, int type)
+{
+   int err;
+
+   err = iptunnel_handle_offloads(skb, type);
+   if (err)
+   return err;
+
+   *sport = e->sport ? : udp_flow_src_port(dev_net(skb->dev),
+   skb, 0, 0, false);
+
+   return 0;
+}
+EXPORT_SYMBOL(__fou_build_header);
+
 int fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
 u8 *protocol, struct flowi4 *fl4)
 {
@@ -786,26 +802,21 @@ int fou_build_header(struct sk_buff *skb, struct 
ip_tunnel_encap *e,
__be16 sport;
int err;
 
-   err = iptunnel_handle_offloads(skb, type);
+   err = __fou_build_header(skb, e, protocol, &sport, type);
if (err)
return err;
 
-   sport = e->sport ? : udp_flow_src_port(dev_net(skb->dev),
-  skb, 0, 0, false);
fou_build_udp(skb, e, fl4, protocol, sport);
 
return 0;
 }
 EXPORT_SYMBOL(fou_build_header);
 
-int gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
-u8 *protocol, struct flowi4 *fl4)
+int __gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+  u8 *protocol, __be16 *sport, int type)
 {
-   int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM ? SKB_GSO_UDP_TUNNEL_CSUM :
-  SKB_GSO_UDP_TUNNEL;
struct guehdr *guehdr;
size_t hdrlen, optlen = 0;
-   __be16 sport;
void *data;
bool need_priv = false;
int err;
@@ -824,8 +835,8 @@ int gue_build_header(struct sk_buff *skb, struct 
ip_tunnel_encap *e,
return err;
 
/* Get source port (based on flow hash) before skb_push */
-   sport = e->sport ? : udp_flow_src_port(dev_net(skb->dev),
-  skb, 0, 0, false);
+   *sport = e->sport ? : udp_flow_src_port(dev_net(skb->dev),
+   skb, 0, 0, false);
 
hdrlen = sizeof(struct guehdr) + optlen;
 
@@ -870,6 +881,22 @@ int gue_build_header(struct sk_buff *skb, struct 
ip_tunnel_encap *e,
 
}
 
+   return 0;
+}
+EXPORT_SYMBOL(__gue_build_header);
+
+int gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+u8 *protocol, struct flowi4 *fl4)
+{
+   int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM ? SKB_GSO_UDP_TUNNEL_CSUM :
+  SKB_GSO_UDP_TUNNEL;
+   __be16 sport;
+   int err;
+
+   err = __gue_build_header(skb, e, protocol, &sport, type);
+   if (err)
+   return err;
+
fou_build_udp(skb, e, fl4, protocol, sport);
 
return 0;
-- 
2.8.0.rc2



[PATCH v3 net-next 06/11] fou: Add encap ops for IPv6 tunnels

2016-05-06 Thread Tom Herbert
This patch adds IP tunnel encapsulation operations for IPv6. This
includes the infrastructure to add and delete operations. IPv6 variants
for fou6_build_header and gue6_build_header are added in a new
fou6 module. These encapsulation operations for fou and gue are
automatically added when the fou6 module loads.

Signed-off-by: Tom Herbert 
---
 include/net/fou.h  |   2 +-
 include/net/ip6_tunnel.h   |  14 +
 net/ipv6/Makefile  |   4 +-
 net/ipv6/fou6.c| 140 +
 net/ipv6/ip6_tunnel_core.c |  44 ++
 5 files changed, 202 insertions(+), 2 deletions(-)
 create mode 100644 net/ipv6/fou6.c
 create mode 100644 net/ipv6/ip6_tunnel_core.c

diff --git a/include/net/fou.h b/include/net/fou.h
index 7d2fda2..f5cc691 100644
--- a/include/net/fou.h
+++ b/include/net/fou.h
@@ -9,7 +9,7 @@
 #include 
 
 size_t fou_encap_hlen(struct ip_tunnel_encap *e);
-static size_t gue_encap_hlen(struct ip_tunnel_encap *e);
+size_t gue_encap_hlen(struct ip_tunnel_encap *e);
 
 int __fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
   u8 *protocol, __be16 *sport, int type);
diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index fb9e015..1c14c27 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -34,6 +34,20 @@ struct __ip6_tnl_parm {
__be32  o_key;
 };
 
+struct ip6_tnl_encap_ops {
+   size_t (*encap_hlen)(struct ip_tunnel_encap *e);
+   int (*build_header)(struct sk_buff *skb, struct ip_tunnel_encap *e,
+   u8 *protocol, struct flowi6 *fl6);
+};
+
+extern const struct ip6_tnl_encap_ops __rcu *
+   ip6tun_encaps[MAX_IPTUN_ENCAP_OPS];
+
+int ip6_tnl_encap_add_ops(const struct ip6_tnl_encap_ops *op,
+ unsigned int num);
+int ip6_tnl_encap_del_ops(const struct ip6_tnl_encap_ops *op,
+ unsigned int num);
+
 /* IPv6 tunnel */
 struct ip6_tnl {
struct ip6_tnl __rcu *next; /* next tunnel in list */
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 5e9d6bf..5cf4a1f 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -9,7 +9,7 @@ ipv6-objs :=af_inet6.o anycast.o ip6_output.o ip6_input.o 
addrconf.o \
route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \
exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \
-   udp_offload.o
+   udp_offload.o ip6_tunnel_core.o
 
 ipv6-offload :=ip6_offload.o tcpv6_offload.o exthdrs_offload.o
 
@@ -43,6 +43,8 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
 
+obj-$(CONFIG_NET_FOU) += fou6.o
+
 obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
 obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
 
diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
new file mode 100644
index 000..c972d0b
--- /dev/null
+++ b/net/ipv6/fou6.c
@@ -0,0 +1,140 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void fou6_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
+  struct flowi6 *fl6, u8 *protocol, __be16 sport)
+{
+   struct udphdr *uh;
+
+   skb_push(skb, sizeof(struct udphdr));
+   skb_reset_transport_header(skb);
+
+   uh = udp_hdr(skb);
+
+   uh->dest = e->dport;
+   uh->source = sport;
+   uh->len = htons(skb->len);
+   udp6_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM6), skb,
+ &fl6->saddr, &fl6->daddr, skb->len);
+
+   *protocol = IPPROTO_UDP;
+}
+
+int fou6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ u8 *protocol, struct flowi6 *fl6)
+{
+   __be16 sport;
+   int err;
+   int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM6 ?
+   SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
+
+   err = __fou_build_header(skb, e, protocol, &sport, type);
+   if (err)
+   return err;
+
+   fou6_build_udp(skb, e, fl6, protocol, sport);
+
+   return 0;
+}
+EXPORT_SYMBOL(fou6_build_header);
+
+int gue6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ u8 *protocol, struct flowi6 *fl6)
+{
+   __be16 sport;
+   int err;
+   int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM6 ?
+   SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
+
+   err = __gue_build_header(skb, e, protocol, &sport, type);
+   if (err)
+   return err;
+
+   fou6_build_udp(skb, e, fl6, protocol, sport);
+
+   return 0;
+}
+EXPORT_SYMBOL(gue6_build_header);
+
+#ifdef CONFIG_NET_FOU_IP_TUNNELS
+
+static const struct ip6_tnl_encap_ops fou_ip6tun_ops = {
+   .encap_hlen = 

[PATCH v3 net-next 02/11] net: define gso types for IPx over IPv4 and IPv6

2016-05-06 Thread Tom Herbert
This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to describe IP-in-IP
tunnels and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.

Signed-off-by: Tom Herbert 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  5 ++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 ++--
 drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +--
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  3 +--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  3 +--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c   |  3 +--
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  3 +--
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
 include/linux/netdev_features.h   | 12 ++--
 include/linux/netdevice.h |  4 ++--
 include/linux/skbuff.h|  4 ++--
 net/core/ethtool.c|  4 ++--
 net/ipv4/af_inet.c|  2 +-
 net/ipv4/ipip.c   |  2 +-
 net/ipv6/ip6_offload.c|  4 ++--
 net/ipv6/sit.c|  4 ++--
 net/netfilter/ipvs/ip_vs_xmit.c   | 17 +++--
 17 files changed, 35 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index d465bd7..0a5b770 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -13259,12 +13259,11 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct 
pci_dev *pdev,
NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
if (!chip_is_e1x) {
dev->hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |
-   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT;
+   NETIF_F_GSO_IPXIP4;
dev->hw_enc_features =
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
-   NETIF_F_GSO_IPIP |
-   NETIF_F_GSO_SIT |
+   NETIF_F_GSO_IPXIP4 |
NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL;
}
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fd85b6d..e449228 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6218,7 +6218,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
   NETIF_F_TSO | NETIF_F_TSO6 |
   NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
-  NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
+  NETIF_F_GSO_IPXIP4 |
   NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
   NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
   NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO;
@@ -6228,7 +6228,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
NETIF_F_TSO | NETIF_F_TSO6 |
NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
-   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
+   NETIF_F_GSO_IPXIP4 |
NETIF_F_GSO_PARTIAL;
dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
NETIF_F_GSO_GRE_CSUM;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index f6da6b7..c2a4c10 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -9131,8 +9131,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
   NETIF_F_TSO6 |
   NETIF_F_GSO_GRE  |
   NETIF_F_GSO_GRE_CSUM |
-  NETIF_F_GSO_IPIP |
-  NETIF_F_GSO_SIT  |
+  NETIF_F_GSO_IPXIP4   |
   NETIF_F_GSO_UDP_TUNNEL   |
   NETIF_F_GSO_UDP_TUNNEL_CSUM  |
 

[PATCH v3 net-next 11/11] ip6_gre: Add support for fou/gue encapsulation

2016-05-06 Thread Tom Herbert
Add netlink and setup for encapsulation

Signed-off-by: Tom Herbert 
---
 net/ipv6/ip6_gre.c | 75 ++
 1 file changed, 75 insertions(+)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 70a1f72..ed5ddcc 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1027,6 +1027,8 @@ static int ip6gre_tunnel_init_common(struct net_device 
*dev)
 
tunnel->tun_hlen = gre_calc_hlen(tunnel->parms.o_flags);
 
+   tunnel->hlen = tunnel->tun_hlen + tunnel->encap_hlen;
+
t_hlen = tunnel->hlen + sizeof(struct ipv6hdr);
 
dev->needed_headroom= LL_MAX_HEADER + t_hlen + 4;
@@ -1293,15 +1295,57 @@ static void ip6gre_tap_setup(struct net_device *dev)
dev->priv_flags &= ~IFF_TX_SKB_SHARING;
 }
 
+static bool ip6gre_netlink_encap_parms(struct nlattr *data[],
+  struct ip_tunnel_encap *ipencap)
+{
+   bool ret = false;
+
+   memset(ipencap, 0, sizeof(*ipencap));
+
+   if (!data)
+   return ret;
+
+   if (data[IFLA_GRE_ENCAP_TYPE]) {
+   ret = true;
+   ipencap->type = nla_get_u16(data[IFLA_GRE_ENCAP_TYPE]);
+   }
+
+   if (data[IFLA_GRE_ENCAP_FLAGS]) {
+   ret = true;
+   ipencap->flags = nla_get_u16(data[IFLA_GRE_ENCAP_FLAGS]);
+   }
+
+   if (data[IFLA_GRE_ENCAP_SPORT]) {
+   ret = true;
+   ipencap->sport = nla_get_be16(data[IFLA_GRE_ENCAP_SPORT]);
+   }
+
+   if (data[IFLA_GRE_ENCAP_DPORT]) {
+   ret = true;
+   ipencap->dport = nla_get_be16(data[IFLA_GRE_ENCAP_DPORT]);
+   }
+
+   return ret;
+}
+
 static int ip6gre_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[])
 {
struct ip6_tnl *nt;
struct net *net = dev_net(dev);
struct ip6gre_net *ign = net_generic(net, ip6gre_net_id);
+   struct ip_tunnel_encap ipencap;
int err;
 
nt = netdev_priv(dev);
+
+   if (ip6gre_netlink_encap_parms(data, &ipencap)) {
+   int err = ip6_tnl_encap_setup(nt, &ipencap);
+
+   if (err < 0)
+   return err;
+   }
+
ip6gre_netlink_parms(data, &nt->parms);
 
if (ip6gre_tunnel_find(net, &nt->parms, dev->type))
@@ -1348,10 +1392,18 @@ static int ip6gre_changelink(struct net_device *dev, 
struct nlattr *tb[],
struct net *net = nt->net;
struct ip6gre_net *ign = net_generic(net, ip6gre_net_id);
struct __ip6_tnl_parm p;
+   struct ip_tunnel_encap ipencap;
 
if (dev == ign->fb_tunnel_dev)
return -EINVAL;
 
+   if (ip6gre_netlink_encap_parms(data, &ipencap)) {
+   int err = ip6_tnl_encap_setup(nt, &ipencap);
+
+   if (err < 0)
+   return err;
+   }
+
ip6gre_netlink_parms(data, &p);
 
t = ip6gre_tunnel_locate(net, &p, 0);
@@ -1405,6 +1457,14 @@ static size_t ip6gre_get_size(const struct net_device 
*dev)
nla_total_size(4) +
/* IFLA_GRE_FLAGS */
nla_total_size(4) +
+   /* IFLA_GRE_ENCAP_TYPE */
+   nla_total_size(2) +
+   /* IFLA_GRE_ENCAP_FLAGS */
+   nla_total_size(2) +
+   /* IFLA_GRE_ENCAP_SPORT */
+   nla_total_size(2) +
+   /* IFLA_GRE_ENCAP_DPORT */
+   nla_total_size(2) +
0;
 }
 
@@ -1428,6 +1488,17 @@ static int ip6gre_fill_info(struct sk_buff *skb, const 
struct net_device *dev)
nla_put_be32(skb, IFLA_GRE_FLOWINFO, p->flowinfo) ||
nla_put_u32(skb, IFLA_GRE_FLAGS, p->flags))
goto nla_put_failure;
+
+   if (nla_put_u16(skb, IFLA_GRE_ENCAP_TYPE,
+   t->encap.type) ||
+   nla_put_be16(skb, IFLA_GRE_ENCAP_SPORT,
+t->encap.sport) ||
+   nla_put_be16(skb, IFLA_GRE_ENCAP_DPORT,
+t->encap.dport) ||
+   nla_put_u16(skb, IFLA_GRE_ENCAP_FLAGS,
+   t->encap.flags))
+   goto nla_put_failure;
+
return 0;
 
 nla_put_failure:
@@ -1446,6 +1517,10 @@ static const struct nla_policy 
ip6gre_policy[IFLA_GRE_MAX + 1] = {
[IFLA_GRE_ENCAP_LIMIT] = { .type = NLA_U8 },
[IFLA_GRE_FLOWINFO]= { .type = NLA_U32 },
[IFLA_GRE_FLAGS]   = { .type = NLA_U32 },
+   [IFLA_GRE_ENCAP_TYPE]   = { .type = NLA_U16 },
+   [IFLA_GRE_ENCAP_FLAGS]  = { .type = NLA_U16 },
+   [IFLA_GRE_ENCAP_SPORT]  = { .type = NLA_U16 },
+   [IFLA_GRE_ENCAP_DPORT]  = { .type = NLA_U16 },
 };
 
 static struct rtnl_link_ops ip6gre_link_ops __read_mostly = {
-- 
2.8.0.rc2



[PATCH v3 net-next 08/11] ipv6: Change "final" protocol processing for encapsulation

2016-05-06 Thread Tom Herbert
When performing foo-over-UDP, UDP packets are received and processed by the
encapsulation handler, which returns another protocol to process.
This may result in processing two (or more) protocols in the
loop that are marked as INET6_PROTO_FINAL. The actions taken
for hitting a final protocol, in particular the skb_postpull_rcsum,
can only be performed once.

This patch adds a check for whether a final protocol has been seen. The
rules are:
  - If the final protocol has not been seen any protocol is processed
(final and non-final). In the case of a final protocol, the final
actions are taken (like the skb_postpull_rcsum)
  - If a final protocol has been seen (e.g. an encapsulating UDP
header) then no further non-final protocols are allowed
(e.g. extension headers). For more final protocols the
final actions are not taken (e.g. skb_postpull_rcsum).
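
The two rules can be sketched as a small state machine. This is a
userspace illustration under assumptions: the chain of protocols is
given as an array of "is final" booleans, and ex_walk() and the verdict
names are hypothetical, not the ip6_input code itself.

```c
#include <stdbool.h>
#include <stddef.h>

enum ex_verdict { EX_DELIVER, EX_DISCARD };

/*
 * Walk a chain of protocol handlers. 'is_final[i]' says whether handler
 * i is a final protocol. '*final_actions' counts how often the one-time
 * final actions (e.g. the checksum adjustment) would have been run.
 */
static enum ex_verdict ex_walk(const bool *is_final, size_t n,
			       unsigned int *final_actions)
{
	bool have_final = false;
	size_t i;

	*final_actions = 0;

	for (i = 0; i < n; i++) {
		if (have_final) {
			/* No non-final protocol after a final one. */
			if (!is_final[i])
				return EX_DISCARD;
		} else if (is_final[i]) {
			/* Only the first final protocol runs the actions. */
			have_final = true;
			(*final_actions)++;
		}
	}

	return EX_DELIVER;
}
```

For a chain such as extension header, UDP, TCP the actions run once, at
UDP; an extension header following UDP leads to a discard.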

Signed-off-by: Tom Herbert 
---
 net/ipv6/ip6_input.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 2a0258a..7d98d01 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -216,6 +216,7 @@ static int ip6_input_finish(struct net *net, struct sock 
*sk, struct sk_buff *sk
unsigned int nhoff;
int nexthdr;
bool raw;
+   bool have_final = false;
 
/*
 *  Parse extension headers
@@ -235,9 +236,21 @@ resubmit:
if (ipprot) {
int ret;
 
-   if (ipprot->flags & INET6_PROTO_FINAL) {
+   if (have_final) {
+   if (!(ipprot->flags & INET6_PROTO_FINAL)) {
+   /* Once we've seen a final protocol don't
+* allow encapsulation on any non-final
+* ones. This allows foo in UDP encapsulation
+* to work.
+*/
+   goto discard;
+   }
+   } else if (ipprot->flags & INET6_PROTO_FINAL) {
const struct ipv6hdr *hdr;
 
+   /* Only do this once for first final protocol */
+   have_final = true;
+
/* Free reference early: we don't need it any more,
   and it may hold ip_conntrack module loaded
   indefinitely. */
-- 
2.8.0.rc2



[PATCH v3 net-next 00/11] ipv6: Enable GUEoIPv6 and more fixes for v6 tunneling

2016-05-06 Thread Tom Herbert
This patch set:
  - Fixes GRE6 to translate flags correctly from configuration
  - Adds support for GSO and GRO for ip6ip6 and ip4ip6
  - Add support for FOU and GUE in IPv6
  - Support GRE, ip6ip6 and ip4ip6 over FOU/GUE
  - Fixes ip6_input to deal with UDP encapsulations
  - Some other minor fixes

v2:
  - Removed a check of GSO types in MPLS
  - Define GSO type SKB_GSO_IPXIP6 and SKB_GSO_IPXIP4 (based on input
from Alexander)
  - Don't define GSO types specifically for IP6IP6 and IP4IP6, above
fix makes that unnecessary
  - Don't bother clearing encapsulation flag in UDP tunnel segment
(another item suggested by Alexander).

v3:
  - Address some minor comments from Alexander

Tested:
   Tested a variety of cases, but not the full matrix (which is quite
   large now). Most of the obvious cases (e.g. GRE) work fine. There are
   probably still some issues with GSO/GRO being effective in all cases.

- IPv4/GRE/GUE/IPv6 with RCO
  1 TCP_STREAM
6616 Mbps
  200 TCP_RR
1244043 tps
141/243/446 90/95/99% latencies
86.61% CPU utilization
- IPv6/GRE/GUE/IPv6 with RCO
  1 TCP_STREAM
6940 Mbps
  200 TCP_RR
1270903 tps
138/236/440 90/95/99% latencies
87.51% CPU utilization

 - IP6IP6
  1 TCP_STREAM
2576 Mbps
  200 TCP_RR
498981 tps
388/498/631 90/95/99% latencies
19.75% CPU utilization (1 CPU saturated)

 - IP6IP6/GUE/IPv6 with RCO
  1 TCP_STREAM
1854 Mbps
  200 TCP_RR
1233818 tps
143/244/451 90/95/99% latencies
87.57% CPU utilization

 - IP4IP6
  1 TCP_STREAM
  200 TCP_RR
763774 tps
250/318/466 90/95/99% latencies
35.25% CPU utilization (1 CPU saturated)

 - GRE with keyid
  200 TCP_RR
744173 tps
258/332/461 90/95/99% latencies
34.59% CPU utilization (1 CPU saturated)
  

Tom Herbert (11):
  gso: Remove arbitrary checks for unsupported GSO
  net: define gso types for IPx over IPv4 and IPv6
  gre6: Fix flag translations
  fou: Call setup_udp_tunnel_sock
  fou: Split out {fou,gue}_build_header
  fou: Add encap ops for IPv6 tunnels
  ipv6: Fix nexthdr for reinjection
  ipv6: Change "final" protocol processing for encapsulation
  fou: Support IPv6 in fou
  ip6_tun: Add infrastructure for doing encapsulation
  ip6_gre: Add support for fou/gue encapsulation

 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |   5 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c |   4 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c   |   3 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   3 +-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c |   3 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c   |   3 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   3 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |   3 +-
 include/linux/netdev_features.h   |  12 +-
 include/linux/netdevice.h |   4 +-
 include/linux/skbuff.h|   4 +-
 include/net/fou.h |  10 +-
 include/net/ip6_tunnel.h  |  22 +++-
 net/core/ethtool.c|   4 +-
 net/ipv4/af_inet.c|  20 +--
 net/ipv4/fou.c| 144 +-
 net/ipv4/gre_offload.c|  14 ---
 net/ipv4/ipip.c   |   2 +-
 net/ipv4/tcp_offload.c|  19 ---
 net/ipv4/udp_offload.c|  10 --
 net/ipv6/Makefile |   4 +-
 net/ipv6/fou6.c   | 140 +
 net/ipv6/ip6_gre.c|  95 --
 net/ipv6/ip6_input.c  |  24 +++-
 net/ipv6/ip6_offload.c|  22 +---
 net/ipv6/ip6_tunnel.c |   4 +
 net/ipv6/ip6_tunnel_core.c| 108 
 net/ipv6/sit.c|   4 +-
 net/ipv6/udp_offload.c|  13 --
 net/mpls/mpls_gso.c   |   9 --
 net/netfilter/ipvs/ip_vs_xmit.c   |  17 ++-
 31 files changed, 511 insertions(+), 221 deletions(-)
 create mode 100644 net/ipv6/fou6.c
 create mode 100644 net/ipv6/ip6_tunnel_core.c

-- 
2.8.0.rc2



[PATCH v3 net-next 09/11] fou: Support IPv6 in fou

2016-05-06 Thread Tom Herbert
This patch adds receive path support for IPv6 with fou.

- Add address family to fou structure for open sockets. This supports
  AF_INET and AF_INET6. Lookups for fou ports are performed on both the
  port number and family.
- In the fou and gue receive paths, adjust tot_len in the IPv4 header or
  payload_len in the IPv6 header, based on address family.
- Allow AF_INET6 in FOU_ATTR_AF netlink attribute.
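
As a rough illustration of the lookup change described above, the following
userspace sketch matches on both port and family. The struct and function
names (fou_entry, fou_find) are hypothetical stand-ins, not the kernel's
definitions, and the kernel code walks a mutex-protected list rather than an
array.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical, simplified stand-in for the kernel's struct fou. */
struct fou_entry {
	uint16_t port;		/* UDP port the fou socket is bound to */
	uint8_t family;		/* AF_INET or AF_INET6 */
};

/*
 * Return the entry that matches both port and family, or NULL.
 * Matching on the port alone would make an IPv4 and an IPv6 listener
 * on the same port indistinguishable.
 */
static const struct fou_entry *fou_find(const struct fou_entry *tbl, size_t n,
					uint16_t port, uint8_t family)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].port == port && tbl[i].family == family)
			return &tbl[i];
	}
	return NULL;
}
```

With this keying, an AF_INET and an AF_INET6 entry can share one port
number, which is what the -EALREADY check in the diff below relies on.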

Signed-off-by: Tom Herbert 
---
 net/ipv4/fou.c | 47 +++
 1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 971c8c6..75db828 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -21,6 +21,7 @@ struct fou {
u8 protocol;
u8 flags;
__be16 port;
+   u8 family;
u16 type;
struct list_head list;
struct rcu_head rcu;
@@ -47,14 +48,17 @@ static inline struct fou *fou_from_sock(struct sock *sk)
return sk->sk_user_data;
 }
 
-static int fou_recv_pull(struct sk_buff *skb, size_t len)
+static int fou_recv_pull(struct sk_buff *skb, struct fou *fou, size_t len)
 {
-   struct iphdr *iph = ip_hdr(skb);
-
/* Remove 'len' bytes from the packet (UDP header and
 * FOU header if present).
 */
-   iph->tot_len = htons(ntohs(iph->tot_len) - len);
+   if (fou->family == AF_INET)
+   ip_hdr(skb)->tot_len = htons(ntohs(ip_hdr(skb)->tot_len) - len);
+   else
+   ipv6_hdr(skb)->payload_len =
+   htons(ntohs(ipv6_hdr(skb)->payload_len) - len);
+
__skb_pull(skb, len);
skb_postpull_rcsum(skb, udp_hdr(skb), len);
skb_reset_transport_header(skb);
@@ -68,7 +72,7 @@ static int fou_udp_recv(struct sock *sk, struct sk_buff *skb)
if (!fou)
return 1;
 
-   if (fou_recv_pull(skb, sizeof(struct udphdr)))
+   if (fou_recv_pull(skb, fou, sizeof(struct udphdr)))
goto drop;
 
return -fou->protocol;
@@ -141,7 +145,11 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb)
 
hdrlen = sizeof(struct guehdr) + optlen;
 
-   ip_hdr(skb)->tot_len = htons(ntohs(ip_hdr(skb)->tot_len) - len);
+   if (fou->family == AF_INET)
+   ip_hdr(skb)->tot_len = htons(ntohs(ip_hdr(skb)->tot_len) - len);
+   else
+   ipv6_hdr(skb)->payload_len =
+   htons(ntohs(ipv6_hdr(skb)->payload_len) - len);
 
/* Pull csum through the guehdr now . This can be used if
 * there is a remote checksum offload.
@@ -424,7 +432,8 @@ static int fou_add_to_port_list(struct net *net, struct fou *fou)
 
mutex_lock(&fn->fou_lock);
list_for_each_entry(fout, &fn->fou_list, list) {
-   if (fou->port == fout->port) {
+   if (fou->port == fout->port &&
+   fou->family == fout->family) {
mutex_unlock(&fn->fou_lock);
return -EALREADY;
}
@@ -469,8 +478,9 @@ static int fou_create(struct net *net, struct fou_cfg *cfg,
 
sk = sock->sk;
 
-   fou->flags = cfg->flags;
fou->port = cfg->udp_config.local_udp_port;
+   fou->family = cfg->udp_config.family;
+   fou->flags = cfg->flags;
fou->type = cfg->type;
fou->sock = sock;
 
@@ -522,12 +532,13 @@ static int fou_destroy(struct net *net, struct fou_cfg *cfg)
 {
struct fou_net *fn = net_generic(net, fou_net_id);
__be16 port = cfg->udp_config.local_udp_port;
+   u8 family = cfg->udp_config.family;
int err = -EINVAL;
struct fou *fou;
 
mutex_lock(&fn->fou_lock);
list_for_each_entry(fou, &fn->fou_list, list) {
-   if (fou->port == port) {
+   if (fou->port == port && fou->family == family) {
fou_release(fou);
err = 0;
break;
@@ -565,8 +576,15 @@ static int parse_nl_config(struct genl_info *info,
if (info->attrs[FOU_ATTR_AF]) {
u8 family = nla_get_u8(info->attrs[FOU_ATTR_AF]);
 
-   if (family != AF_INET)
-   return -EINVAL;
+   switch (family) {
+   case AF_INET:
+   break;
+   case AF_INET6:
+   cfg->udp_config.ipv6_v6only = 1;
+   break;
+   default:
+   return -EAFNOSUPPORT;
+   }
 
cfg->udp_config.family = family;
}
@@ -657,6 +675,7 @@ static int fou_nl_cmd_get_port(struct sk_buff *skb, struct genl_info *info)
struct fou_cfg cfg;
struct fou *fout;
__be16 port;
+   u8 family;
int ret;
 
ret = parse_nl_config(info, &cfg);
@@ -666,6 +685,10 @@ static int fou_nl_cmd_get_port(struct sk_buff *skb, struct genl_info *info)
if (port == 0)
return -EINVAL;
 
+   family = cfg.udp_config.family;
+   if (family != 

[PATCH v3 net-next 04/11] fou: Call setup_udp_tunnel_sock

2016-05-06 Thread Tom Herbert
Use helper function to set up UDP tunnel related information for a fou
socket.

Signed-off-by: Tom Herbert 
---
 net/ipv4/fou.c | 50 --
 1 file changed, 16 insertions(+), 34 deletions(-)

diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 7ac5ec8..a8b5cbf 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -446,31 +446,13 @@ static void fou_release(struct fou *fou)
kfree_rcu(fou, rcu);
 }
 
-static int fou_encap_init(struct sock *sk, struct fou *fou, struct fou_cfg *cfg)
-{
-   udp_sk(sk)->encap_rcv = fou_udp_recv;
-   udp_sk(sk)->gro_receive = fou_gro_receive;
-   udp_sk(sk)->gro_complete = fou_gro_complete;
-   fou_from_sock(sk)->protocol = cfg->protocol;
-
-   return 0;
-}
-
-static int gue_encap_init(struct sock *sk, struct fou *fou, struct fou_cfg *cfg)
-{
-   udp_sk(sk)->encap_rcv = gue_udp_recv;
-   udp_sk(sk)->gro_receive = gue_gro_receive;
-   udp_sk(sk)->gro_complete = gue_gro_complete;
-
-   return 0;
-}
-
 static int fou_create(struct net *net, struct fou_cfg *cfg,
  struct socket **sockp)
 {
struct socket *sock = NULL;
struct fou *fou = NULL;
struct sock *sk;
+   struct udp_tunnel_sock_cfg tunnel_cfg;
int err;
 
/* Open UDP socket */
@@ -489,33 +471,33 @@ static int fou_create(struct net *net, struct fou_cfg *cfg,
 
fou->flags = cfg->flags;
fou->port = cfg->udp_config.local_udp_port;
+   fou->type = cfg->type;
+   fou->sock = sock;
+
+   memset(&tunnel_cfg, 0, sizeof(tunnel_cfg));
+   tunnel_cfg.encap_type = 1;
+   tunnel_cfg.sk_user_data = fou;
+   tunnel_cfg.encap_destroy = NULL;
 
/* Initial for fou type */
switch (cfg->type) {
case FOU_ENCAP_DIRECT:
-   err = fou_encap_init(sk, fou, cfg);
-   if (err)
-   goto error;
+   tunnel_cfg.encap_rcv = fou_udp_recv;
+   tunnel_cfg.gro_receive = fou_gro_receive;
+   tunnel_cfg.gro_complete = fou_gro_complete;
+   fou->protocol = cfg->protocol;
break;
case FOU_ENCAP_GUE:
-   err = gue_encap_init(sk, fou, cfg);
-   if (err)
-   goto error;
+   tunnel_cfg.encap_rcv = gue_udp_recv;
+   tunnel_cfg.gro_receive = gue_gro_receive;
+   tunnel_cfg.gro_complete = gue_gro_complete;
break;
default:
err = -EINVAL;
goto error;
}
 
-   fou->type = cfg->type;
-
-   udp_sk(sk)->encap_type = 1;
-   udp_encap_enable();
-
-   sk->sk_user_data = fou;
-   fou->sock = sock;
-
-   inet_inc_convert_csum(sk);
+   setup_udp_tunnel_sock(net, sock, &tunnel_cfg);
 
sk->sk_allocation = GFP_ATOMIC;
 
-- 
2.8.0.rc2



[PATCH v3 net-next 10/11] ip6_tun: Add infrastructure for doing encapsulation

2016-05-06 Thread Tom Herbert
Add encap_hlen and an ip_tunnel_encap structure to ip6_tnl. Add functions
for getting the encap header length, setting up encap on a tunnel, and
performing the encapsulation operation.
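
The dispatch pattern used by these functions can be sketched in userspace
roughly as follows. All names here (encap_cfg, encap_ops, encap_table,
encap_hlen, the demo_* symbols) are hypothetical simplifications; the kernel
version reads the ops table under RCU, which this sketch omits.

```c
#include <errno.h>
#include <stddef.h>

#define ENCAP_NONE	0
#define MAX_ENCAP_OPS	8

/* Hypothetical, reduced stand-in for struct ip_tunnel_encap. */
struct encap_cfg {
	unsigned int type;
};

/* Per-encapsulation callbacks; only the header-length hook is shown. */
struct encap_ops {
	int (*encap_hlen)(const struct encap_cfg *e);
};

/* Table of registered encapsulations, indexed by type. */
static const struct encap_ops *encap_table[MAX_ENCAP_OPS];

/*
 * Return the extra header length for the configured encapsulation:
 * 0 when none is configured, -EINVAL for an out-of-range or
 * unregistered type.
 */
static int encap_hlen(const struct encap_cfg *e)
{
	const struct encap_ops *ops;

	if (e->type == ENCAP_NONE)
		return 0;

	if (e->type >= MAX_ENCAP_OPS)
		return -EINVAL;

	ops = encap_table[e->type];
	if (ops && ops->encap_hlen)
		return ops->encap_hlen(e);

	return -EINVAL;
}

/* Demo registration: an encapsulation that adds an 8-byte UDP header. */
static int demo_udp_hlen(const struct encap_cfg *e)
{
	(void)e;
	return 8;
}

static const struct encap_ops demo_udp_ops = {
	.encap_hlen = demo_udp_hlen,
};
```

Registering demo_udp_ops at index 1 and calling encap_hlen() with type 1
yields 8. A tunnel setup routine would add that value to its own header
length, as ip6_tnl_encap_setup() does with t->hlen in the diff below.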

Signed-off-by: Tom Herbert 
---
 include/net/ip6_tunnel.h   |  8 +-
 net/ipv6/ip6_tunnel.c  |  4 +++
 net/ipv6/ip6_tunnel_core.c | 64 ++
 3 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 1c14c27..1b8db86 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -66,10 +66,16 @@ struct ip6_tnl {
__u32 o_seqno;  /* The last output seqno */
int hlen;   /* tun_hlen + encap_hlen */
int tun_hlen;   /* Precalculated header length */
+   int encap_hlen; /* Encap header length (FOU,GUE) */
+   struct ip_tunnel_encap encap;
int mlink;
-
 };
 
+int ip6_tnl_encap_setup(struct ip6_tnl *t,
+   struct ip_tunnel_encap *ipencap);
+int ip6_tnl_encap(struct sk_buff *skb, struct ip6_tnl *t,
+ u8 *protocol, struct flowi6 *fl6);
+
 /* Tunnel encapsulation limit destination sub-option */
 
 struct ipv6_tlv_tnl_enc_lim {
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index ade55af..2c096ab 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1013,6 +1013,10 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
unsigned int max_headroom = sizeof(struct ipv6hdr);
int err = -1;
 
+   err = ip6_tnl_encap(skb, t, &proto, fl6);
+   if (err)
+   return err;
+
/* NBMA tunnel */
if (ipv6_addr_any(&t->parms.raddr)) {
struct in6_addr *addr6;
diff --git a/net/ipv6/ip6_tunnel_core.c b/net/ipv6/ip6_tunnel_core.c
index 5f5b79e..94aa414 100644
--- a/net/ipv6/ip6_tunnel_core.c
+++ b/net/ipv6/ip6_tunnel_core.c
@@ -42,3 +42,67 @@ int ip6_tnl_encap_del_ops(const struct ip6_tnl_encap_ops *ops,
 }
 EXPORT_SYMBOL(ip6_tnl_encap_del_ops);
 
+static int ip6_encap_hlen(struct ip_tunnel_encap *e)
+{
+   const struct ip6_tnl_encap_ops *ops;
+   int hlen = -EINVAL;
+
+   if (e->type == TUNNEL_ENCAP_NONE)
+   return 0;
+
+   if (e->type >= MAX_IPTUN_ENCAP_OPS)
+   return -EINVAL;
+
+   rcu_read_lock();
+   ops = rcu_dereference(ip6tun_encaps[e->type]);
+   if (likely(ops && ops->encap_hlen))
+   hlen = ops->encap_hlen(e);
+   rcu_read_unlock();
+
+   return hlen;
+}
+
+int ip6_tnl_encap_setup(struct ip6_tnl *t,
+   struct ip_tunnel_encap *ipencap)
+{
+   int hlen;
+
+   memset(&t->encap, 0, sizeof(t->encap));
+
+   hlen = ip6_encap_hlen(ipencap);
+   if (hlen < 0)
+   return hlen;
+
+   t->encap.type = ipencap->type;
+   t->encap.sport = ipencap->sport;
+   t->encap.dport = ipencap->dport;
+   t->encap.flags = ipencap->flags;
+
+   t->encap_hlen = hlen;
+   t->hlen = t->encap_hlen + t->tun_hlen;
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(ip6_tnl_encap_setup);
+
+int ip6_tnl_encap(struct sk_buff *skb, struct ip6_tnl *t,
+ u8 *protocol, struct flowi6 *fl6)
+{
+   const struct ip6_tnl_encap_ops *ops;
+   int ret = -EINVAL;
+
+   if (t->encap.type == TUNNEL_ENCAP_NONE)
+   return 0;
+
+   if (t->encap.type >= MAX_IPTUN_ENCAP_OPS)
+   return -EINVAL;
+
+   rcu_read_lock();
+   ops = rcu_dereference(ip6tun_encaps[t->encap.type]);
+   if (likely(ops && ops->build_header))
+   ret = ops->build_header(skb, &t->encap, protocol, fl6);
+   rcu_read_unlock();
+
+   return ret;
+}
+EXPORT_SYMBOL(ip6_tnl_encap);
-- 
2.8.0.rc2



[PATCH v3 net-next 07/11] ipv6: Fix nexthdr for reinjection

2016-05-06 Thread Tom Herbert
In ip6_input_finish, when the protocol handler returns a value greater
than zero, the packet needs to be resubmitted using the returned protocol.
Currently the returned protocol is ignored, and each time through resubmit
nexthdr is taken from an offset in the packet. This patch fixes that
so that nexthdr is taken from the return value of the protocol handler.
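
A minimal userspace sketch of the intended control flow is shown below. The
handler table, the deliver() function and the demo_* handlers are
hypothetical and only model the resubmit logic, not ip6_input_finish itself.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PROTOS 256

/*
 * Hypothetical handler signature: return > 0 to ask for resubmission
 * as that protocol number, 0 when delivered, < 0 when dropped.
 */
typedef int (*proto_handler_t)(void *pkt);

static proto_handler_t handlers[NUM_PROTOS];

static int deliver(void *pkt, uint8_t nexthdr)
{
	for (;;) {
		proto_handler_t handler = handlers[nexthdr];
		int ret;

		if (!handler)
			return -1;	/* no handler registered: drop */

		ret = handler(pkt);
		if (ret <= 0)
			return ret;	/* delivered (0) or dropped (< 0) */

		/*
		 * Resubmit using the protocol the handler returned,
		 * rather than re-reading a value from the packet.
		 */
		nexthdr = (uint8_t)ret;
	}
}

/* Demo handlers, for illustration only. */
static int demo_inner_calls;

static int demo_encap_handler(void *pkt)
{
	(void)pkt;
	return 47;	/* pretend the decapsulated payload is GRE */
}

static int demo_inner_handler(void *pkt)
{
	(void)pkt;
	demo_inner_calls++;
	return 0;
}
```

If demo_encap_handler is registered for protocol 17 and demo_inner_handler
for 47, deliver(pkt, 17) ends up in the inner handler. Re-reading nexthdr
from the packet on every pass would instead look up protocol 17 again.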

Signed-off-by: Tom Herbert 
---
 net/ipv6/ip6_input.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 6ed5601..2a0258a 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -222,13 +222,14 @@ static int ip6_input_finish(struct net *net, struct sock *sk, struct sk_buff *sk
 */
 
rcu_read_lock();
-resubmit:
+
idev = ip6_dst_idev(skb_dst(skb));
if (!pskb_pull(skb, skb_transport_offset(skb)))
goto discard;
nhoff = IP6CB(skb)->nhoff;
nexthdr = skb_network_header(skb)[nhoff];
 
+resubmit:
raw = raw6_local_deliver(skb, nexthdr);
ipprot = rcu_dereference(inet6_protos[nexthdr]);
if (ipprot) {
@@ -256,10 +257,12 @@ resubmit:
goto discard;
 
ret = ipprot->handler(skb);
-   if (ret > 0)
+   if (ret > 0) {
+   nexthdr = ret;
goto resubmit;
-   else if (ret == 0)
+   } else if (ret == 0) {
__IP6_INC_STATS(net, idev, IPSTATS_MIB_INDELIVERS);
+   }
} else {
if (!raw) {
if (xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) {
-- 
2.8.0.rc2



[PATCH net-next 08/21] net: dsa: mv88e6xxx: factorize registers access

2016-05-06 Thread Vivien Didelot
Only the 6131 driver did not yet support port register access. Assume such
support and use the unlocked access routines in the meantime.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6131.c | 2 ++
 drivers/net/dsa/mv88e6xxx.c | 6 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 432d3c4..3fb06af 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -157,6 +157,8 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.get_sset_count = mv88e6xxx_get_sset_count,
.get_eeprom = mv88e6xxx_get_eeprom,
.set_eeprom = mv88e6xxx_set_eeprom,
+   .get_regs_len   = mv88e6xxx_get_regs_len,
+   .get_regs   = mv88e6xxx_get_regs,
 #ifdef CONFIG_NET_DSA_HWMON
.get_temp   = mv88e6xxx_get_temp,
.get_temp_limit = mv88e6xxx_get_temp_limit,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index f76cdc9..7a4ab4b 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -787,13 +787,17 @@ void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
 
memset(p, 0xff, 32 * sizeof(u16));
 
+   mutex_lock(&ps->smi_mutex);
+
for (i = 0; i < 32; i++) {
int ret;
 
-   ret = mv88e6xxx_reg_read(ps, REG_PORT(port), i);
+   ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), i);
if (ret >= 0)
p[i] = ret;
}
+
+   mutex_unlock(&ps->smi_mutex);
 }
 
 static int _mv88e6xxx_wait(struct mv88e6xxx_priv_state *ps, int reg, int offset,
-- 
2.8.2



[PATCH net-next 00/21] net: dsa: mv88e6xxx: turn into monolithic driver

2016-05-06 Thread Vivien Didelot
This patchset merges all mv88e6* drivers supported by the shared
mv88e6xxx code into a single mv88e6xxx DSA switch driver.

Some flags are added to describe the capabilities of a switch model,
such as the presence of a PPU, EEPROM, some old or new registers, etc.

First these flags are used to conditionally support the same set of
functions in every driver, then specific driver files are removed in
favor of the common mv88e6xxx driver.

Only the merge of the driver-specific setup code drops a few differences
between models. If these differences, such as frame priorities, are really
needed for some models, they can easily be brought back in a future patch.

Some inconsistencies might show up, such as the need for
MV88E6XXX_FLAG_PPU and MV88E6XXX_FLAG_PPU_ACTIVE flags. But this
patchset does not aim to fix them yet. A future patch can do that if
they end up being unwanted.

The patchset has been tested on interconnected 88E6352 and 88E6185.

Changes RFC -> v1:
  - introduce flags in a separate patch
  - do not refactor anything yet
  - do not add new functions prefixed with _
  - drop packet discarding and mentioned tested platforms
  - factorize family flags
  - update text for NET_DSA_MV88E6XXX Kconfig entry

Vivien Didelot (21):
  net: dsa: mv88e6xxx: add flags to info
  net: dsa: mv88e6xxx: factorize PHY access with PPU
  net: dsa: mv88e6xxx: factorize PHY indirect access
  net: dsa: mv88e6xxx: factorize EEPROM access
  net: dsa: mv88e6xxx: factorize temperature access
  net: dsa: mv88e6xxx: factorize MAC address setting
  net: dsa: mv88e6xxx: factorize EEE access
  net: dsa: mv88e6xxx: factorize registers access
  net: dsa: mv88e6xxx: factorize bridge support
  net: dsa: mv88e6xxx: factorize VTU access
  net: dsa: mv88e6xxx: factorize ATU access
  net: dsa: mv88e6xxx: factorize switch reset
  net: dsa: mv88e6xxx: factorize global setup
  net: dsa: mv88e6xxx: factorize GLOBAL_CONTROL setup
  net: dsa: mv88e6xxx: factorize VLAN Ethertype
  net: dsa: mv88e6xxx: factorize GLOBAL_MONITOR_CONTROL setup
  net: dsa: mv88e6xxx: factorize GLOBAL_CONTROL_2 setup
  net: dsa: mv88e6xxx: factorize frames priorities
  net: dsa: mv88e6xxx: factorize switch setup
  net: dsa: mv88e6xxx: factorize tag protocol
  net: dsa: mv88e6xxx: factorize the switch driver

 arch/arm/configs/multi_v5_defconfig |   5 +-
 arch/arm/configs/mvebu_v7_defconfig |   2 +-
 arch/arm/configs/orion5x_defconfig  |   3 +-
 arch/tile/configs/tilegx_defconfig  |   3 +-
 arch/tile/configs/tilepro_defconfig |   3 +-
 drivers/net/dsa/Kconfig |  45 +-
 drivers/net/dsa/Makefile|  15 +-
 drivers/net/dsa/mv88e6123.c | 130 -
 drivers/net/dsa/mv88e6131.c | 204 
 drivers/net/dsa/mv88e6171.c | 151 --
 drivers/net/dsa/mv88e6352.c | 377 --
 drivers/net/dsa/mv88e6xxx.c | 984 +++-
 drivers/net/dsa/mv88e6xxx.h | 249 ++---
 13 files changed, 908 insertions(+), 1263 deletions(-)
 delete mode 100644 drivers/net/dsa/mv88e6123.c
 delete mode 100644 drivers/net/dsa/mv88e6131.c
 delete mode 100644 drivers/net/dsa/mv88e6171.c
 delete mode 100644 drivers/net/dsa/mv88e6352.c

-- 
2.8.2



[PATCH net-next 07/21] net: dsa: mv88e6xxx: factorize EEE access

2016-05-06 Thread Vivien Didelot
Add a MV88E6XXX_FLAG_EEE flag to describe switch models featuring Energy
Efficient Ethernet. Use it to conditionally support such access in the
common code.
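
The capability-flag pattern used throughout this series can be sketched in
isolation as below. The names (CAP_*, FLAG_*, switch_info, switch_has) are
hypothetical, and the all-bits-set semantics of the check are an assumption
about how mv88e6xxx_has() behaves, not a copy of it.

```c
#include <stdbool.h>

#define BIT(n)	(1UL << (n))

/* One enumerator per capability; the bit position is the enum value. */
enum cap {
	CAP_EEE,
	CAP_EEPROM,
	CAP_PPU,
};

/* Bitmask of capabilities */
#define FLAG_EEE	BIT(CAP_EEE)
#define FLAG_EEPROM	BIT(CAP_EEPROM)
#define FLAG_PPU	BIT(CAP_PPU)

/* Hypothetical, reduced stand-in for struct mv88e6xxx_info. */
struct switch_info {
	const char *name;
	unsigned long flags;
};

/* True only if every requested flag is set for this switch model. */
static bool switch_has(const struct switch_info *info, unsigned long flags)
{
	return (info->flags & flags) == flags;
}
```

A common entry point can then bail out early, for example with -EOPNOTSUPP,
when switch_has(info, FLAG_EEE) is false, so no per-model driver file is
needed.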

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  2 ++
 drivers/net/dsa/mv88e6131.c |  2 ++
 drivers/net/dsa/mv88e6171.c |  2 ++
 drivers/net/dsa/mv88e6xxx.c |  6 ++
 drivers/net/dsa/mv88e6xxx.h | 11 +--
 5 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index c349fb7..1ad7bcd 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -117,6 +117,8 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
+   .set_eee= mv88e6xxx_set_eee,
+   .get_eee= mv88e6xxx_get_eee,
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 1e040c6..432d3c4 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -150,6 +150,8 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
+   .set_eee= mv88e6xxx_set_eee,
+   .get_eee= mv88e6xxx_get_eee,
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index f9b20e0..a98e7d3 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -126,6 +126,8 @@ struct dsa_switch_driver mv88e6171_switch_driver = {
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
+   .set_eee= mv88e6xxx_set_eee,
+   .get_eee= mv88e6xxx_get_eee,
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 10cc277..f76cdc9 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -1097,6 +1097,9 @@ int mv88e6xxx_get_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e)
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int reg;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_EEE))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
reg = _mv88e6xxx_phy_read_indirect(ps, port, 16);
@@ -1125,6 +1128,9 @@ int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
int reg;
int ret;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_EEE))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
ret = _mv88e6xxx_phy_read_indirect(ps, port, 16);
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 517e95f..b99e090 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -351,6 +351,10 @@ enum mv88e6xxx_family {
 };
 
 enum mv88e6xxx_cap {
+   /* Energy Efficient Ethernet.
+*/
+   MV88E6XXX_CAP_EEE,
+
/* EEPROM Command and Data registers.
 * See GLOBAL2_EEPROM_OP and GLOBAL2_EEPROM_DATA.
 */
@@ -382,6 +386,7 @@ enum mv88e6xxx_cap {
 };
 
 /* Bitmask of capabilities */
+#define MV88E6XXX_FLAG_EEE BIT(MV88E6XXX_CAP_EEE)
 #define MV88E6XXX_FLAG_EEPROM  BIT(MV88E6XXX_CAP_EEPROM)
 #define MV88E6XXX_FLAG_PPU BIT(MV88E6XXX_CAP_PPU)
 #define MV88E6XXX_FLAG_SMI_PHY BIT(MV88E6XXX_CAP_SMI_PHY)
@@ -403,7 +408,8 @@ enum mv88e6xxx_cap {
MV88E6XXX_FLAG_PPU
 
 #define MV88E6XXX_FLAGS_FAMILY_6320\
-   (MV88E6XXX_FLAG_EEPROM |\
+   (MV88E6XXX_FLAG_EEE |   \
+MV88E6XXX_FLAG_EEPROM |\
 MV88E6XXX_FLAG_SMI_PHY |   \
 MV88E6XXX_FLAG_SWITCH_MAC |\
 MV88E6XXX_FLAG_TEMP |  \
@@ -415,7 +421,8 @@ enum mv88e6xxx_cap {
 MV88E6XXX_FLAG_TEMP)
 
 #define MV88E6XXX_FLAGS_FAMILY_6352\
-   (MV88E6XXX_FLAG_EEPROM |\
+   (MV88E6XXX_FLAG_EEE |   \
+MV88E6XXX_FLAG_EEPROM |\
 MV88E6XXX_FLAG_SMI_PHY |   \
 MV88E6XXX_FLAG_SWITCH_MAC |\
 MV88E6XXX_FLAG_TEMP |  \
-- 
2.8.2



[PATCH net-next 01/21] net: dsa: mv88e6xxx: add flags to info

2016-05-06 Thread Vivien Didelot
Add a flags bitmap to the info structure in order to identify features
supported or not by the different switch models.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  3 +++
 drivers/net/dsa/mv88e6131.c |  4 
 drivers/net/dsa/mv88e6171.c |  4 
 drivers/net/dsa/mv88e6352.c |  6 ++
 drivers/net/dsa/mv88e6xxx.h | 21 +
 5 files changed, 38 insertions(+)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 5535a42..ab5885b 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -24,18 +24,21 @@ static const struct mv88e6xxx_info mv88e6123_table[] = {
.name = "Marvell 88E6123",
.num_databases = 4096,
.num_ports = 3,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6165,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6161,
.family = MV88E6XXX_FAMILY_6165,
.name = "Marvell 88E6161",
.num_databases = 4096,
.num_ports = 6,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6165,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6165,
.family = MV88E6XXX_FAMILY_6165,
.name = "Marvell 88E6165",
.num_databases = 4096,
.num_ports = 6,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6165,
}
 };
 
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 357ab79..d477320 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -24,24 +24,28 @@ static const struct mv88e6xxx_info mv88e6131_table[] = {
.name = "Marvell 88E6095/88E6095F",
.num_databases = 256,
.num_ports = 11,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6095,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6085,
.family = MV88E6XXX_FAMILY_6097,
.name = "Marvell 88E6085",
.num_databases = 4096,
.num_ports = 10,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6097,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6131,
.family = MV88E6XXX_FAMILY_6185,
.name = "Marvell 88E6131",
.num_databases = 256,
.num_ports = 8,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6185,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6185,
.family = MV88E6XXX_FAMILY_6185,
.name = "Marvell 88E6185",
.num_databases = 256,
.num_ports = 10,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6185,
}
 };
 
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index f75164d..e64cbee 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -24,24 +24,28 @@ static const struct mv88e6xxx_info mv88e6171_table[] = {
.name = "Marvell 88E6171",
.num_databases = 4096,
.num_ports = 7,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6351,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6175,
.family = MV88E6XXX_FAMILY_6351,
.name = "Marvell 88E6175",
.num_databases = 4096,
.num_ports = 7,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6351,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6350,
.family = MV88E6XXX_FAMILY_6351,
.name = "Marvell 88E6350",
.num_databases = 4096,
.num_ports = 7,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6351,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6351,
.family = MV88E6XXX_FAMILY_6351,
.name = "Marvell 88E6351",
.num_databases = 4096,
.num_ports = 7,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6351,
}
 };
 
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index c622a1d..c61f0f4 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -29,36 +29,42 @@ static const struct mv88e6xxx_info mv88e6352_table[] = {
.name = "Marvell 88E6320",
.num_databases = 4096,
.num_ports = 7,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6320,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6321,
.family = MV88E6XXX_FAMILY_6320,
.name = "Marvell 88E6321",
.num_databases = 4096,
.num_ports = 7,
+   .flags = MV88E6XXX_FLAGS_FAMILY_6320,
}, {
.prod_num = PORT_SWITCH_ID_PROD_NUM_6172,
.family = MV88E6XXX_FAMILY_6352,
.name = "Marvell 88E6172",
.num_databases = 4096,
.num_ports = 7,
+   

[PATCH net-next 04/21] net: dsa: mv88e6xxx: factorize EEPROM access

2016-05-06 Thread Vivien Didelot
Add a MV88E6XXX_FLAG_EEPROM flag to describe switch models featuring an
EEPROM and distribute the EEPROM access routines to all models.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |   2 +
 drivers/net/dsa/mv88e6131.c |   2 +
 drivers/net/dsa/mv88e6171.c |   2 +
 drivers/net/dsa/mv88e6352.c | 207 +--
 drivers/net/dsa/mv88e6xxx.c | 208 +++-
 drivers/net/dsa/mv88e6xxx.h |  18 +++-
 6 files changed, 228 insertions(+), 211 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index ab5885b..8330a8e 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -124,6 +124,8 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
 #ifdef CONFIG_NET_DSA_HWMON
.get_temp   = mv88e6xxx_get_temp,
 #endif
+   .get_eeprom = mv88e6xxx_get_eeprom,
+   .set_eeprom = mv88e6xxx_set_eeprom,
.get_regs_len   = mv88e6xxx_get_regs_len,
.get_regs   = mv88e6xxx_get_regs,
 };
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 9d21d69..ab8c507 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -153,6 +153,8 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
+   .get_eeprom = mv88e6xxx_get_eeprom,
+   .set_eeprom = mv88e6xxx_set_eeprom,
.adjust_link= mv88e6xxx_adjust_link,
.port_bridge_join   = mv88e6xxx_port_bridge_join,
.port_bridge_leave  = mv88e6xxx_port_bridge_leave,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index b190647..a7afbaa 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -133,6 +133,8 @@ struct dsa_switch_driver mv88e6171_switch_driver = {
 #ifdef CONFIG_NET_DSA_HWMON
.get_temp   = mv88e6xxx_get_temp,
 #endif
+   .get_eeprom = mv88e6xxx_get_eeprom,
+   .set_eeprom = mv88e6xxx_set_eeprom,
.get_regs_len   = mv88e6xxx_get_regs_len,
.get_regs   = mv88e6xxx_get_regs,
.port_bridge_join   = mv88e6xxx_port_bridge_join,
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 6fa7c02..3bb271e 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -125,8 +125,6 @@ static int mv88e6352_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   mutex_init(&ps->eeprom_mutex);
-
ret = mv88e6xxx_switch_reset(ps, true);
if (ret < 0)
return ret;
@@ -138,207 +136,6 @@ static int mv88e6352_setup(struct dsa_switch *ds)
return mv88e6xxx_setup_ports(ds);
 }
 
-static int mv88e6352_read_eeprom_word(struct dsa_switch *ds, int addr)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int ret;
-
-   mutex_lock(&ps->eeprom_mutex);
-
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
- GLOBAL2_EEPROM_OP_READ |
- (addr & GLOBAL2_EEPROM_OP_ADDR_MASK));
-   if (ret < 0)
-   goto error;
-
-   ret = mv88e6xxx_eeprom_busy_wait(ds);
-   if (ret < 0)
-   goto error;
-
-   ret = mv88e6xxx_reg_read(ps, REG_GLOBAL2, GLOBAL2_EEPROM_DATA);
-error:
-   mutex_unlock(&ps->eeprom_mutex);
-   return ret;
-}
-
-static int mv88e6352_get_eeprom(struct dsa_switch *ds,
-   struct ethtool_eeprom *eeprom, u8 *data)
-{
-   int offset;
-   int len;
-   int ret;
-
-   offset = eeprom->offset;
-   len = eeprom->len;
-   eeprom->len = 0;
-
-   eeprom->magic = 0xc3ec4951;
-
-   ret = mv88e6xxx_eeprom_load_wait(ds);
-   if (ret < 0)
-   return ret;
-
-   if (offset & 1) {
-   int word;
-
-   word = mv88e6352_read_eeprom_word(ds, offset >> 1);
-   if (word < 0)
-   return word;
-
-   *data++ = (word >> 8) & 0xff;
-
-   offset++;
-   len--;
-   eeprom->len++;
-   }
-
-   while (len >= 2) {
-   int word;
-
-   word = mv88e6352_read_eeprom_word(ds, offset >> 1);
-   if (word < 0)
-   return word;
-
-   *data++ = word & 0xff;
-   *data++ = (word >> 8) & 0xff;
-
-   offset += 2;
-   len -= 2;
-   eeprom->len += 2;
-   }
-
-   if (len) {
-   int word;
-
-   word = mv88e6352_read_eeprom_word(ds, offset >> 1);
-   if (word < 0)
-   return word;
-

[PATCH net-next 09/21] net: dsa: mv88e6xxx: factorize bridge support

2016-05-06 Thread Vivien Didelot
Add MV88E6XXX_FLAG_PORTSTATE and MV88E6XXX_FLAG_VLANTABLE flags to
identify switch models with required 802.1D operations.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  3 +++
 drivers/net/dsa/mv88e6131.c |  1 +
 drivers/net/dsa/mv88e6xxx.c |  9 +
 drivers/net/dsa/mv88e6xxx.h | 38 ++
 4 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 1ad7bcd..81cd4a9 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -133,6 +133,9 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
.set_eeprom = mv88e6xxx_set_eeprom,
.get_regs_len   = mv88e6xxx_get_regs_len,
.get_regs   = mv88e6xxx_get_regs,
+   .port_bridge_join   = mv88e6xxx_port_bridge_join,
+   .port_bridge_leave  = mv88e6xxx_port_bridge_leave,
+   .port_stp_state_set = mv88e6xxx_port_stp_state_set,
 };
 
 MODULE_ALIAS("platform:mv88e6123");
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 3fb06af..5d25244 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -168,6 +168,7 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.adjust_link= mv88e6xxx_adjust_link,
.port_bridge_join   = mv88e6xxx_port_bridge_join,
.port_bridge_leave  = mv88e6xxx_port_bridge_leave,
+   .port_stp_state_set = mv88e6xxx_port_stp_state_set,
.port_vlan_filtering= mv88e6xxx_port_vlan_filtering,
.port_vlan_prepare  = mv88e6xxx_port_vlan_prepare,
.port_vlan_add  = mv88e6xxx_port_vlan_add,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 7a4ab4b..492cb24 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -1361,6 +1361,9 @@ void mv88e6xxx_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int stp_state;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_PORTSTATE))
+   return;
+
switch (state) {
case BR_STATE_DISABLED:
stp_state = PORT_CONTROL_STATE_DISABLED;
@@ -2422,6 +2425,9 @@ int mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port,
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int i, err = 0;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VLANTABLE))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
/* Assign the bridge and remap each port's VLANTable */
@@ -2446,6 +2452,9 @@ void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port)
struct net_device *bridge = ps->ports[port].bridge_dev;
int i;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VLANTABLE))
+   return;
+
mutex_lock(&ps->smi_mutex);
 
/* Unassign the bridge and remap each port's VLANTable */
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index b99e090..d15e0b3 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -360,6 +360,11 @@ enum mv88e6xxx_cap {
 */
MV88E6XXX_CAP_EEPROM,
 
+   /* Port State Filtering for 802.1D Spanning Tree.
+* See PORT_CONTROL_STATE_* values in the PORT_CONTROL register.
+*/
+   MV88E6XXX_CAP_PORTSTATE,
+
/* PHY Polling Unit.
 * See GLOBAL_CONTROL_PPU_ENABLE and GLOBAL_STATUS_PPU_POLLING.
 */
@@ -383,50 +388,67 @@ enum mv88e6xxx_cap {
 */
MV88E6XXX_CAP_TEMP,
MV88E6XXX_CAP_TEMP_LIMIT,
+
+   /* In-chip Port Based VLANs.
+* Each port VLANTable register (see PORT_BASE_VLAN) is used to restrict
+* the output (or egress) ports to which it is allowed to send frames.
+*/
+   MV88E6XXX_CAP_VLANTABLE,
 };
 
 /* Bitmask of capabilities */
 #define MV88E6XXX_FLAG_EEE BIT(MV88E6XXX_CAP_EEE)
 #define MV88E6XXX_FLAG_EEPROM  BIT(MV88E6XXX_CAP_EEPROM)
+#define MV88E6XXX_FLAG_PORTSTATE   BIT(MV88E6XXX_CAP_PORTSTATE)
 #define MV88E6XXX_FLAG_PPU BIT(MV88E6XXX_CAP_PPU)
 #define MV88E6XXX_FLAG_SMI_PHY BIT(MV88E6XXX_CAP_SMI_PHY)
 #define MV88E6XXX_FLAG_SWITCH_MAC  BIT(MV88E6XXX_CAP_SWITCH_MAC_WOL_WOF)
 #define MV88E6XXX_FLAG_TEMP        BIT(MV88E6XXX_CAP_TEMP)
 #define MV88E6XXX_FLAG_TEMP_LIMIT  BIT(MV88E6XXX_CAP_TEMP_LIMIT)
+#define MV88E6XXX_FLAG_VLANTABLE   BIT(MV88E6XXX_CAP_VLANTABLE)
 
 #define MV88E6XXX_FLAGS_FAMILY_6095\
-   MV88E6XXX_FLAG_PPU
+   (MV88E6XXX_FLAG_PPU |   \
+MV88E6XXX_FLAG_VLANTABLE)
 
 #define MV88E6XXX_FLAGS_FAMILY_6097\
-   MV88E6XXX_FLAG_PPU
+   (MV88E6XXX_FLAG_PPU |   \
+MV88E6XXX_FLAG_VLANTABLE)
 
 #define MV88E6XXX_FLAGS_FAMILY_6165\
(MV88E6XXX_FLAG_SWITCH_MAC |\
 MV88E6XXX_FLAG_TEMP)
 
 

[PATCH net-next 03/21] net: dsa: mv88e6xxx: factorize PHY indirect access

2016-05-06 Thread Vivien Didelot
Some switches have dedicated SMI PHY Command and Data registers, used to
access the PHYs indirectly instead of directly.

Identify these switch models and make mv88e6xxx_phy_{read,write} generic
enough to support every model.
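
For readers following along, the selection order that the patched
mv88e6xxx_phy_{read,write} use can be sketched outside the kernel as
below. This is only an illustration: struct chip, chip_has(),
phy_select_path() and the FLAG_* values are hypothetical stand-ins, not
the driver's definitions.

```c
#include <stdint.h>

/* Hypothetical capability bits, modelled on the MV88E6XXX_FLAG_* names. */
enum { CAP_PPU, CAP_SMI_PHY };
#define FLAG_PPU     (1u << CAP_PPU)
#define FLAG_SMI_PHY (1u << CAP_SMI_PHY)

enum phy_path { PHY_PATH_PPU, PHY_PATH_INDIRECT, PHY_PATH_DIRECT };

struct chip {
	uint32_t flags;	/* bitmask of capabilities for this model */
};

/* True when every requested capability bit is set (assumed semantics). */
static int chip_has(const struct chip *c, uint32_t flags)
{
	return (c->flags & flags) == flags;
}

/* Same order as in the diff: PPU first, then SMI PHY, then direct. */
static enum phy_path phy_select_path(const struct chip *c)
{
	if (chip_has(c, FLAG_PPU))
		return PHY_PATH_PPU;
	else if (chip_has(c, FLAG_SMI_PHY))
		return PHY_PATH_INDIRECT;
	return PHY_PATH_DIRECT;
}
```

A model with neither flag falls through to the direct access path, which
matches the final else branch in the diff.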

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6171.c |  4 ++--
 drivers/net/dsa/mv88e6352.c |  4 ++--
 drivers/net/dsa/mv88e6xxx.c | 37 -
 drivers/net/dsa/mv88e6xxx.h | 22 +-
 4 files changed, 21 insertions(+), 46 deletions(-)

diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index e64cbee..b190647 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -124,8 +124,8 @@ struct dsa_switch_driver mv88e6171_switch_driver = {
.probe  = mv88e6171_drv_probe,
.setup  = mv88e6171_setup,
.set_addr   = mv88e6xxx_set_addr_indirect,
-   .phy_read   = mv88e6xxx_phy_read_indirect,
-   .phy_write  = mv88e6xxx_phy_write_indirect,
+   .phy_read   = mv88e6xxx_phy_read,
+   .phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index c61f0f4..6fa7c02 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -344,8 +344,8 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
.probe  = mv88e6352_drv_probe,
.setup  = mv88e6352_setup,
.set_addr   = mv88e6xxx_set_addr_indirect,
-   .phy_read   = mv88e6xxx_phy_read_indirect,
-   .phy_write  = mv88e6xxx_phy_write_indirect,
+   .phy_read   = mv88e6xxx_phy_read,
+   .phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index a28b46c..2c8c5e1 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2887,6 +2887,8 @@ mv88e6xxx_phy_read(struct dsa_switch *ds, int port, int 
regnum)
 
if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_PPU))
ret = mv88e6xxx_phy_read_ppu(ps, addr, regnum);
+   else if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_SMI_PHY))
+   ret = _mv88e6xxx_phy_read_indirect(ps, addr, regnum);
else
ret = _mv88e6xxx_phy_read(ps, addr, regnum);
 
@@ -2908,6 +2910,8 @@ mv88e6xxx_phy_write(struct dsa_switch *ds, int port, int 
regnum, u16 val)
 
if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_PPU))
ret = mv88e6xxx_phy_write_ppu(ps, addr, regnum, val);
+   else if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_SMI_PHY))
+   ret = _mv88e6xxx_phy_write_indirect(ps, addr, regnum, val);
else
ret = _mv88e6xxx_phy_write(ps, addr, regnum, val);
 
@@ -2915,39 +2919,6 @@ mv88e6xxx_phy_write(struct dsa_switch *ds, int port, int 
regnum, u16 val)
return ret;
 }
 
-int
-mv88e6xxx_phy_read_indirect(struct dsa_switch *ds, int port, int regnum)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int addr = mv88e6xxx_port_to_phy_addr(ps, port);
-   int ret;
-
-   if (addr < 0)
-   return 0x;
-
-   mutex_lock(&ps->smi_mutex);
-   ret = _mv88e6xxx_phy_read_indirect(ps, addr, regnum);
-   mutex_unlock(&ps->smi_mutex);
-   return ret;
-}
-
-int
-mv88e6xxx_phy_write_indirect(struct dsa_switch *ds, int port, int regnum,
-u16 val)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int addr = mv88e6xxx_port_to_phy_addr(ps, port);
-   int ret;
-
-   if (addr < 0)
-   return addr;
-
-   mutex_lock(&ps->smi_mutex);
-   ret = _mv88e6xxx_phy_write_indirect(ps, addr, regnum, val);
-   mutex_unlock(&ps->smi_mutex);
-   return ret;
-}
-
 #ifdef CONFIG_NET_DSA_HWMON
 
 static int mv88e61xx_get_temp(struct dsa_switch *ds, int *temp)
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 52ca24e..5972571 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -355,10 +355,17 @@ enum mv88e6xxx_cap {
 * See GLOBAL_CONTROL_PPU_ENABLE and GLOBAL_STATUS_PPU_POLLING.
 */
MV88E6XXX_CAP_PPU,
+
+   /* SMI PHY Command and Data registers.
+* This requires an indirect access to PHY registers through
+* GLOBAL2_SMI_OP, otherwise direct access to PHY registers is done.
+*/
+   MV88E6XXX_CAP_SMI_PHY,
 };
 
 /* Bitmask of capabilities */
 #define MV88E6XXX_FLAG_PPU 

Re: rtk8168 driver help needed

2016-05-06 Thread Murali Karicheri
On 05/06/2016 06:02 PM, Murali Karicheri wrote:
> Hi,
> 
> I am trying to integrate an rtl8168 PCIe card to get Ethernet working on my
> Keystone EVM. I purchased the rtl8111c Gigabit card from Amazon. The card is
> detected by the RC, and I can see that it is enumerated and shows up in lspci
> output. However, I can't get the Ethernet port functional. Does this need an
> MSI interrupt? I can't see that it has requested any.
> 
> root@k2g-evm:~# cat /proc/interrupts 
>CPU0   
>  18: 152104   GIC  27 Edge  arch_timer
>  22:   3326   GIC 196 Edge  serial
>  24:  0   GIC 120 Edge  253.i2c
>  25:  0   GIC 121 Edge  2530400.i2c
>  26: 14   GIC  98 Edge  21805800.spi
>  27:  0   GIC 160 Edge  268.keystone-dwc3, xhci-hcd:usb1
>  30:  0   GIC 228 Edge  gpmc
>  31:  0   GIC 229 Edge  21c8000.elm
>  32:159   GIC 356 Level 2a0.msgmgr rx_005_002
>  49:  0   GIC 232 Edge  270.edma_ccint
>  51:  0   GIC 249 Edge  270.edma_ccerrint
>  52:   1840   GIC 240 Edge  2728000.edma_ccint
>  54:  2   GIC 252 Edge  2728000.edma_ccerrint
>  57:   5424   GIC 128 Edge  mmc0
>  58:461   GIC 129 Edge  mmc1
> 285:  0  Keystone-PCIe-MSI-IRQ   0 Edge  PCIe PME, aerdrv
> IPI0:  0  CPU wakeup interrupts
> IPI1:  0  Timer broadcast interrupts
> IPI2:  0  Rescheduling interrupts
> IPI3:  0  Function call interrupts
> IPI4:  0  Single function call interrupts
> IPI5:  0  CPU stop interrupts
> IPI6:  0  IRQ work interrupts
> IPI7:  0  completion interrupts
> 
> 
> root@k2g-evm:~# ifconfig -a
> eth0  Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
>   BROADCAST MULTICAST  MTU:1500  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000 
>   RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

Also, why is the MAC address showing all zeros? Are any settings needed?

> 
> gre0  Link encap:UNSPEC  HWaddr 
> 00-00-00-00-00-00-00-50-00-00-00-00-00-00-00-00  
>   NOARP  MTU:1476  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1 
>   RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> gretap0   Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
>   BROADCAST MULTICAST  MTU:1462  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000 
>   RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> loLink encap:Local Loopback  
>   inet addr:127.0.0.1  Mask:255.0.0.0
>   inet6 addr: ::1/128 Scope:Host
>   UP LOOPBACK RUNNING  MTU:65536  Metric:1
>   RX packets:14 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1 
>   RX bytes:968 (968.0 B)  TX bytes:968 (968.0 B)
> 
> tunl0 Link encap:UNSPEC  HWaddr 
> 00-00-00-00-31-00-00-50-00-00-00-00-00-00-00-00  
>   NOARP  MTU:1480  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1 
>   RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> root@k2g-evm:~# ifup eth0
> ifconfig: SIOCSIFFLAGS: Cannot assign requested address
> 
> 
> [2.303965] PCI host bridge /soc/pcie@2180 ranges:
> [2.309108]   No bus range found for /soc/pcie@2180, using [bus 00-ff]
> [2.316269]IO 0x2325..0x23253fff -> 0x
> [2.321499]   MEM 0x5000..0x5fff -> 0x5000
> [2.331666] keystone-pcie 21801000.pcie: PCI host bridge to bus :00
> [2.338283] pci_bus :00: root bus resource [bus 00-ff]
> [2.343937] pci_bus :00: root bus resource [io  0x-0x3fff]
> [2.350114] pci_bus :00: root bus resource [mem 0x5000-0x5fff]
> [2.357095] pci :00:00.0: [104c:b00b] type 01 class 0x060400
> [2.357665] PCI: bus0: Fast back to back transfers disabled
> [2.363717] pci :01:00.0: [10ec:8168] type 00 class 0x02
> [2.363809] pci :01:00.0: reg 0x10: [io  0x-0x00ff]
> [2.363867] pci :01:00.0: reg 0x18: [mem 0x-0x0fff 64bit]
> [2.363909] pci :01:00.0: reg 0x20: [mem 0x-0x 64bit 
> pref]
> [2.363939] pci :01:00.0: reg 0x30: [mem 0x-0x0001 pref]
> [2.364099] pci :01:00.0: supports D1 D2
> [2.364116] pci :01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> [2.381251] PCI: bus1: Fast back to 

[PATCH net-next 10/21] net: dsa: mv88e6xxx: factorize VTU access

2016-05-06 Thread Vivien Didelot
Add a MV88E6XXX_FLAG_VTU flag to identify switch models with a VLAN
Table Unit.
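
The guard pattern this patch adds to each VLAN operation can be sketched
in isolation as follows. The names below (enum cap, FLAGS_FAMILY_WITH_VTU,
FLAGS_FAMILY_NO_VTU, port_vlan_filtering) are hypothetical and only
illustrate how a per-family bitmask gates an operation; they are not the
driver's definitions.

```c
#include <errno.h>
#include <stdint.h>

enum cap { CAP_PPU, CAP_VLANTABLE, CAP_VTU };
#define FLAG_PPU       (1u << CAP_PPU)
#define FLAG_VLANTABLE (1u << CAP_VLANTABLE)
#define FLAG_VTU       (1u << CAP_VTU)

/* Per-family masks: one with a VTU, one hypothetical family without. */
#define FLAGS_FAMILY_WITH_VTU (FLAG_PPU | FLAG_VLANTABLE | FLAG_VTU)
#define FLAGS_FAMILY_NO_VTU   (FLAG_PPU | FLAG_VLANTABLE)

/* Refuse the operation early when the model has no VLAN Table Unit. */
static int port_vlan_filtering(uint32_t family_flags, int enable)
{
	if (!(family_flags & FLAG_VTU))
		return -EOPNOTSUPP;

	(void)enable;	/* a real driver would program the port register here */
	return 0;
}
```

Because the check sits before any locking or register access, a model
without the capability never touches the hardware for this operation.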

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  5 +
 drivers/net/dsa/mv88e6xxx.c | 16 
 drivers/net/dsa/mv88e6xxx.h | 24 ++--
 3 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 81cd4a9..da5aa9c 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -136,6 +136,11 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
.port_bridge_join   = mv88e6xxx_port_bridge_join,
.port_bridge_leave  = mv88e6xxx_port_bridge_leave,
.port_stp_state_set = mv88e6xxx_port_stp_state_set,
+   .port_vlan_filtering= mv88e6xxx_port_vlan_filtering,
+   .port_vlan_prepare  = mv88e6xxx_port_vlan_prepare,
+   .port_vlan_add  = mv88e6xxx_port_vlan_add,
+   .port_vlan_del  = mv88e6xxx_port_vlan_del,
+   .port_vlan_dump = mv88e6xxx_port_vlan_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6123");
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 492cb24..7994486 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -1588,6 +1588,9 @@ int mv88e6xxx_port_vlan_dump(struct dsa_switch *ds, int 
port,
u16 pvid;
int err;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VTU))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
err = _mv88e6xxx_port_pvid_get(ps, port, &pvid);
@@ -2011,6 +2014,9 @@ int mv88e6xxx_port_vlan_filtering(struct dsa_switch *ds, 
int port,
PORT_CONTROL_2_8021Q_DISABLED;
int ret;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VTU))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_CONTROL_2);
@@ -2044,8 +2050,12 @@ int mv88e6xxx_port_vlan_prepare(struct dsa_switch *ds, 
int port,
const struct switchdev_obj_port_vlan *vlan,
struct switchdev_trans *trans)
 {
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int err;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VTU))
+   return -EOPNOTSUPP;
+
/* If the requested port doesn't belong to the same bridge as the VLAN
 * members, do not support it (yet) and fallback to software VLAN.
 */
@@ -2086,6 +2096,9 @@ void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int 
port,
bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
u16 vid;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VTU))
+   return;
+
mutex_lock(&ps->smi_mutex);
 
for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid)
@@ -2143,6 +2156,9 @@ int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int 
port,
u16 pvid, vid;
int err = 0;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_VTU))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
err = _mv88e6xxx_port_pvid_get(ps, port, &pvid);
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index d15e0b3..4f21206 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -394,6 +394,11 @@ enum mv88e6xxx_cap {
 * the output (or egress) ports to which it is allowed to send frames.
 */
MV88E6XXX_CAP_VLANTABLE,
+
+   /* VLAN Table Unit.
+* The VTU is used to program 802.1Q VLANs. See GLOBAL_VTU_OP.
+*/
+   MV88E6XXX_CAP_VTU,
 };
 
 /* Bitmask of capabilities */
@@ -406,14 +411,17 @@ enum mv88e6xxx_cap {
 #define MV88E6XXX_FLAG_TEMP    BIT(MV88E6XXX_CAP_TEMP)
 #define MV88E6XXX_FLAG_TEMP_LIMIT  BIT(MV88E6XXX_CAP_TEMP_LIMIT)
 #define MV88E6XXX_FLAG_VLANTABLE   BIT(MV88E6XXX_CAP_VLANTABLE)
+#define MV88E6XXX_FLAG_VTU BIT(MV88E6XXX_CAP_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6095\
(MV88E6XXX_FLAG_PPU |   \
-MV88E6XXX_FLAG_VLANTABLE)
+MV88E6XXX_FLAG_VLANTABLE | \
+MV88E6XXX_FLAG_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6097\
(MV88E6XXX_FLAG_PPU |   \
-MV88E6XXX_FLAG_VLANTABLE)
+MV88E6XXX_FLAG_VLANTABLE | \
+MV88E6XXX_FLAG_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6165\
(MV88E6XXX_FLAG_SWITCH_MAC |\
@@ -421,7 +429,8 @@ enum mv88e6xxx_cap {
 
 #define MV88E6XXX_FLAGS_FAMILY_6185\
(MV88E6XXX_FLAG_PPU |   \
-MV88E6XXX_FLAG_VLANTABLE)
+MV88E6XXX_FLAG_VLANTABLE | \
+MV88E6XXX_FLAG_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6320\
(MV88E6XXX_FLAG_EEE |   \
@@ -431,14 +440,16 @@ enum mv88e6xxx_cap {
 MV88E6XXX_FLAG_SWITCH_MAC |\
 MV88E6XXX_FLAG_TEMP |  \
 MV88E6XXX_FLAG_TEMP_LIMIT |\
-MV88E6XXX_FLAG_VLANTABLE)
+

[PATCH net-next 15/21] net: dsa: mv88e6xxx: factorize VLAN Ethertype

2016-05-06 Thread Vivien Didelot
The 6131 switch models have a Core Tag Type register. Add a
MV88E6XXX_FLAG_CORE_TAG_TYPE flag and set the VLAN Ethertype to 0x8100
in the shared setup code if it is present.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6131.c |  5 -
 drivers/net/dsa/mv88e6xxx.c |  8 
 drivers/net/dsa/mv88e6xxx.h | 10 ++
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index d05fc79..1c3b245 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -65,11 +65,6 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   /* Set the VLAN ethertype to 0x8100. */
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CORE_TAG_TYPE, 0x8100);
-   if (ret)
-   return ret;
-
/* Disable ARP mirroring, and configure the upstream port as
 * the port to which ingress and egress monitor frames are to
 * be sent.
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 1572719..b0245e7 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2930,6 +2930,14 @@ static int _mv88e6xxx_setup_global(struct 
mv88e6xxx_priv_state *ps)
if (err)
return err;
 
+   /* Set the VLAN ethertype to 0x8100. */
+   if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_CORE_TAG_TYPE)) {
+   err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CORE_TAG_TYPE,
+  0x8100);
+   if (err)
+   return err;
+   }
+
/* Set the default address aging time to 5 minutes, and
 * enable address learn messages to be sent to all message
 * ports.
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 62f6fc9..555903f 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -365,6 +365,12 @@ enum mv88e6xxx_cap {
 */
MV88E6XXX_CAP_EEPROM,
 
+   /* Core Tag Type.
+* Used with Double Tagged frames and ports in UseCoreTag mode.
+* See GLOBAL_CORE_TAG_TYPE.
+*/
+   MV88E6XXX_CAP_CORE_TAG_TYPE,
+
/* Port State Filtering for 802.1D Spanning Tree.
 * See PORT_CONTROL_STATE_* values in the PORT_CONTROL register.
 */
@@ -411,6 +417,7 @@ enum mv88e6xxx_cap {
 #define MV88E6XXX_FLAG_ATU BIT(MV88E6XXX_CAP_ATU)
 #define MV88E6XXX_FLAG_EEE BIT(MV88E6XXX_CAP_EEE)
 #define MV88E6XXX_FLAG_EEPROM  BIT(MV88E6XXX_CAP_EEPROM)
+#define MV88E6XXX_FLAG_CORE_TAG_TYPE   BIT(MV88E6XXX_CAP_CORE_TAG_TYPE)
 #define MV88E6XXX_FLAG_PORTSTATE   BIT(MV88E6XXX_CAP_PORTSTATE)
 #define MV88E6XXX_FLAG_PPU BIT(MV88E6XXX_CAP_PPU)
 #define MV88E6XXX_FLAG_PPU_ACTIVE  BIT(MV88E6XXX_CAP_PPU_ACTIVE)
@@ -423,12 +430,14 @@ enum mv88e6xxx_cap {
 
 #define MV88E6XXX_FLAGS_FAMILY_6095\
(MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_CORE_TAG_TYPE | \
 MV88E6XXX_FLAG_PPU |   \
 MV88E6XXX_FLAG_VLANTABLE | \
 MV88E6XXX_FLAG_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6097\
(MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_CORE_TAG_TYPE | \
 MV88E6XXX_FLAG_PPU |   \
 MV88E6XXX_FLAG_VLANTABLE | \
 MV88E6XXX_FLAG_VTU)
@@ -439,6 +448,7 @@ enum mv88e6xxx_cap {
 
 #define MV88E6XXX_FLAGS_FAMILY_6185\
(MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_CORE_TAG_TYPE | \
 MV88E6XXX_FLAG_PPU |   \
 MV88E6XXX_FLAG_VLANTABLE | \
 MV88E6XXX_FLAG_VTU)
-- 
2.8.2



rtk8168 driver help needed

2016-05-06 Thread Murali Karicheri
Hi,

I am trying to integrate an rtl8168 PCIe card to get Ethernet working on my
Keystone EVM. I purchased the rtl8111c Gigabit card from Amazon. The card is
detected by the RC, and I can see that it is enumerated and shows up in lspci
output. However, I can't get the Ethernet port functional. Does this need an
MSI interrupt? I can't see that it has requested any.

root@k2g-evm:~# cat /proc/interrupts 
   CPU0   
 18: 152104   GIC  27 Edge  arch_timer
 22:   3326   GIC 196 Edge  serial
 24:  0   GIC 120 Edge  253.i2c
 25:  0   GIC 121 Edge  2530400.i2c
 26: 14   GIC  98 Edge  21805800.spi
 27:  0   GIC 160 Edge  268.keystone-dwc3, xhci-hcd:usb1
 30:  0   GIC 228 Edge  gpmc
 31:  0   GIC 229 Edge  21c8000.elm
 32:159   GIC 356 Level 2a0.msgmgr rx_005_002
 49:  0   GIC 232 Edge  270.edma_ccint
 51:  0   GIC 249 Edge  270.edma_ccerrint
 52:   1840   GIC 240 Edge  2728000.edma_ccint
 54:  2   GIC 252 Edge  2728000.edma_ccerrint
 57:   5424   GIC 128 Edge  mmc0
 58:461   GIC 129 Edge  mmc1
285:  0  Keystone-PCIe-MSI-IRQ   0 Edge  PCIe PME, aerdrv
IPI0:  0  CPU wakeup interrupts
IPI1:  0  Timer broadcast interrupts
IPI2:  0  Rescheduling interrupts
IPI3:  0  Function call interrupts
IPI4:  0  Single function call interrupts
IPI5:  0  CPU stop interrupts
IPI6:  0  IRQ work interrupts
IPI7:  0  completion interrupts


root@k2g-evm:~# ifconfig -a
eth0  Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
  BROADCAST MULTICAST  MTU:1500  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

gre0  Link encap:UNSPEC  HWaddr 
00-00-00-00-00-00-00-50-00-00-00-00-00-00-00-00  
  NOARP  MTU:1476  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1 
  RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

gretap0   Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
  BROADCAST MULTICAST  MTU:1462  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

loLink encap:Local Loopback  
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:65536  Metric:1
  RX packets:14 errors:0 dropped:0 overruns:0 frame:0
  TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1 
  RX bytes:968 (968.0 B)  TX bytes:968 (968.0 B)

tunl0 Link encap:UNSPEC  HWaddr 
00-00-00-00-31-00-00-50-00-00-00-00-00-00-00-00  
  NOARP  MTU:1480  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1 
  RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

root@k2g-evm:~# ifup eth0
ifconfig: SIOCSIFFLAGS: Cannot assign requested address


[2.303965] PCI host bridge /soc/pcie@2180 ranges:
[2.309108]   No bus range found for /soc/pcie@2180, using [bus 00-ff]
[2.316269]IO 0x2325..0x23253fff -> 0x
[2.321499]   MEM 0x5000..0x5fff -> 0x5000
[2.331666] keystone-pcie 21801000.pcie: PCI host bridge to bus :00
[2.338283] pci_bus :00: root bus resource [bus 00-ff]
[2.343937] pci_bus :00: root bus resource [io  0x-0x3fff]
[2.350114] pci_bus :00: root bus resource [mem 0x5000-0x5fff]
[2.357095] pci :00:00.0: [104c:b00b] type 01 class 0x060400
[2.357665] PCI: bus0: Fast back to back transfers disabled
[2.363717] pci :01:00.0: [10ec:8168] type 00 class 0x02
[2.363809] pci :01:00.0: reg 0x10: [io  0x-0x00ff]
[2.363867] pci :01:00.0: reg 0x18: [mem 0x-0x0fff 64bit]
[2.363909] pci :01:00.0: reg 0x20: [mem 0x-0x 64bit 
pref]
[2.363939] pci :01:00.0: reg 0x30: [mem 0x-0x0001 pref]
[2.364099] pci :01:00.0: supports D1 D2
[2.364116] pci :01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[2.381251] PCI: bus1: Fast back to back transfers disabled
[2.386989] pci :00:00.0: BAR 8: assigned [mem 0x5000-0x500f]
[2.393937] pci :00:00.0: BAR 9: assigned [mem 0x5010-0x501f 
pref]
[2.401221] pci :00:00.0: BAR 7: assigned [io  0x1000-0x1fff]
[2.407320] pci :01:00.0: BAR 6: assigned [mem 

[PATCH net-next 02/21] net: dsa: mv88e6xxx: factorize PHY access with PPU

2016-05-06 Thread Vivien Didelot
Add a MV88E6XXX_FLAG_PPU flag to describe switch models with a PHY
Polling Unit. This allows merging the PPU-specific PHY access code into
the shared code.

Make the mv88e6xxx_ppu_disable and mv88e6xxx_phy_{read,write}_ppu
functions use unlocked register accesses in order to call them in
mv88e6xxx_phy_{read,write} in a locked context.
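
The locked/unlocked split described above can be illustrated with a small
standalone sketch. Everything here is hypothetical: struct chip, the
*_nolock helpers (the driver marks its unlocked variants with a leading
underscore instead), the register index and the bit value are stand-ins,
and the "lock" is a depth counter rather than a real mutex so that a
nested acquisition would be visible.

```c
#include <stdint.h>

#define REG_CONTROL        4		/* hypothetical register index */
#define CONTROL_PPU_ENABLE 0x4000	/* hypothetical bit position */

struct chip {
	int lock_depth;		/* stand-in for smi_mutex */
	int max_depth;		/* above 1 would mean a self-deadlock */
	uint16_t regs[32];
};

static void chip_lock(struct chip *c)
{
	if (++c->lock_depth > c->max_depth)
		c->max_depth = c->lock_depth;
}

static void chip_unlock(struct chip *c)
{
	c->lock_depth--;
}

/* Unlocked accessors: the caller must already hold the lock. */
static int reg_read_nolock(struct chip *c, int reg)
{
	return c->regs[reg];
}

static int reg_write_nolock(struct chip *c, int reg, uint16_t val)
{
	c->regs[reg] = val;
	return 0;
}

/* Runs in locked context, so it only uses the unlocked accessors. */
static int ppu_disable_nolock(struct chip *c)
{
	int ret = reg_read_nolock(c, REG_CONTROL);

	if (ret < 0)
		return ret;
	return reg_write_nolock(c, REG_CONTROL, ret & ~CONTROL_PPU_ENABLE);
}

/* Locked entry point: takes the lock once around the whole sequence. */
static int phy_read(struct chip *c, int reg)
{
	int ret;

	chip_lock(c);
	ret = ppu_disable_nolock(c);
	if (!ret)
		ret = reg_read_nolock(c, reg);
	chip_unlock(c);
	return ret;
}
```

Had ppu_disable_nolock() called a locking accessor instead, the lock would
be taken twice on the same path, which is what the switch to unlocked
register accesses avoids.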

Since the PPU code is shared, also remove NET_DSA_MV88E6XXX_NEED_PPU.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/Kconfig |  5 -
 drivers/net/dsa/mv88e6131.c | 39 ++-
 drivers/net/dsa/mv88e6xxx.c | 40 +---
 drivers/net/dsa/mv88e6xxx.h | 25 -
 4 files changed, 43 insertions(+), 66 deletions(-)

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 90ba003..4aaadce 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -13,15 +13,10 @@ config NET_DSA_MV88E6060
  This enables support for the Marvell 88E6060 ethernet switch
  chip.
 
-config NET_DSA_MV88E6XXX_NEED_PPU
-   bool
-   default n
-
 config NET_DSA_MV88E6131
tristate "Marvell 88E6085/6095/6095F/6131 ethernet switch chip support"
depends on NET_DSA
select NET_DSA_MV88E6XXX
-   select NET_DSA_MV88E6XXX_NEED_PPU
select NET_DSA_TAG_DSA
---help---
  This enables support for the Marvell 88E6085/6095/6095F/6131
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index d477320..9d21d69 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -132,8 +132,6 @@ static int mv88e6131_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   mv88e6xxx_ppu_state_init(ps);
-
ret = mv88e6xxx_switch_reset(ps, false);
if (ret < 0)
return ret;
@@ -145,46 +143,13 @@ static int mv88e6131_setup(struct dsa_switch *ds)
return mv88e6xxx_setup_ports(ds);
 }
 
-static int mv88e6131_port_to_phy_addr(struct dsa_switch *ds, int port)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-
-   if (port >= 0 && port < ps->info->num_ports)
-   return port;
-
-   return -EINVAL;
-}
-
-static int
-mv88e6131_phy_read(struct dsa_switch *ds, int port, int regnum)
-{
-   int addr = mv88e6131_port_to_phy_addr(ds, port);
-
-   if (addr < 0)
-   return addr;
-
-   return mv88e6xxx_phy_read_ppu(ds, addr, regnum);
-}
-
-static int
-mv88e6131_phy_write(struct dsa_switch *ds,
- int port, int regnum, u16 val)
-{
-   int addr = mv88e6131_port_to_phy_addr(ds, port);
-
-   if (addr < 0)
-   return addr;
-
-   return mv88e6xxx_phy_write_ppu(ds, addr, regnum, val);
-}
-
 struct dsa_switch_driver mv88e6131_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_DSA,
.probe  = mv88e6131_drv_probe,
.setup  = mv88e6131_setup,
.set_addr   = mv88e6xxx_set_addr_direct,
-   .phy_read   = mv88e6131_phy_read,
-   .phy_write  = mv88e6131_phy_write,
+   .phy_read   = mv88e6xxx_phy_read,
+   .phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
.get_ethtool_stats  = mv88e6xxx_get_ethtool_stats,
.get_sset_count = mv88e6xxx_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 470cfc7..a28b46c 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -241,24 +241,23 @@ static int _mv88e6xxx_phy_write(struct 
mv88e6xxx_priv_state *ps, int addr,
return 0;
 }
 
-#ifdef CONFIG_NET_DSA_MV88E6XXX_NEED_PPU
 static int mv88e6xxx_ppu_disable(struct mv88e6xxx_priv_state *ps)
 {
int ret;
unsigned long timeout;
 
-   ret = mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_CONTROL);
+   ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_CONTROL);
if (ret < 0)
return ret;
 
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
- ret & ~GLOBAL_CONTROL_PPU_ENABLE);
+   ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
+  ret & ~GLOBAL_CONTROL_PPU_ENABLE);
if (ret)
return ret;
 
timeout = jiffies + 1 * HZ;
while (time_before(jiffies, timeout)) {
-   ret = mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATUS);
+   ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATUS);
if (ret < 0)
return ret;
 
@@ -361,35 +360,33 @@ void mv88e6xxx_ppu_state_init(struct mv88e6xxx_priv_state 
*ps)
ps->ppu_timer.function = mv88e6xxx_ppu_reenable_timer;
 }
 
-int mv88e6xxx_phy_read_ppu(struct dsa_switch *ds, int addr, int regnum)
+static int 

[PATCH net-next 05/21] net: dsa: mv88e6xxx: factorize temperature access

2016-05-06 Thread Vivien Didelot
Add MV88E6XXX_FLAG_TEMP and MV88E6XXX_FLAG_TEMP_LIMIT flags to describe
switch models that provide temperature readings and temperature limits.
Use them to centralize access to the temperature feature.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  3 +++
 drivers/net/dsa/mv88e6131.c |  6 ++
 drivers/net/dsa/mv88e6171.c |  3 +++
 drivers/net/dsa/mv88e6xxx.c |  9 ++---
 drivers/net/dsa/mv88e6xxx.h | 22 ++
 5 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 8330a8e..e234bdb 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -123,6 +123,9 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
.adjust_link= mv88e6xxx_adjust_link,
 #ifdef CONFIG_NET_DSA_HWMON
.get_temp   = mv88e6xxx_get_temp,
+   .get_temp_limit = mv88e6xxx_get_temp_limit,
+   .set_temp_limit = mv88e6xxx_set_temp_limit,
+   .get_temp_alarm = mv88e6xxx_get_temp_alarm,
 #endif
.get_eeprom = mv88e6xxx_get_eeprom,
.set_eeprom = mv88e6xxx_set_eeprom,
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index ab8c507..089f9c0 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -155,6 +155,12 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.get_sset_count = mv88e6xxx_get_sset_count,
.get_eeprom = mv88e6xxx_get_eeprom,
.set_eeprom = mv88e6xxx_set_eeprom,
+#ifdef CONFIG_NET_DSA_HWMON
+   .get_temp   = mv88e6xxx_get_temp,
+   .get_temp_limit = mv88e6xxx_get_temp_limit,
+   .set_temp_limit = mv88e6xxx_set_temp_limit,
+   .get_temp_alarm = mv88e6xxx_get_temp_alarm,
+#endif
.adjust_link= mv88e6xxx_adjust_link,
.port_bridge_join   = mv88e6xxx_port_bridge_join,
.port_bridge_leave  = mv88e6xxx_port_bridge_leave,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index a7afbaa..588b486 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -132,6 +132,9 @@ struct dsa_switch_driver mv88e6171_switch_driver = {
.adjust_link= mv88e6xxx_adjust_link,
 #ifdef CONFIG_NET_DSA_HWMON
.get_temp   = mv88e6xxx_get_temp,
+   .get_temp_limit = mv88e6xxx_get_temp_limit,
+   .set_temp_limit = mv88e6xxx_set_temp_limit,
+   .get_temp_alarm = mv88e6xxx_get_temp_alarm,
 #endif
.get_eeprom = mv88e6xxx_get_eeprom,
.set_eeprom = mv88e6xxx_set_eeprom,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 921fe87..c010bb9 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -3191,6 +3191,9 @@ int mv88e6xxx_get_temp(struct dsa_switch *ds, int *temp)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_TEMP))
+   return -EOPNOTSUPP;
+
if (mv88e6xxx_6320_family(ps) || mv88e6xxx_6352_family(ps))
return mv88e63xx_get_temp(ds, temp);
 
@@ -3203,7 +3206,7 @@ int mv88e6xxx_get_temp_limit(struct dsa_switch *ds, int 
*temp)
int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
int ret;
 
-   if (!mv88e6xxx_6320_family(ps) && !mv88e6xxx_6352_family(ps))
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_TEMP_LIMIT))
return -EOPNOTSUPP;
 
*temp = 0;
@@ -3223,7 +3226,7 @@ int mv88e6xxx_set_temp_limit(struct dsa_switch *ds, int 
temp)
int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
int ret;
 
-   if (!mv88e6xxx_6320_family(ps) && !mv88e6xxx_6352_family(ps))
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_TEMP_LIMIT))
return -EOPNOTSUPP;
 
ret = mv88e6xxx_phy_page_read(ds, phy, 6, 26);
@@ -3240,7 +3243,7 @@ int mv88e6xxx_get_temp_alarm(struct dsa_switch *ds, bool 
*alarm)
int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
int ret;
 
-   if (!mv88e6xxx_6320_family(ps) && !mv88e6xxx_6352_family(ps))
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_TEMP_LIMIT))
return -EOPNOTSUPP;
 
*alarm = false;
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 0181f67..9ddb6d0 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -366,12 +366,20 @@ enum mv88e6xxx_cap {
 * GLOBAL2_SMI_OP, otherwise direct access to PHY registers is done.
 */
MV88E6XXX_CAP_SMI_PHY,
+
+   /* Internal temperature sensor.
+* Available from any enabled port's PHY register 26, page 6.
+*/
+   MV88E6XXX_CAP_TEMP,
+   MV88E6XXX_CAP_TEMP_LIMIT,
 };
 
 /* Bitmask of capabilities */
 #define MV88E6XXX_FLAG_EEPROM  BIT(MV88E6XXX_CAP_EEPROM)
 #define 

[PATCH net-next 06/21] net: dsa: mv88e6xxx: factorize MAC address setting

2016-05-06 Thread Vivien Didelot
Some switch models have a dedicated register for Switch MAC/WoF/WoL.
This register, when present, is used to indirectly set the switch MAC
address, instead of a direct write to 3 global registers.

Identify this feature and share a common mv88e6xxx_set_addr function.
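
As a rough illustration of the direct path, the sketch below packs a
6-byte MAC address into three 16-bit register values. The last word
follows the (addr[4] << 8) | addr[5] expression visible in the diff; the
layout of the first two words is an assumption by analogy, and
mac_to_regs() is a hypothetical helper, not a driver function.

```c
#include <stdint.h>

/*
 * Pack a 6-byte MAC address into three 16-bit values, high byte first.
 * Assumes all three words use the same byte order as the last one.
 */
static void mac_to_regs(const uint8_t addr[6], uint16_t regs[3])
{
	for (int i = 0; i < 3; i++)
		regs[i] = (uint16_t)((addr[2 * i] << 8) | addr[2 * i + 1]);
}
```

With the address 00:11:22:33:44:55 this yields 0x0011, 0x2233 and 0x4455,
one value per global register write.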

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  2 +-
 drivers/net/dsa/mv88e6131.c |  2 +-
 drivers/net/dsa/mv88e6171.c |  2 +-
 drivers/net/dsa/mv88e6352.c |  2 +-
 drivers/net/dsa/mv88e6xxx.c | 14 --
 drivers/net/dsa/mv88e6xxx.h | 17 ++---
 6 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index e234bdb..c349fb7 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -114,7 +114,7 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6123_drv_probe,
.setup  = mv88e6123_setup,
-   .set_addr   = mv88e6xxx_set_addr_indirect,
+   .set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 089f9c0..1e040c6 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -147,7 +147,7 @@ struct dsa_switch_driver mv88e6131_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_DSA,
.probe  = mv88e6131_drv_probe,
.setup  = mv88e6131_setup,
-   .set_addr   = mv88e6xxx_set_addr_direct,
+   .set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 588b486..f9b20e0 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -123,7 +123,7 @@ struct dsa_switch_driver mv88e6171_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6171_drv_probe,
.setup  = mv88e6171_setup,
-   .set_addr   = mv88e6xxx_set_addr_indirect,
+   .set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 3bb271e..d03c14a 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -140,7 +140,7 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6352_drv_probe,
.setup  = mv88e6352_setup,
-   .set_addr   = mv88e6xxx_set_addr_indirect,
+   .set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
.get_strings= mv88e6xxx_get_strings,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index c010bb9..10cc277 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -173,7 +173,7 @@ int mv88e6xxx_reg_write(struct mv88e6xxx_priv_state *ps, int addr,
return ret;
 }
 
-int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 *addr)
+static int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 *addr)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int err;
@@ -192,7 +192,7 @@ int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 *addr)
   (addr[4] << 8) | addr[5]);
 }
 
-int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
+static int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
@@ -225,6 +225,16 @@ int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
return 0;
 }
 
+int mv88e6xxx_set_addr(struct dsa_switch *ds, u8 *addr)
+{
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+
+   if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_SWITCH_MAC))
+   return mv88e6xxx_set_addr_indirect(ds, addr);
+   else
+   return mv88e6xxx_set_addr_direct(ds, addr);
+}
+
 static int _mv88e6xxx_phy_read(struct mv88e6xxx_priv_state *ps, int addr,
   int regnum)
 {
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 9ddb6d0..517e95f 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -367,6 +367,13 @@ enum 

[PATCH net-next 21/21] net: dsa: mv88e6xxx: factorize the switch driver

2016-05-06 Thread Vivien Didelot
Now that all drivers support the same set of functions and the same
setup code, drop every model-specific DSA switch driver and replace them
with a common mv88e6xxx driver.

This merges the info tables into one, removes the function exports and the
model-specific files, and updates the defconfigs.
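
As a rough illustration of what a single merged info table means, here is a
minimal sketch. The struct layout, the lookup helper and the product numbers
are hypothetical placeholders chosen for the example; they are not taken
from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry of a merged switch info table. */
struct switch_info {
	uint16_t prod_num;	/* product number (illustrative values only) */
	const char *name;
};

/* One table for every model, instead of one table per driver file. */
static const struct switch_info merged_table[] = {
	{ 0x121, "Marvell 88E6123" },
	{ 0x106, "Marvell 88E6131" },
	{ 0x171, "Marvell 88E6171" },
	{ 0x352, "Marvell 88E6352" },
};

/* Return the entry matching prod_num, or NULL if the model is unknown. */
static const struct switch_info *lookup_switch_info(uint16_t prod_num)
{
	size_t i;

	for (i = 0; i < sizeof(merged_table) / sizeof(merged_table[0]); i++)
		if (merged_table[i].prod_num == prod_num)
			return &merged_table[i];

	return NULL;
}

/* Usage: a single probe path can serve every supported model. */
void lookup_switch_info_example(void)
{
	assert(lookup_switch_info(0x352) != NULL);
	assert(lookup_switch_info(0xfff) == NULL);
}
```

With one table, a single probe routine can identify any supported chip, which
is what makes the per-model driver files redundant.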

Signed-off-by: Vivien Didelot 
---
 arch/arm/configs/multi_v5_defconfig |   5 +-
 arch/arm/configs/mvebu_v7_defconfig |   2 +-
 arch/arm/configs/orion5x_defconfig  |   3 +-
 arch/tile/configs/tilegx_defconfig  |   3 +-
 arch/tile/configs/tilepro_defconfig |   3 +-
 drivers/net/dsa/Kconfig |  40 +---
 drivers/net/dsa/Makefile|  15 +-
 drivers/net/dsa/mv88e6123.c |  93 --
 drivers/net/dsa/mv88e6131.c | 101 --
 drivers/net/dsa/mv88e6171.c | 101 --
 drivers/net/dsa/mv88e6352.c | 121 
 drivers/net/dsa/mv88e6xxx.c | 354 +++-
 drivers/net/dsa/mv88e6xxx.h |  91 +++--
 13 files changed, 302 insertions(+), 630 deletions(-)
 delete mode 100644 drivers/net/dsa/mv88e6123.c
 delete mode 100644 drivers/net/dsa/mv88e6131.c
 delete mode 100644 drivers/net/dsa/mv88e6171.c
 delete mode 100644 drivers/net/dsa/mv88e6352.c

diff --git a/arch/arm/configs/multi_v5_defconfig b/arch/arm/configs/multi_v5_defconfig
index e11d99d..690352d 100644
--- a/arch/arm/configs/multi_v5_defconfig
+++ b/arch/arm/configs/multi_v5_defconfig
@@ -91,10 +91,7 @@ CONFIG_SATA_AHCI=y
 CONFIG_SATA_MV=y
 CONFIG_NETDEVICES=y
 CONFIG_NET_DSA_MV88E6060=y
-CONFIG_NET_DSA_MV88E6131=y
-CONFIG_NET_DSA_MV88E6123=y
-CONFIG_NET_DSA_MV88E6171=y
-CONFIG_NET_DSA_MV88E6352=y
+CONFIG_NET_DSA_MV88E6XXX=y
 CONFIG_MV643XX_ETH=y
 CONFIG_R8169=y
 CONFIG_MARVELL_PHY=y
diff --git a/arch/arm/configs/mvebu_v7_defconfig b/arch/arm/configs/mvebu_v7_defconfig
index dc5797a..6492407 100644
--- a/arch/arm/configs/mvebu_v7_defconfig
+++ b/arch/arm/configs/mvebu_v7_defconfig
@@ -66,7 +66,7 @@ CONFIG_SATA_AHCI=y
 CONFIG_AHCI_MVEBU=y
 CONFIG_SATA_MV=y
 CONFIG_NETDEVICES=y
-CONFIG_NET_DSA_MV88E6171=y
+CONFIG_NET_DSA_MV88E6XXX=y
 CONFIG_MV643XX_ETH=y
 CONFIG_MVNETA=y
 CONFIG_MVPP2=y
diff --git a/arch/arm/configs/orion5x_defconfig b/arch/arm/configs/orion5x_defconfig
index 6a5bc27..27a70a7 100644
--- a/arch/arm/configs/orion5x_defconfig
+++ b/arch/arm/configs/orion5x_defconfig
@@ -85,8 +85,7 @@ CONFIG_ATA=y
 CONFIG_SATA_MV=y
 CONFIG_NETDEVICES=y
 CONFIG_MII=y
-CONFIG_NET_DSA_MV88E6131=y
-CONFIG_NET_DSA_MV88E6123=y
+CONFIG_NET_DSA_MV88E6XXX=y
 CONFIG_MV643XX_ETH=y
 CONFIG_MARVELL_PHY=y
 # CONFIG_INPUT_MOUSEDEV is not set
diff --git a/arch/tile/configs/tilegx_defconfig b/arch/tile/configs/tilegx_defconfig
index 3f3dfb8..7189055 100644
--- a/arch/tile/configs/tilegx_defconfig
+++ b/arch/tile/configs/tilegx_defconfig
@@ -221,8 +221,7 @@ CONFIG_NETCONSOLE_DYNAMIC=y
 CONFIG_TUN=y
 CONFIG_VETH=m
 CONFIG_NET_DSA_MV88E6060=y
-CONFIG_NET_DSA_MV88E6131=y
-CONFIG_NET_DSA_MV88E6123=y
+CONFIG_NET_DSA_MV88E6XXX=y
 CONFIG_SKY2=y
 CONFIG_PTP_1588_CLOCK_TILEGX=y
 # CONFIG_WLAN is not set
diff --git a/arch/tile/configs/tilepro_defconfig b/arch/tile/configs/tilepro_defconfig
index ef9e27e..dc85468 100644
--- a/arch/tile/configs/tilepro_defconfig
+++ b/arch/tile/configs/tilepro_defconfig
@@ -340,8 +340,7 @@ CONFIG_NETCONSOLE_DYNAMIC=y
 CONFIG_TUN=y
 CONFIG_VETH=m
 CONFIG_NET_DSA_MV88E6060=y
-CONFIG_NET_DSA_MV88E6131=y
-CONFIG_NET_DSA_MV88E6123=y
+CONFIG_NET_DSA_MV88E6XXX=y
 # CONFIG_NET_VENDOR_3COM is not set
 CONFIG_E1000E=y
 # CONFIG_WLAN is not set
diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 7e01dce..200663c 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -1,10 +1,6 @@
 menu "Distributed Switch Architecture drivers"
depends on HAVE_NET_DSA
 
-config NET_DSA_MV88E6XXX
-   tristate
-   default n
-
 config NET_DSA_MV88E6060
tristate "Marvell 88E6060 ethernet switch chip support"
depends on NET_DSA
@@ -13,41 +9,13 @@ config NET_DSA_MV88E6060
  This enables support for the Marvell 88E6060 ethernet switch
  chip.
 
-config NET_DSA_MV88E6131
-   tristate "Marvell 88E6085/6095/6095F/6131 ethernet switch chip support"
-   depends on NET_DSA
-   select NET_DSA_MV88E6XXX
-   select NET_DSA_TAG_EDSA
-   ---help---
- This enables support for the Marvell 88E6085/6095/6095F/6131
- ethernet switch chips.
-
-config NET_DSA_MV88E6123
-   tristate "Marvell 88E6123/6161/6165 ethernet switch chip support"
-   depends on NET_DSA
-   select NET_DSA_MV88E6XXX
-   select NET_DSA_TAG_EDSA
-   ---help---
- This enables support for the Marvell 88E6123/6161/6165
- ethernet switch chips.
-
-config NET_DSA_MV88E6171
-   tristate "Marvell 88E6171/6175/6350/6351 ethernet switch chip support"
-   depends on NET_DSA
-   select NET_DSA_MV88E6XXX
-   select NET_DSA_TAG_EDSA
-   

[PATCH net-next 19/21] net: dsa: mv88e6xxx: factorize switch setup

2016-05-06 Thread Vivien Didelot
Provide a shared mv88e6xxx_setup function to the drivers.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c | 16 +--
 drivers/net/dsa/mv88e6131.c | 16 +--
 drivers/net/dsa/mv88e6171.c | 16 +--
 drivers/net/dsa/mv88e6352.c | 16 +--
 drivers/net/dsa/mv88e6xxx.c | 69 ++---
 drivers/net/dsa/mv88e6xxx.h |  3 +-
 6 files changed, 38 insertions(+), 98 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 8f3a7c5..2bc407b 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -51,24 +51,10 @@ static const char *mv88e6123_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6123_table));
 }
 
-static int mv88e6123_setup(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int ret;
-
-   ps->ds = ds;
-
-   ret = mv88e6xxx_setup_common(ps);
-   if (ret < 0)
-   return ret;
-
-   return mv88e6xxx_setup_ports(ds);
-}
-
 struct dsa_switch_driver mv88e6123_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6123_drv_probe,
-   .setup  = mv88e6123_setup,
+   .setup  = mv88e6xxx_setup,
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index da28327..22952be 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -58,24 +58,10 @@ static const char *mv88e6131_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6131_table));
 }
 
-static int mv88e6131_setup(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int ret;
-
-   ps->ds = ds;
-
-   ret = mv88e6xxx_setup_common(ps);
-   if (ret < 0)
-   return ret;
-
-   return mv88e6xxx_setup_ports(ds);
-}
-
 struct dsa_switch_driver mv88e6131_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_DSA,
.probe  = mv88e6131_drv_probe,
-   .setup  = mv88e6131_setup,
+   .setup  = mv88e6xxx_setup,
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 83678ad..4bf517a 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -58,24 +58,10 @@ static const char *mv88e6171_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6171_table));
 }
 
-static int mv88e6171_setup(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int ret;
-
-   ps->ds = ds;
-
-   ret = mv88e6xxx_setup_common(ps);
-   if (ret < 0)
-   return ret;
-
-   return mv88e6xxx_setup_ports(ds);
-}
-
 struct dsa_switch_driver mv88e6171_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6171_drv_probe,
-   .setup  = mv88e6171_setup,
+   .setup  = mv88e6xxx_setup,
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 81810dd..d65a90d 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -77,24 +77,10 @@ static const char *mv88e6352_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6352_table));
 }
 
-static int mv88e6352_setup(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int ret;
-
-   ps->ds = ds;
-
-   ret = mv88e6xxx_setup_common(ps);
-   if (ret < 0)
-   return ret;
-
-   return mv88e6xxx_setup_ports(ds);
-}
-
 struct dsa_switch_driver mv88e6352_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6352_drv_probe,
-   .setup  = mv88e6352_setup,
+   .setup  = mv88e6xxx_setup,
.set_addr   = mv88e6xxx_set_addr,
.phy_read   = mv88e6xxx_phy_read,
.phy_write  = mv88e6xxx_phy_write,
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 261ca9a..328349b 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2632,14 +2632,12 @@ static int mv88e6xxx_power_on_serdes(struct mv88e6xxx_priv_state *ps)
return ret;
 }
 
-static int mv88e6xxx_setup_port(struct 

[PATCH net-next 16/21] net: dsa: mv88e6xxx: factorize GLOBAL_MONITOR_CONTROL setup

2016-05-06 Thread Vivien Didelot
All switch drivers configure the GLOBAL_MONITOR_CONTROL register with
slight variations.

Assume the setup of the upstream port, and configure it as the port to
which ingress, egress and ARP monitor frames are to be sent.
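
To make the register layout concrete, here is a minimal sketch of how such a
value could be composed. The field offsets (12, 8 and 4) and the 4-bit port
mask are assumptions made for the example; the authoritative values are the
GLOBAL_MONITOR_CONTROL_*_SHIFT definitions in mv88e6xxx.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field offsets, for illustration only. */
#define MONITOR_INGRESS_SHIFT	12
#define MONITOR_EGRESS_SHIFT	8
#define MONITOR_ARP_SHIFT	4

/* Send ingress, egress and ARP monitor frames to the upstream port. */
static uint16_t monitor_control_reg(unsigned int upstream_port)
{
	unsigned int port = upstream_port & 0xf;	/* assumed 4-bit fields */

	return (uint16_t)(port << MONITOR_INGRESS_SHIFT |
			  port << MONITOR_EGRESS_SHIFT |
			  port << MONITOR_ARP_SHIFT);
}

/* Usage: upstream port 5 fills all three fields with 5. */
void monitor_control_reg_example(void)
{
	assert(monitor_control_reg(5) == 0x5550);
}
```

The same value is computed for every model, which is why the per-driver
copies of this code can go away.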

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c | 14 --
 drivers/net/dsa/mv88e6131.c | 13 -
 drivers/net/dsa/mv88e6171.c | 15 ---
 drivers/net/dsa/mv88e6352.c | 14 --
 drivers/net/dsa/mv88e6xxx.c | 12 
 5 files changed, 12 insertions(+), 56 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 1cd30ac..5df06d8 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -54,20 +54,6 @@ static const char *mv88e6123_drv_probe(struct device *dsa_dev,
 static int mv88e6123_setup_global(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   u32 upstream_port = dsa_upstream_port(ds);
-   int ret;
-   u32 reg;
-
-   /* Configure the upstream port, and configure the upstream
-* port as the port to which ingress and egress monitor frames
-* are to be sent.
-*/
-   reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
-   if (ret)
-   return ret;
 
/* Disable remote management for now, and set the switch's
 * DSA device number.
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 1c3b245..c8e3974 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -61,20 +61,7 @@ static const char *mv88e6131_drv_probe(struct device *dsa_dev,
 static int mv88e6131_setup_global(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   u32 upstream_port = dsa_upstream_port(ds);
int ret;
-   u32 reg;
-
-   /* Disable ARP mirroring, and configure the upstream port as
-* the port to which ingress and egress monitor frames are to
-* be sent.
-*/
-   reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
-   GLOBAL_MONITOR_CONTROL_ARP_DISABLED;
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
-   if (ret)
-   return ret;
 
/* Disable cascade port functionality unless this device
 * is used in a cascade configuration, and set the switch's
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 6c8554c..a848aef 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -61,21 +61,6 @@ static const char *mv88e6171_drv_probe(struct device *dsa_dev,
 static int mv88e6171_setup_global(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   u32 upstream_port = dsa_upstream_port(ds);
-   int ret;
-   u32 reg;
-
-   /* Configure the upstream port, and configure the upstream
-* port as the port to which ingress and egress monitor frames
-* are to be sent.
-*/
-   reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_MIRROR_SHIFT;
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
-   if (ret)
-   return ret;
 
/* Disable remote management for now, and set the switch's
 * DSA device number.
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index a27616c..e098870 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -80,20 +80,6 @@ static const char *mv88e6352_drv_probe(struct device *dsa_dev,
 static int mv88e6352_setup_global(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   u32 upstream_port = dsa_upstream_port(ds);
-   int ret;
-   u32 reg;
-
-   /* Configure the upstream port, and configure the upstream
-* port as the port to which ingress and egress monitor frames
-* are to be sent.
-*/
-   reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
-   upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
-   if (ret)
-   return ret;
 
/* Disable remote management for now, and set the switch's
 * DSA device number.
diff --git a/drivers/net/dsa/mv88e6xxx.c 

[PATCH net-next 11/21] net: dsa: mv88e6xxx: factorize ATU access

2016-05-06 Thread Vivien Didelot
Add a MV88E6XXX_FLAG_ATU flag to identify switch models with an Address
Translation Unit.
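
The capability check can be sketched as follows. This is a simplified,
self-contained illustration: the struct, enum and helper names are
hypothetical stand-ins for mv88e6xxx_priv_state, enum mv88e6xxx_cap and
mv88e6xxx_has(), and the flag sets are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define BIT(n)	(1UL << (n))

/* Hypothetical capability list; each one maps to a bit in a flags mask. */
enum cap { CAP_ATU, CAP_EEE, CAP_PPU };

#define FLAG_ATU	BIT(CAP_ATU)
#define FLAG_EEE	BIT(CAP_EEE)
#define FLAG_PPU	BIT(CAP_PPU)

struct priv_state {
	unsigned long flags;	/* capabilities of the probed model */
};

/* True only if every requested capability bit is set. */
static bool has(const struct priv_state *ps, unsigned long flags)
{
	return (ps->flags & flags) == flags;
}

/* An FDB operation bails out early on models without an ATU. */
static int port_fdb_dump(const struct priv_state *ps)
{
	if (!has(ps, FLAG_ATU))
		return -EOPNOTSUPP;

	return 0;	/* the real function would walk the ATU here */
}

/* Usage: one model with an ATU, one without. */
void port_fdb_dump_example(void)
{
	struct priv_state with_atu = { .flags = FLAG_ATU | FLAG_PPU };
	struct priv_state without_atu = { .flags = FLAG_EEE };

	assert(port_fdb_dump(&with_atu) == 0);
	assert(port_fdb_dump(&without_atu) == -EOPNOTSUPP);
}
```

Guarding the shared functions this way lets every driver expose the same
callbacks while unsupported models simply report -EOPNOTSUPP.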

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  4 
 drivers/net/dsa/mv88e6xxx.c | 14 ++
 drivers/net/dsa/mv88e6xxx.h | 24 ++--
 3 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index da5aa9c..45cce4f 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -141,6 +141,10 @@ struct dsa_switch_driver mv88e6123_switch_driver = {
.port_vlan_add  = mv88e6xxx_port_vlan_add,
.port_vlan_del  = mv88e6xxx_port_vlan_del,
.port_vlan_dump = mv88e6xxx_port_vlan_dump,
+   .port_fdb_prepare   = mv88e6xxx_port_fdb_prepare,
+   .port_fdb_add   = mv88e6xxx_port_fdb_add,
+   .port_fdb_del   = mv88e6xxx_port_fdb_del,
+   .port_fdb_dump  = mv88e6xxx_port_fdb_dump,
 };
 
 MODULE_ALIAS("platform:mv88e6123");
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 7994486..dbbda4e 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2267,6 +2267,11 @@ int mv88e6xxx_port_fdb_prepare(struct dsa_switch *ds, int port,
   const struct switchdev_obj_port_fdb *fdb,
   struct switchdev_trans *trans)
 {
+   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_ATU))
+   return -EOPNOTSUPP;
+
/* We don't need any dynamic resource from the kernel (yet),
 * so skip the prepare phase.
 */
@@ -2282,6 +2287,9 @@ void mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
GLOBAL_ATU_DATA_STATE_UC_STATIC;
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_ATU))
+   return;
+
mutex_lock(&ps->smi_mutex);
if (_mv88e6xxx_port_fdb_load(ps, port, fdb->addr, fdb->vid, state))
netdev_err(ds->ports[port], "failed to load MAC address\n");
@@ -2294,6 +2302,9 @@ int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
int ret;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_ATU))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
ret = _mv88e6xxx_port_fdb_load(ps, port, fdb->addr, fdb->vid,
   GLOBAL_ATU_DATA_STATE_UNUSED);
@@ -2399,6 +2410,9 @@ int mv88e6xxx_port_fdb_dump(struct dsa_switch *ds, int port,
u16 fid;
int err;
 
+   if (!mv88e6xxx_has(ps, MV88E6XXX_FLAG_ATU))
+   return -EOPNOTSUPP;
+
mutex_lock(&ps->smi_mutex);
 
/* Dump port's default Filtering Information Database (VLAN ID 0) */
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 4f21206..192292f 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -351,6 +351,11 @@ enum mv88e6xxx_family {
 };
 
 enum mv88e6xxx_cap {
+   /* Address Translation Unit.
+* The ATU is used to lookup and learn MAC addresses. See GLOBAL_ATU_OP.
+*/
+   MV88E6XXX_CAP_ATU,
+
/* Energy Efficient Ethernet.
 */
MV88E6XXX_CAP_EEE,
@@ -402,6 +407,7 @@ enum mv88e6xxx_cap {
 };
 
 /* Bitmask of capabilities */
+#define MV88E6XXX_FLAG_ATU BIT(MV88E6XXX_CAP_ATU)
 #define MV88E6XXX_FLAG_EEE BIT(MV88E6XXX_CAP_EEE)
 #define MV88E6XXX_FLAG_EEPROM  BIT(MV88E6XXX_CAP_EEPROM)
 #define MV88E6XXX_FLAG_PORTSTATE   BIT(MV88E6XXX_CAP_PORTSTATE)
@@ -414,12 +420,14 @@ enum mv88e6xxx_cap {
 #define MV88E6XXX_FLAG_VTU BIT(MV88E6XXX_CAP_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6095\
-   (MV88E6XXX_FLAG_PPU |   \
+   (MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_PPU |   \
 MV88E6XXX_FLAG_VLANTABLE | \
 MV88E6XXX_FLAG_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6097\
-   (MV88E6XXX_FLAG_PPU |   \
+   (MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_PPU |   \
 MV88E6XXX_FLAG_VLANTABLE | \
 MV88E6XXX_FLAG_VTU)
 
@@ -428,12 +436,14 @@ enum mv88e6xxx_cap {
 MV88E6XXX_FLAG_TEMP)
 
 #define MV88E6XXX_FLAGS_FAMILY_6185\
-   (MV88E6XXX_FLAG_PPU |   \
+   (MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_PPU |   \
 MV88E6XXX_FLAG_VLANTABLE | \
 MV88E6XXX_FLAG_VTU)
 
 #define MV88E6XXX_FLAGS_FAMILY_6320\
-   (MV88E6XXX_FLAG_EEE |   \
+   (MV88E6XXX_FLAG_ATU |   \
+MV88E6XXX_FLAG_EEE |   \
 MV88E6XXX_FLAG_EEPROM |\
 MV88E6XXX_FLAG_PORTSTATE | \
 MV88E6XXX_FLAG_SMI_PHY |   \
@@ -444,7 +454,8 @@ enum mv88e6xxx_cap {
 

[PATCH net-next 13/21] net: dsa: mv88e6xxx: factorize global setup

2016-05-06 Thread Vivien Didelot
Every driver calls mv88e6xxx_setup_global right after
mv88e6xxx_setup_common. Call the former from the latter.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |   4 --
 drivers/net/dsa/mv88e6131.c |   4 --
 drivers/net/dsa/mv88e6171.c |   4 --
 drivers/net/dsa/mv88e6352.c |   4 --
 drivers/net/dsa/mv88e6xxx.c | 109 +++-
 drivers/net/dsa/mv88e6xxx.h |   1 -
 6 files changed, 56 insertions(+), 70 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index fadec7a..d74695a 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -58,10 +58,6 @@ static int mv88e6123_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   ret = mv88e6xxx_setup_global(ds);
-   if (ret)
-   return ret;
-
/* Disable the PHY polling unit (since there won't be any
 * external PHYs to poll), don't discard packets with
 * excessive collisions, and mask all interrupt sources.
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 25ed823..e22ca7b 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -65,10 +65,6 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   ret = mv88e6xxx_setup_global(ds);
-   if (ret)
-   return ret;
-
/* Enable the PHY polling unit, don't discard packets with
 * excessive collisions, use a weighted fair queueing scheme
 * to arbitrate between packet queues, set the maximum frame
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index caaa4b6..4bbf2e1 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -65,10 +65,6 @@ static int mv88e6171_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   ret = mv88e6xxx_setup_global(ds);
-   if (ret)
-   return ret;
-
/* Discard packets with excessive collisions, mask all
 * interrupt sources, enable PPU.
 */
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 470789a..3e0be87 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -84,10 +84,6 @@ static int mv88e6352_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   ret = mv88e6xxx_setup_global(ds);
-   if (ret)
-   return ret;
-
/* Discard packets with excessive collisions,
 * mask all interrupt sources, enable PPU (bit 14, undocumented).
 */
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index b744ed5..92f4b00 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2912,36 +2912,11 @@ int mv88e6xxx_setup_ports(struct dsa_switch *ds)
return 0;
 }
 
-int mv88e6xxx_setup_common(struct mv88e6xxx_priv_state *ps)
+static int _mv88e6xxx_setup_global(struct mv88e6xxx_priv_state *ps)
 {
int err;
-
-   mutex_init(&ps->smi_mutex);
-
-   INIT_WORK(&ps->bridge_work, mv88e6xxx_bridge_work);
-
-   if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_EEPROM))
-   mutex_init(&ps->eeprom_mutex);
-
-   if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_PPU))
-   mv88e6xxx_ppu_state_init(ps);
-
-   mutex_lock(&ps->smi_mutex);
-
-   err = _mv88e6xxx_switch_reset(ps);
-
-   mutex_unlock(&ps->smi_mutex);
-
-   return err;
-}
-
-int mv88e6xxx_setup_global(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int err;
int i;
 
-   mutex_lock(&ps->smi_mutex);
/* Set the default address aging time to 5 minutes, and
 * enable address learn messages to be sent to all message
 * ports.
@@ -2949,45 +2924,45 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_ATU_CONTROL,
   0x0140 | GLOBAL_ATU_CONTROL_LEARN2ALL);
if (err)
-   goto unlock;
+   return err;
 
/* Configure the IP ToS mapping registers. */
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_0, 0x);
if (err)
-   goto unlock;
+   return err;
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_1, 0x);
if (err)
-   goto unlock;
+   return err;
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_2, 0x);
if (err)
-   goto unlock;
+   return err;
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_3, 0x);
if (err)
-   goto unlock;
+   return err;
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_4, 0x);
if (err)
-   goto unlock;
+   return err;
err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_5, 0x);
 

[PATCH net-next 17/21] net: dsa: mv88e6xxx: factorize GLOBAL_CONTROL_2 setup

2016-05-06 Thread Vivien Didelot
All switch models set up the GLOBAL_CONTROL_2 register with slight
differences.

Since the cascade mode is valid even in a single chip setup, factorize
such configuration.
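
As an illustration only, the factorized value could be built along these
lines. The sketch assumes the common code always selects the multiple-cascade
mode, and the 0xf000 encoding used for it is a placeholder; the real
definition is GLOBAL_CONTROL_2_MULTIPLE_CASCADE in mv88e6xxx.h. The 0x1f
device-number mask is the one used by the removed per-model code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding of the cascade mode, for illustration only. */
#define CONTROL_2_MULTIPLE_CASCADE	0xf000

/* Cascade mode in the upper bits, DSA device number in the low 5 bits. */
static uint16_t global_control_2_reg(unsigned int switch_index)
{
	return (uint16_t)(CONTROL_2_MULTIPLE_CASCADE | (switch_index & 0x1f));
}

/* Usage: the same expression serves single-chip and multi-chip setups. */
void global_control_2_reg_example(void)
{
	assert(global_control_2_reg(0) == 0xf000);
	assert(global_control_2_reg(3) == 0xf003);
}
```

Because the value no longer depends on the number of chips, the
ds->dst->pd->nr_chips branch in the 6131 driver is not needed.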

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c | 15 ---
 drivers/net/dsa/mv88e6131.c | 16 
 drivers/net/dsa/mv88e6171.c | 15 ---
 drivers/net/dsa/mv88e6352.c | 14 --
 drivers/net/dsa/mv88e6xxx.c |  7 +++
 5 files changed, 7 insertions(+), 60 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 5df06d8..8f3a7c5 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -51,17 +51,6 @@ static const char *mv88e6123_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6123_table));
 }
 
-static int mv88e6123_setup_global(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-
-   /* Disable remote management for now, and set the switch's
-* DSA device number.
-*/
-   return mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
-  ds->index & 0x1f);
-}
-
 static int mv88e6123_setup(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
@@ -73,10 +62,6 @@ static int mv88e6123_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6123_setup_global(ds);
-   if (ret < 0)
-   return ret;
-
return mv88e6xxx_setup_ports(ds);
 }
 
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index c8e3974..b6ca07b 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -61,22 +61,6 @@ static const char *mv88e6131_drv_probe(struct device *dsa_dev,
 static int mv88e6131_setup_global(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-   int ret;
-
-   /* Disable cascade port functionality unless this device
-* is used in a cascade configuration, and set the switch's
-* DSA device number.
-*/
-   if (ds->dst->pd->nr_chips > 1)
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
- GLOBAL_CONTROL_2_MULTIPLE_CASCADE |
- (ds->index & 0x1f));
-   else
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
- GLOBAL_CONTROL_2_NO_CASCADE |
- (ds->index & 0x1f));
-   if (ret)
-   return ret;
 
/* Force the priority of IGMP/MLD snoop frames and ARP frames
 * to the highest setting.
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index a848aef..83678ad 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -58,17 +58,6 @@ static const char *mv88e6171_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6171_table));
 }
 
-static int mv88e6171_setup_global(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-
-   /* Disable remote management for now, and set the switch's
-* DSA device number.
-*/
-   return mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
-  ds->index & 0x1f);
-}
-
 static int mv88e6171_setup(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
@@ -80,10 +69,6 @@ static int mv88e6171_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6171_setup_global(ds);
-   if (ret < 0)
-   return ret;
-
return mv88e6xxx_setup_ports(ds);
 }
 
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index e098870..81810dd 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -77,16 +77,6 @@ static const char *mv88e6352_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6352_table));
 }
 
-static int mv88e6352_setup_global(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-
-   /* Disable remote management for now, and set the switch's
-* DSA device number.
-*/
-   return mv88e6xxx_reg_write(ps, REG_GLOBAL, 0x1c, ds->index & 0x1f);
-}
-
 static int mv88e6352_setup(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
@@ -98,10 +88,6 @@ static int mv88e6352_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6352_setup_global(ds);
-   if (ret < 0)
-   return ret;
-
return mv88e6xxx_setup_ports(ds);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 2d89829..261ca9a 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ 

[PATCH net-next 14/21] net: dsa: mv88e6xxx: factorize GLOBAL_CONTROL setup

2016-05-06 Thread Vivien Didelot
All switch models configure the GLOBAL_CONTROL register with slight
differences.

Discarding packets with excessive collisions
(GLOBAL_CONTROL_DISCARD_EXCESS) is specific to 6352 and similar
switches, and setting a maximum frame size
(GLOBAL_CONTROL_MAX_FRAME_1632) is specific to 6185 and similar
switches.

As we are centralizing the chip setup, skip these settings and don't
discard any frames yet, until we find out that such discarding by the
hardware is necessary.

Assume a common setup to enable the PHY Polling Unit if present, don't
discard any packets, and mask all interrupt sources.
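
A minimal sketch of that common value follows. The PPU enable bit is taken as
bit 14, as the 6352 comment removed by this patch suggests; the flag names
and bit positions of the capability mask are hypothetical stand-ins for
MV88E6XXX_FLAG_PPU and MV88E6XXX_FLAG_PPU_ACTIVE.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(1UL << (n))

/* Hypothetical capability bits for the example. */
#define FLAG_PPU		BIT(0)
#define FLAG_PPU_ACTIVE		BIT(1)

/* PPU enable, assumed to be bit 14 of the control register. */
#define CONTROL_PPU_ENABLE	BIT(14)

/*
 * Enable the PPU if the model has one. No discard bits are set, and no
 * interrupt source is enabled, so everything else stays at zero.
 */
static uint16_t global_control_reg(unsigned long flags)
{
	uint16_t reg = 0;

	if (flags & (FLAG_PPU | FLAG_PPU_ACTIVE))
		reg |= CONTROL_PPU_ENABLE;

	return reg;
}

/* Usage: a model without any PPU flag writes zero. */
void global_control_reg_example(void)
{
	assert(global_control_reg(0) == 0x0000);
	assert(global_control_reg(FLAG_PPU) == 0x4000);
}
```

Deriving the value from the capability flags removes the need for a
model-specific write in each driver.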

Tested on 88E6352 and 88E6185.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |  8 
 drivers/net/dsa/mv88e6131.c | 11 ---
 drivers/net/dsa/mv88e6171.c |  9 -
 drivers/net/dsa/mv88e6352.c |  9 -
 drivers/net/dsa/mv88e6xxx.c | 13 +
 5 files changed, 13 insertions(+), 37 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index d74695a..1cd30ac 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -58,14 +58,6 @@ static int mv88e6123_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   /* Disable the PHY polling unit (since there won't be any
-* external PHYs to poll), don't discard packets with
-* excessive collisions, and mask all interrupt sources.
-*/
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL, 0x);
-   if (ret)
-   return ret;
-
/* Configure the upstream port, and configure the upstream
 * port as the port to which ingress and egress monitor frames
 * are to be sent.
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index e22ca7b..d05fc79 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -65,17 +65,6 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   /* Enable the PHY polling unit, don't discard packets with
-* excessive collisions, use a weighted fair queueing scheme
-* to arbitrate between packet queues, set the maximum frame
-* size to 1632, and mask all interrupt sources.
-*/
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
- GLOBAL_CONTROL_PPU_ENABLE |
- GLOBAL_CONTROL_MAX_FRAME_1632);
-   if (ret)
-   return ret;
-
/* Set the VLAN ethertype to 0x8100. */
ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CORE_TAG_TYPE, 0x8100);
if (ret)
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index 4bbf2e1..6c8554c 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -65,15 +65,6 @@ static int mv88e6171_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   /* Discard packets with excessive collisions, mask all
-* interrupt sources, enable PPU.
-*/
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
- GLOBAL_CONTROL_PPU_ENABLE |
- GLOBAL_CONTROL_DISCARD_EXCESS);
-   if (ret)
-   return ret;
-
/* Configure the upstream port, and configure the upstream
 * port as the port to which ingress and egress monitor frames
 * are to be sent.
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 3e0be87..a27616c 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -84,15 +84,6 @@ static int mv88e6352_setup_global(struct dsa_switch *ds)
int ret;
u32 reg;
 
-   /* Discard packets with excessive collisions,
-* mask all interrupt sources, enable PPU (bit 14, undocumented).
-*/
-   ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
- GLOBAL_CONTROL_PPU_ENABLE |
- GLOBAL_CONTROL_DISCARD_EXCESS);
-   if (ret)
-   return ret;
-
/* Configure the upstream port, and configure the upstream
 * port as the port to which ingress and egress monitor frames
 * are to be sent.
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index 92f4b00..1572719 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2914,9 +2914,22 @@ int mv88e6xxx_setup_ports(struct dsa_switch *ds)
 
 static int _mv88e6xxx_setup_global(struct mv88e6xxx_priv_state *ps)
 {
+   u16 reg;
int err;
int i;
 
+   /* Enable the PHY Polling Unit if present, don't discard any packets,
+* and mask all interrupt sources.
+*/
+   reg = 0;
+   if (mv88e6xxx_has(ps, MV88E6XXX_FLAG_PPU) ||
+   mv88e6xxx_has(ps, MV88E6XXX_FLAG_PPU_ACTIVE))
+   reg |= 

[PATCH net-next 12/21] net: dsa: mv88e6xxx: factorize switch reset

2016-05-06 Thread Vivien Didelot
Add a MV88E6XXX_FLAG_PPU_ACTIVE flag to describe how to reset the
switch, and merge the reset call to the common setup code.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6123.c |   4 --
 drivers/net/dsa/mv88e6131.c |   4 --
 drivers/net/dsa/mv88e6171.c |   4 --
 drivers/net/dsa/mv88e6352.c |   4 --
 drivers/net/dsa/mv88e6xxx.c | 137 +++-
 drivers/net/dsa/mv88e6xxx.h |   6 +-
 6 files changed, 76 insertions(+), 83 deletions(-)

diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
index 45cce4f..fadec7a 100644
--- a/drivers/net/dsa/mv88e6123.c
+++ b/drivers/net/dsa/mv88e6123.c
@@ -99,10 +99,6 @@ static int mv88e6123_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6xxx_switch_reset(ps, false);
-   if (ret < 0)
-   return ret;
-
ret = mv88e6123_setup_global(ds);
if (ret < 0)
return ret;
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 5d25244..25ed823 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -132,10 +132,6 @@ static int mv88e6131_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6xxx_switch_reset(ps, false);
-   if (ret < 0)
-   return ret;
-
ret = mv88e6131_setup_global(ds);
if (ret < 0)
return ret;
diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
index a98e7d3..caaa4b6 100644
--- a/drivers/net/dsa/mv88e6171.c
+++ b/drivers/net/dsa/mv88e6171.c
@@ -108,10 +108,6 @@ static int mv88e6171_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6xxx_switch_reset(ps, true);
-   if (ret < 0)
-   return ret;
-
ret = mv88e6171_setup_global(ds);
if (ret < 0)
return ret;
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index d03c14a..470789a 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -125,10 +125,6 @@ static int mv88e6352_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6xxx_switch_reset(ps, true);
-   if (ret < 0)
-   return ret;
-
ret = mv88e6352_setup_global(ds);
if (ret < 0)
return ret;
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index dbbda4e..b744ed5 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2551,6 +2551,68 @@ restore_page_0:
return ret;
 }
 
+static int _mv88e6xxx_switch_reset(struct mv88e6xxx_priv_state *ps)
+{
+   bool ppu_active = mv88e6xxx_has(ps, MV88E6XXX_FLAG_PPU_ACTIVE);
+   u16 is_reset = (ppu_active ? 0x8800 : 0xc800);
+   struct gpio_desc *gpiod = ps->ds->pd->reset;
+   unsigned long timeout;
+   int ret;
+   int i;
+
+   /* Set all ports to the disabled state. */
+   for (i = 0; i < ps->info->num_ports; i++) {
+   ret = _mv88e6xxx_reg_read(ps, REG_PORT(i), PORT_CONTROL);
+   if (ret < 0)
+   return ret;
+
+   ret = _mv88e6xxx_reg_write(ps, REG_PORT(i), PORT_CONTROL,
+  ret & 0xfffc);
+   if (ret)
+   return ret;
+   }
+
+   /* Wait for transmit queues to drain. */
+   usleep_range(2000, 4000);
+
+   /* If there is a gpio connected to the reset pin, toggle it */
+   if (gpiod) {
+   gpiod_set_value_cansleep(gpiod, 1);
+   usleep_range(1, 2);
+   gpiod_set_value_cansleep(gpiod, 0);
+   usleep_range(1, 2);
+   }
+
+   /* Reset the switch. Keep the PPU active if requested. The PPU
+* needs to be active to support indirect phy register access
+* through global registers 0x18 and 0x19.
+*/
+   if (ppu_active)
+   ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, 0x04, 0xc000);
+   else
+   ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, 0x04, 0xc400);
+   if (ret)
+   return ret;
+
+   /* Wait up to one second for reset to complete. */
+   timeout = jiffies + 1 * HZ;
+   while (time_before(jiffies, timeout)) {
+   ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, 0x00);
+   if (ret < 0)
+   return ret;
+
+   if ((ret & is_reset) == is_reset)
+   break;
+   usleep_range(1000, 2000);
+   }
+   if (time_after(jiffies, timeout))
+   ret = -ETIMEDOUT;
+   else
+   ret = 0;
+
+   return ret;
+}
+
 static int mv88e6xxx_power_on_serdes(struct mv88e6xxx_priv_state *ps)
 {
int ret;
@@ -2852,6 +2914,8 @@ int mv88e6xxx_setup_ports(struct dsa_switch *ds)
 
 int 

[PATCH net-next 18/21] net: dsa: mv88e6xxx: factorize frames priorities

2016-05-06 Thread Vivien Didelot
6131 is the only driver to set the priority of IGMP/MLD snoop frames
and ARP frames to the highest setting. Drop this setting until we figure
out a common configuration for all switch models.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/mv88e6131.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index b6ca07b..da28327 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -58,20 +58,6 @@ static const char *mv88e6131_drv_probe(struct device *dsa_dev,
   ARRAY_SIZE(mv88e6131_table));
 }
 
-static int mv88e6131_setup_global(struct dsa_switch *ds)
-{
-   struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
-
-   /* Force the priority of IGMP/MLD snoop frames and ARP frames
-* to the highest setting.
-*/
-   return mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_PRIO_OVERRIDE,
-  GLOBAL2_PRIO_OVERRIDE_FORCE_SNOOP |
-  7 << GLOBAL2_PRIO_OVERRIDE_SNOOP_SHIFT |
-  GLOBAL2_PRIO_OVERRIDE_FORCE_ARP |
-  7 << GLOBAL2_PRIO_OVERRIDE_ARP_SHIFT);
-}
-
 static int mv88e6131_setup(struct dsa_switch *ds)
 {
struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
@@ -83,10 +69,6 @@ static int mv88e6131_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
-   ret = mv88e6131_setup_global(ds);
-   if (ret < 0)
-   return ret;
-
return mv88e6xxx_setup_ports(ds);
 }
 
-- 
2.8.2



[PATCH net-next 20/21] net: dsa: mv88e6xxx: factorize tag protocol

2016-05-06 Thread Vivien Didelot
6131 is the only driver to set the tag protocol to DSA_TAG_PROTO_DSA.
Since it works fine with DSA_TAG_PROTO_EDSA, change its value, like all
other mv88e6xxx drivers.

Signed-off-by: Vivien Didelot 
---
 drivers/net/dsa/Kconfig | 2 +-
 drivers/net/dsa/mv88e6131.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 4aaadce..7e01dce 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -17,7 +17,7 @@ config NET_DSA_MV88E6131
tristate "Marvell 88E6085/6095/6095F/6131 ethernet switch chip support"
depends on NET_DSA
select NET_DSA_MV88E6XXX
-   select NET_DSA_TAG_DSA
+   select NET_DSA_TAG_EDSA
---help---
  This enables support for the Marvell 88E6085/6095/6095F/6131
  ethernet switch chips.
diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
index 22952be..bbad199 100644
--- a/drivers/net/dsa/mv88e6131.c
+++ b/drivers/net/dsa/mv88e6131.c
@@ -59,7 +59,7 @@ static const char *mv88e6131_drv_probe(struct device *dsa_dev,
 }
 
 struct dsa_switch_driver mv88e6131_switch_driver = {
-   .tag_protocol   = DSA_TAG_PROTO_DSA,
+   .tag_protocol   = DSA_TAG_PROTO_EDSA,
.probe  = mv88e6131_drv_probe,
.setup  = mv88e6xxx_setup,
.set_addr   = mv88e6xxx_set_addr,
-- 
2.8.2



Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions

2016-05-06 Thread John Stultz
On Tue, May 3, 2016 at 2:16 PM, Dean Jenkins  wrote:
>
> [  239.027993] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header
> synchronisation was lost, remaining 988
>
> This error message consistently shows the remaining value to be 988, at
> least for the 3 examples provided by John. This does not suggest a random
> failure unless there are other examples of a non 988 remaining value error
> message. 988 is well within a Ethernet frame length so seems to be valid.
>
> I think a good step would be to add some debug to print the rx->remaining
> value at entry to asix_rx_fixup_internal(). This would generate a lot of
> debug but a pattern of the values might emerge.


So I've been trying to add some print messages here to better
understand what's going on.

Again, I'm a bit new to this code, so forgive me for my lack of
understanding. Since the remaining value seems to be key, I
tried to look around and figure out where it was being set. It seems
like it's only set in this function, is that right?  So this made me
guess something might be happening in a previous iteration that was
causing this to trigger.

I added some debug prints to every time we set the remaining value, or
modify it, as well as to print the value if we enter the fixup
function with a non-zero remaining value.

When we set the remaining value, it's usually to 1514, when the skblen is 1518.

However, right before we catch the problem, I see this:


[   84.844337] JDB set remaining to 1514 (skblen: 1518)
[   84.844379] JDB set remaining to 1514 (skblen: 1518)
[   84.844429] JDB set remaining to 1514 (skblen: 1518)
[   84.844458] JDB set remaining to 1514 (skblen: 1518)
[   84.844483] JDB set remaining to 1514 (skblen: 1518)
[   84.844507] JDB set remaining to 1514 (skblen: 1518)
[   84.844559] JDB set remaining to 1514 (skblen: 2048)
[   84.844583] JDB set remaining to 1514 (skblen: 2048)
[   84.844595] JDB: 1514 > 2048 - 1522
[   84.844606] JDB: dropping remaining by 526
[   84.844624] asix_rx_fixup_internal()  remaining: 988,  skb->len: 2048
[   84.844672] asix 1-5:1.0 eth1: asix_rx_fixup() Data Header
synchronisation was lost, remaining 988
[   84.844945] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length
0xdd5f8f9b, offset 4
[   84.845217] asix 1-5:1.0 eth1: asix_rx_fixup() Bad Header Length
0x3ce1ad3c, offset 4
[   84.845451] JDB set remaining to 1514 (skblen: 1518)
[   84.845485] JDB set remaining to 1514 (skblen: 1518)
[   84.845511] JDB set remaining to 1514 (skblen: 1518)
[   84.851003] JDB set remaining to 1514 (skblen: 1518)


So when the issue happens, it seems to be due to a larger than usual
skb (2048). The first time through the while loop we set the remaining
to 1514, but offset is fairly small, so we set copy_length to 1514,
and clear remaining. The offset is bumped by a little more than the
copy length and we loop again.  Then the second time through we set
remaining to 1514, but since offset is bigger now, the if
(rx->remaining > skb->len - offset)  case is true.

This is where it feels a little strange..

We calculate the copy_length as the difference between the offset and
the skb->len (so how much is left in the skb, which is 526), then
decrement remaining by that amount.  Not really sure what remaining
(now 988) is supposed to represent here. We copy the 526 bytes, and
then exit the loop.

Now the next time we are called, we enter and we have a remaining
value still of 988, which triggers the header synchronization error
path.
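
To make the arithmetic above easier to follow, here is a rough standalone
model of the loop as I described it. It is only a sketch, not the driver
code: the function name, the fixed 4-byte header and the assumption that
every frame in the buffer has the same length are all mine.

```c
#include <assert.h>

/* Assumption: every frame is preceded by one 4-byte header word. */
#define HDR_LEN 4u

/*
 * Hypothetical model, not the driver code: walk one receive buffer of
 * buf_len bytes holding back-to-back frames of frame_len bytes each and
 * return how many bytes of the last frame the next buffer still owes us.
 */
static unsigned int remaining_after_buffer(unsigned int buf_len,
                                           unsigned int frame_len)
{
    unsigned int offset = 0;
    unsigned int remaining = 0;

    assert(frame_len > 0);

    while (offset + HDR_LEN <= buf_len) {
        unsigned int copy_len;

        offset += HDR_LEN;      /* consume the header word */
        remaining = frame_len;  /* "set remaining to 1514" */

        if (remaining > buf_len - offset) {
            /* Frame runs past the end of the buffer: copy what is there. */
            copy_len = buf_len - offset;
            remaining -= copy_len;
        } else {
            copy_len = remaining;
            remaining = 0;
        }
        /* Assumption: frames are padded to an even length. */
        offset += (copy_len + 1u) & ~1u;
    }
    return remaining;
}
```

With buf_len 2048 and frame_len 1514 the second pass starts at offset 1522,
copies 2048 - 1522 = 526 bytes and leaves 1514 - 526 = 988, which is the
value in the error message. With buf_len 1518 nothing is left over.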

Now, I'm not sure if the remainder handling logic is sane, or if the
skb->len being 2048 is problematic, or what.  The skb->lens can vary
in sizes, usually 1518 during high throughput, but I've seen 1588,
1640, and other larger numbers that don't trigger the same problem.

ie:
[  106.946473] JDB set remaining to 1514 (skblen: 1518)
[  106.946525] JDB set remaining to 1514 (skblen: 1640)
[  106.946546] JDB set remaining to 118 (skblen: 1640)
[  106.946586] JDB set remaining to 1514 (skblen: 1518)


So yea.. maybe that will help clue things in a bit? I'm still a bit lost. :)

thanks
-john


Re: [RFC PATCH net-next 19/20] net: dsa: mv88e6xxx: factorize tag protocol

2016-05-06 Thread Vivien Didelot
Hi Andrew,

Andrew Lunn  writes:

> On Thu, May 05, 2016 at 06:41:03PM -0400, Vivien Didelot wrote:
>> 6131 is the only driver to set the tag protocol to DSA_TAG_PROTO_DSA.
>> Since it works fine with DSA_TAG_PROTO_EDSA, change its value, like all
>> other mv88e6xxx drivers.
>
> You might as well remove net/dsa/tag_dsa.c as well, since it is now
> unused. Please do that as a separate patch.

Why not, but isn't it a bit out of the scope of this patchset?

Thanks,

Vivien


Re: [PATCH v2 next-next 02/12] net: define gso types for IPx over IPv4 and IPv6

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 1:43 PM, Tom Herbert  wrote:
> On Fri, May 6, 2016 at 1:34 PM, Alexander Duyck
>  wrote:
>> On Fri, May 6, 2016 at 1:04 PM, Tom Herbert  wrote:
>>> This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
>>> SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
>>> NETIF_F_GSO_IPXIP6. These are used to describe an IP in IP
>>> tunnel and what the outer protocol is. The inner protocol
>>> can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
>>> SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
>>> are removed (these are both instances of SKB_GSO_IPXIP4).
>>> SKB_GSO_IPXIP6 will be used when support for GSO with IP
>>> encapsulation over IPv6 is added.
>>>
>>> Signed-off-by: Tom Herbert 
>>> ---
>>>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  5 ++---
>>>  drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 ++--
>>>  drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +--
>>>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  3 +--
>>>  drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  3 +--
>>>  drivers/net/ethernet/intel/i40evf/i40evf_main.c   |  3 +--
>>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  3 +--
>>>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
>>>  include/linux/netdev_features.h   | 12 ++--
>>>  include/linux/netdevice.h |  4 ++--
>>>  include/linux/skbuff.h|  4 ++--
>>>  net/core/ethtool.c|  4 ++--
>>>  net/ipv4/af_inet.c|  2 +-
>>>  net/ipv4/ipip.c   |  2 +-
>>>  net/ipv6/ip6_offload.c|  4 ++--
>>>  net/ipv6/sit.c|  4 ++--
>>>  net/netfilter/ipvs/ip_vs_xmit.c   | 11 ++-
>>>  17 files changed, 30 insertions(+), 44 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
>>> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>>> index d465bd7..0a5b770 100644
>>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>>> @@ -13259,12 +13259,11 @@ static int bnx2x_init_dev(struct bnx2x *bp, 
>>> struct pci_dev *pdev,
>>> NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
>>> if (!chip_is_e1x) {
>>> dev->hw_features |= NETIF_F_GSO_GRE | 
>>> NETIF_F_GSO_UDP_TUNNEL |
>>> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT;
>>> +   NETIF_F_GSO_IPXIP4;
>>> dev->hw_enc_features =
>>> NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
>>> NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
>>> -   NETIF_F_GSO_IPIP |
>>> -   NETIF_F_GSO_SIT |
>>> +   NETIF_F_GSO_IPXIP4 |
>>> NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL;
>>> }
>>>
>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
>>> b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> index fd85b6d..e449228 100644
>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> @@ -6218,7 +6218,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
>>> struct pci_device_id *ent)
>>> dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG 
>>> |
>>>NETIF_F_TSO | NETIF_F_TSO6 |
>>>NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
>>> -  NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
>>> +  NETIF_F_GSO_IPXIP4 |
>>>NETIF_F_GSO_UDP_TUNNEL_CSUM | 
>>> NETIF_F_GSO_GRE_CSUM |
>>>NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
>>>NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO;
>>> @@ -6228,7 +6228,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
>>> struct pci_device_id *ent)
>>> NETIF_F_TSO | NETIF_F_TSO6 |
>>> NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
>>> NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
>>> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
>>> +   NETIF_F_GSO_IPXIP4;
>>> NETIF_F_GSO_PARTIAL;
>>> dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
>>> NETIF_F_GSO_GRE_CSUM;
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
>>> b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> index f6da6b7..c2a4c10 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> @@ -9131,8 +9131,7 @@ static int 

Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions

2016-05-06 Thread David B. Robins

On 2016-05-04 03:58, Dean Jenkins wrote:

On 04/05/16 01:28, David B. Robins wrote:




Here is the code snippet from the patch with my annotations between #
#, I will try to explain my intentions. Feel free to point out any
flaws:

if (rx->remaining && (rx->remaining + sizeof(u32) <= skb->len)) {
# Only runs when rx->remaining != 0 and the end of the Ethernet
frame + next 32-bit header word is within the URB buffer. #
# Therefore, this code does not run when the end of an
Ethernet frame has been reached in the previous URB #
# or when the end of the Ethernet frame + next 32-bit header
word will be in a later URB buffer #
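
The three cases in the annotations can be checked with a small standalone
sketch of the guard alone. The helper name sync_check_runs is made up for
illustration, and size_t/uint32_t stand in for the kernel types; nothing
here is taken from the driver beyond the condition quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: mirrors only the guard quoted above.
 * remaining: bytes of the current frame still expected (rx->remaining)
 * len:       length of the buffer just received (skb->len)
 */
static bool sync_check_runs(size_t remaining, size_t len)
{
    /* Keep the addition below from wrapping around. */
    assert(remaining <= SIZE_MAX - sizeof(uint32_t));

    return remaining != 0 && remaining + sizeof(uint32_t) <= len;
}
```

For example, remaining 988 with a 2048-byte buffer takes the branch, while
remaining 0 (frame ended in the previous URB) or remaining 1514 with a
1000-byte buffer (frame end is in a later URB) does not.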


It may well be. I don't have the setup with me now, but I can try 
tomorrow to reproduce an environment where I can add some more 
detailed logging.


Since the URB length has to be >= the remaining data plus a u32, 
the devices that John Stultz and I are using (AX88772B in my case) may 
be adding some additional data/padding after an Ethernet frame, 
expecting it to be discarded, and running into this check and its 
consequences. This may mean the device is badly behaved, if it is 
specified not to send anything extra; in any case, a well-intentioned 
error correction has gone badly, but I better understand the intent 
now. I am curious to know how often the device you are using benefits 
from this block of code.


The issue is that the driver should be robust to cope with missing
URBs. Whilst testing with D-Link DUB-E100 C1 AX88772 USB to Ethernet
adaptor in our ARM embedded system which runs in hostile environments,
it was noticed that URBs could be lost (probably due to a bug
elsewhere or low memory issue). Without this patch, a missing URB
causes bad Ethernet frames to be passed up to the IP stack because
rx->remaining spans multiple URBs.

In the good case of an Ethernet frame spanning 2 URBs, the 1st URB is
processed and copies the 1st part of the Ethernet frame into the
netdev buffer, for the 2nd URB the remaining part of the Ethernet
frame is copied into the same netdev buffer to complete the Ethernet
frame. The netdev buffer is then sent up to the IP stack.

In the case of a missing URB, a bad Ethernet frame is created as 
follows:

The 1st URB is processed and copies the 1st part of the Ethernet frame
into the netdev buffer, the 2nd URB is lost (somehow),  the 3rd URB is
processed and blindly copies what it thinks is the remaining part of
the Ethernet frame in the same netdev buffer which corrupts the
Ethernet frame. The netdev buffer is then sent up to the IP stack. The
3rd URB and subsequent URBs are processed but synchronisation has been
lost so can misread data as a 32-bit header word. It is likely that
some good Ethernet frames get discarded whilst trying to
resynchronise.

A recovery strategy for regaining lock with the 32-bit header word is
necessary otherwise the driver will have difficulty in recovering from
a lost URB.
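
As an illustration of what "regaining lock" has to test, here is a minimal
sketch of a header word sanity check. It assumes the layout I believe the
driver uses, which is not spelled out in this thread: the low 11 bits carry
the frame length and bits 16..26 carry its bitwise complement. The function
names and LEN_MASK are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LEN_MASK 0x7ffu  /* assumed: 11-bit length field */

/* Assumed layout: bits 0..10 = length, bits 16..26 = ~length. */
static bool header_looks_valid(uint32_t header)
{
    uint32_t size = header & LEN_MASK;
    uint32_t check = (~header >> 16) & LEN_MASK;

    return size == check;
}

/* Build a header word for a frame of len bytes under the same assumption. */
static uint32_t make_header(uint32_t len)
{
    assert(len <= LEN_MASK);

    return len | ((~len & LEN_MASK) << 16);
}
```

Under that assumption a word built for a 1514-byte frame passes, while
arbitrary payload bytes read as a header, such as the 0xecf95000 and
0xf05b7c5f values reported in the logs, fail the check.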

In the "olden days", the 32-bit header word was always at the start of
the URB buffer so previous URBs did not influence the current URB. So
no recovery strategy was needed at that time. But now we have to
remember what happened in the previous URB and a lost URB can cause a
discontinuity in the data stream because the data is not always
aligned to the start of the URB buffer.

I agree that your environment may never suffer from lost URBs so
removal of the patch would work OK.

I will try to find some time to setup a test environment.


I got the chance to add some more logging (and add back the header 
synchronization code) to a system which is receiving video data on an 
8-port POE device using AX88772Bs - it replaces the "Data header 
synchronization was lost" log and logs a few more local items ("u32" = 
rx->header):


[3105478.073908] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() SYNC_LOSS 
remain 988 len 2048 offset 992 u32 0xecf95000
[3105478.073918] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() Bad Header 
Length 0xf05b7c5f, offset 4
[3105478.074022] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() skb->len = 
2048
[3105478.074032] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() Bad Header 
Length 0xffaae2a5, offset 4
[3105478.074146] asix 1-1.4.1.4:1.0 poe6: asix_rx_fixup() skb->len = 
1518
[3105478.074280] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() skb->len = 
2048
[3105478.074291] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() Bad Header 
Length 0xe1478400, offset 4

[3105478.074399] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() skb->len = 916
[3105478.074409] asix 1-1.4.2.2:1.0 poe7: asix_rx_fixup() Bad Header 
Length 0x9fa84eb5, offset 4
[3105478.074657] asix 1-1.4.1.4:1.0 poe6: asix_rx_fixup() skb->len = 
1518


Grepping for just the SYNC_LOSS messages shows a similar pattern to that 
seen elsewhere in the thread:


[3104761.056033] asix 1-1.4.1.4:1.0 poe6: asix_rx_fixup() SYNC_LOSS 
remain 988 len 2048 offset 992 u32 0xecf95000
[3104761.058035] asix 1-1.4.1.3:1.0 poe2: asix_rx_fixup() SYNC_LOSS 
remain 988 len 2048 offset 992 u32 

Re: [PATCH v2 next-next 02/12] net: define gso types for IPx over IPv4 and IPv6

2016-05-06 Thread Tom Herbert
On Fri, May 6, 2016 at 1:34 PM, Alexander Duyck
 wrote:
> On Fri, May 6, 2016 at 1:04 PM, Tom Herbert  wrote:
>> This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
>> SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
>> NETIF_F_GSO_IPXIP6. These are used to describe an IP in IP
>> tunnel and what the outer protocol is. The inner protocol
>> can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
>> SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
>> are removed (these are both instances of SKB_GSO_IPXIP4).
>> SKB_GSO_IPXIP6 will be used when support for GSO with IP
>> encapsulation over IPv6 is added.
>>
>> Signed-off-by: Tom Herbert 
>> ---
>>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  5 ++---
>>  drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 ++--
>>  drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +--
>>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  3 +--
>>  drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  3 +--
>>  drivers/net/ethernet/intel/i40evf/i40evf_main.c   |  3 +--
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  3 +--
>>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
>>  include/linux/netdev_features.h   | 12 ++--
>>  include/linux/netdevice.h |  4 ++--
>>  include/linux/skbuff.h|  4 ++--
>>  net/core/ethtool.c|  4 ++--
>>  net/ipv4/af_inet.c|  2 +-
>>  net/ipv4/ipip.c   |  2 +-
>>  net/ipv6/ip6_offload.c|  4 ++--
>>  net/ipv6/sit.c|  4 ++--
>>  net/netfilter/ipvs/ip_vs_xmit.c   | 11 ++-
>>  17 files changed, 30 insertions(+), 44 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
>> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> index d465bd7..0a5b770 100644
>> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
>> @@ -13259,12 +13259,11 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct 
>> pci_dev *pdev,
>> NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
>> if (!chip_is_e1x) {
>> dev->hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL 
>> |
>> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT;
>> +   NETIF_F_GSO_IPXIP4;
>> dev->hw_enc_features =
>> NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
>> NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
>> -   NETIF_F_GSO_IPIP |
>> -   NETIF_F_GSO_SIT |
>> +   NETIF_F_GSO_IPXIP4 |
>> NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL;
>> }
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
>> b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> index fd85b6d..e449228 100644
>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>> @@ -6218,7 +6218,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
>> struct pci_device_id *ent)
>> dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
>>NETIF_F_TSO | NETIF_F_TSO6 |
>>NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
>> -  NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
>> +  NETIF_F_GSO_IPXIP4 |
>>NETIF_F_GSO_UDP_TUNNEL_CSUM | 
>> NETIF_F_GSO_GRE_CSUM |
>>NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
>>NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO;
>> @@ -6228,7 +6228,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
>> struct pci_device_id *ent)
>> NETIF_F_TSO | NETIF_F_TSO6 |
>> NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
>> NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
>> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
>> +   NETIF_F_GSO_IPXIP4;
>> NETIF_F_GSO_PARTIAL;
>> dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
>> NETIF_F_GSO_GRE_CSUM;
>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
>> b/drivers/net/ethernet/intel/i40e/i40e_main.c
>> index f6da6b7..c2a4c10 100644
>> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
>> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
>> @@ -9131,8 +9131,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
>>NETIF_F_TSO6 |
>>NETIF_F_GSO_GRE  |
>>   

Re: [PATCH] netdev: enc28j60 kernel panic fix.

2016-05-06 Thread Francois Romieu
(please don't top post)

David Russell  :
> I kind of thought my patch was at best incomplete.  When you state
> this change silences the bug but does not fix it, what are the
> implications of systems running this patch?  We have some production
> systems using this patch.  They reboot daily, but have been solid.

If my assumption is right it should drop an extra packet here and there.
No leak.

However transmit errors + transmit packets should still match the number
of times the driver calls enc28j60_send_packet (you would have to cook
your own stat to check the latter though).
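
A home-cooked stat for that check could look roughly like the sketch below.
The struct and its send_calls field are hypothetical; only the idea that
packets plus errors should equal the number of send calls comes from the
paragraph above, and a packet still in flight would have to be allowed for
when sampling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping, sampled while no transmit is in flight. */
struct tx_audit {
    unsigned long send_calls;  /* bumped on each enc28j60_send_packet() */
    unsigned long tx_packets;  /* transmissions reported as done */
    unsigned long tx_errors;   /* transmissions reported as failed */
};

/* True if every send call is accounted for as a packet or an error. */
static bool tx_audit_balanced(const struct tx_audit *a)
{
    assert(a != NULL);

    return a->tx_packets + a->tx_errors == a->send_calls;
}
```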

> In addition, if we sent you a pi and the ethernet controller and a
> small but reasonable sum of money for your labor, would you be able to
> properly fix it ?

I'd rather see you testing my crap. :o)

Is the Pi multi-core ? (The expected race needs several cores, or a
netconsole style transmit from an irq/bh context.)

> Short of that, do you have any recommendations on quick overviews of
> the networking stack in the kernel and then documentation on the
> various flags and such?

A tad bit too high-level a question... Plain ctags + printk for a start ?

Does the patch below make a difference ?

Takes longer to crash counts as a difference.

diff --git a/drivers/net/ethernet/microchip/enc28j60.c b/drivers/net/ethernet/microchip/enc28j60.c
index 7066954..405fe3f 100644
--- a/drivers/net/ethernet/microchip/enc28j60.c
+++ b/drivers/net/ethernet/microchip/enc28j60.c
@@ -1170,7 +1170,8 @@ static void enc28j60_irq_work_handler(struct work_struct *work)
enc28j60_dump_tsv(priv, "Tx Done", tsv);
}
enc28j60_tx_clear(ndev, err);
-   locked_reg_bfclr(priv, EIR, EIR_TXIF);
+   locked_reg_bfclr(priv, EIR, EIR_TXIF | EIR_TXERIF);
+   intflags &= ~EIR_TXERIF;
}
/* TX Error handler */
if ((intflags & EIR_TXERIF) != 0) {
@@ -1190,6 +1191,7 @@ static void enc28j60_irq_work_handler(struct work_struct *work)
nolock_reg_bfclr(priv, ECON1, ECON1_TXRST);
nolock_txfifo_init(priv, TXSTART_INIT, TXEND_INIT);
mutex_unlock(&priv->lock);
+   locked_reg_bfclr(priv, EIR, EIR_TXIF | EIR_TXERIF);
/* Transmit Late collision check for retransmit */
if (TSV_GETBIT(tsv, TSV_TXLATECOLLISION)) {
if (netif_msg_tx_err(priv))
@@ -1203,7 +1205,6 @@ static void enc28j60_irq_work_handler(struct work_struct *work)
enc28j60_tx_clear(ndev, true);
} else
enc28j60_tx_clear(ndev, true);
-   locked_reg_bfclr(priv, EIR, EIR_TXERIF);
}
/* RX Error handler */
if ((intflags & EIR_RXERIF) != 0) {
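
As a rough illustration of what the first hunk above is meant to achieve,
here is a standalone sketch, not driver code. The bit values for EIR_TXIF
and EIR_TXERIF are assumed from the datasheet, the function name is made
up, and *hw_eir merely models the chip's EIR register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EIR_TXIF   0x08u  /* assumed bit value */
#define EIR_TXERIF 0x02u  /* assumed bit value */

/*
 * Model of the flag handling after the first hunk: when the TX-done
 * branch runs, both bits are acknowledged in the (modelled) hardware
 * register and TXERIF is masked from the local copy.
 * Returns true if the TX error branch would still run afterwards.
 */
static bool tx_error_branch_runs(uint8_t intflags, uint8_t *hw_eir)
{
    assert(hw_eir != NULL);

    if (intflags & EIR_TXIF) {
        *hw_eir &= (uint8_t)~(EIR_TXIF | EIR_TXERIF);
        intflags &= (uint8_t)~EIR_TXERIF;
    }
    return (intflags & EIR_TXERIF) != 0;
}
```

With both flags set, only the TX-done branch runs; with TXERIF alone, the
error branch still runs as before.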


Re: [PATCH v2 next-next 02/12] net: define gso types for IPx over IPv4 and IPv6

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 1:04 PM, Tom Herbert  wrote:
> This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
> SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
> NETIF_F_GSO_IPXIP6. These are used to describe an IP in IP
> tunnel and what the outer protocol is. The inner protocol
> can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
> SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
> are removed (these are both instances of SKB_GSO_IPXIP4).
> SKB_GSO_IPXIP6 will be used when support for GSO with IP
> encapsulation over IPv6 is added.
>
> Signed-off-by: Tom Herbert 
> ---
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  5 ++---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 ++--
>  drivers/net/ethernet/intel/i40e/i40e_main.c   |  3 +--
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  3 +--
>  drivers/net/ethernet/intel/i40evf/i40e_txrx.c |  3 +--
>  drivers/net/ethernet/intel/i40evf/i40evf_main.c   |  3 +--
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  3 +--
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
>  include/linux/netdev_features.h   | 12 ++--
>  include/linux/netdevice.h |  4 ++--
>  include/linux/skbuff.h|  4 ++--
>  net/core/ethtool.c|  4 ++--
>  net/ipv4/af_inet.c|  2 +-
>  net/ipv4/ipip.c   |  2 +-
>  net/ipv6/ip6_offload.c|  4 ++--
>  net/ipv6/sit.c|  4 ++--
>  net/netfilter/ipvs/ip_vs_xmit.c   | 11 ++-
>  17 files changed, 30 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c 
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> index d465bd7..0a5b770 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
> @@ -13259,12 +13259,11 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct 
> pci_dev *pdev,
> NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
> if (!chip_is_e1x) {
> dev->hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL |
> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT;
> +   NETIF_F_GSO_IPXIP4;
> dev->hw_enc_features =
> NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
> NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
> -   NETIF_F_GSO_IPIP |
> -   NETIF_F_GSO_SIT |
> +   NETIF_F_GSO_IPXIP4 |
> NETIF_F_GSO_GRE | NETIF_F_GSO_UDP_TUNNEL;
> }
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
> b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index fd85b6d..e449228 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -6218,7 +6218,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
> struct pci_device_id *ent)
> dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
>NETIF_F_TSO | NETIF_F_TSO6 |
>NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
> -  NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
> +  NETIF_F_GSO_IPXIP4 |
>NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM 
> |
>NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
>NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO;
> @@ -6228,7 +6228,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
> struct pci_device_id *ent)
> NETIF_F_TSO | NETIF_F_TSO6 |
> NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
> NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM |
> -   NETIF_F_GSO_IPIP | NETIF_F_GSO_SIT |
> +   NETIF_F_GSO_IPXIP4;
> NETIF_F_GSO_PARTIAL;
> dev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM |
> NETIF_F_GSO_GRE_CSUM;
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
> b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index f6da6b7..c2a4c10 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -9131,8 +9131,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
>NETIF_F_TSO6 |
>NETIF_F_GSO_GRE  |
>NETIF_F_GSO_GRE_CSUM |
> -  NETIF_F_GSO_IPIP |
> -  

Re: [PATCH v2 next-next 04/12] udp: Don't set skb->encapsulation with RCO

2016-05-06 Thread Alexander Duyck
On Fri, May 6, 2016 at 1:04 PM, Tom Herbert  wrote:
> When RCO is in effect we want to ensure that the outer checksum is
> properly offloaded. Don't set skb->encapsulation in this case to
> ensure that checksum offload is later considered for hw_features
> instead of hw_enc_features.
>
> Signed-off-by: Tom Herbert 
> ---
>  net/ipv4/udp_offload.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index b556ef6..92a9222 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -94,11 +94,12 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct 
> sk_buff *skb,
> do {
> unsigned int len;
>
> -   if (remcsum)
> +   if (remcsum) {
> skb->ip_summed = CHECKSUM_NONE;
> -
> -   /* Set up inner headers if we are offloading inner checksum */
> -   if (skb->ip_summed == CHECKSUM_PARTIAL) {
> +   } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
> +   /* Set up inner headers if we are offloading inner
> +* checksum
> +*/
> skb_reset_inner_headers(skb);
> skb->encapsulation = 1;
> }

You can just drop this patch.  It doesn't actually change anything
behavior-wise.  If remcsum is set, skb->ip_summed is set to
CHECKSUM_NONE, so the CHECKSUM_PARTIAL branch is never taken and
you don't need the else.
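
To make the equivalence concrete, here is a small stand-alone sketch.
It is not kernel code: struct toy_skb, the TOY_CSUM_* values and all
function names are made up for illustration, and only the two fields
touched by the hunk are modelled.  It runs both control flows over
every combination of remcsum and checksum state and counts the cases
where they disagree.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's CHECKSUM_* states; the values are arbitrary. */
enum toy_csum {
	TOY_CSUM_NONE,
	TOY_CSUM_UNNECESSARY,
	TOY_CSUM_COMPLETE,
	TOY_CSUM_PARTIAL,
};

/* Hypothetical miniature of the two sk_buff fields the hunk touches. */
struct toy_skb {
	enum toy_csum ip_summed;
	bool encapsulation;
};

/* Control flow before the patch: two independent ifs. */
struct toy_skb flow_before_patch(struct toy_skb skb, bool remcsum)
{
	if (remcsum)
		skb.ip_summed = TOY_CSUM_NONE;

	if (skb.ip_summed == TOY_CSUM_PARTIAL)
		skb.encapsulation = true;
	return skb;
}

/* Control flow after the patch: if / else if. */
struct toy_skb flow_after_patch(struct toy_skb skb, bool remcsum)
{
	if (remcsum)
		skb.ip_summed = TOY_CSUM_NONE;
	else if (skb.ip_summed == TOY_CSUM_PARTIAL)
		skb.encapsulation = true;
	return skb;
}

/* Count the inputs on which the two versions produce different results. */
int count_mismatches(void)
{
	int mismatches = 0;

	for (int remcsum = 0; remcsum <= 1; remcsum++) {
		for (int s = TOY_CSUM_NONE; s <= TOY_CSUM_PARTIAL; s++) {
			struct toy_skb in = { (enum toy_csum)s, false };
			struct toy_skb a = flow_before_patch(in, remcsum);
			struct toy_skb b = flow_after_patch(in, remcsum);

			if (a.ip_summed != b.ip_summed ||
			    a.encapsulation != b.encapsulation)
				mismatches++;
		}
	}
	return mismatches;
}

/* Aborts via assert() if the two flows ever disagree. */
void check_flows_equivalent(void)
{
	assert(count_mismatches() == 0);
}
```

Calling check_flows_equivalent() from any main() exercises all eight
input combinations.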

- Alex


[patch net-next] mlxsw: spectrum: Fix ordering in mlxsw_sp_fini

2016-05-06 Thread Jiri Pirko
From: Jiri Pirko 

Fixes: 0f433fa0ec ("mlxsw: spectrum_buffers: Implement shared buffer configuration")
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 4758516..4a72737 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -2449,8 +2449,8 @@ static void mlxsw_sp_fini(struct mlxsw_core *mlxsw_core)
 {
struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);
 
-   mlxsw_sp_buffers_fini(mlxsw_sp);
mlxsw_sp_switchdev_fini(mlxsw_sp);
+   mlxsw_sp_buffers_fini(mlxsw_sp);
mlxsw_sp_traps_fini(mlxsw_sp);
mlxsw_sp_event_unregister(mlxsw_sp, MLXSW_TRAP_ID_PUDE);
mlxsw_sp_ports_remove(mlxsw_sp);
-- 
2.5.5



Re: [PATCH net v3 2/2] udp_offload: Set encapsulation before inner completes.

2016-05-06 Thread Tom Herbert
On Fri, May 6, 2016 at 12:34 PM, David Miller  wrote:
> From: Jarno Rajahalme 
> Date: Tue,  3 May 2016 16:10:21 -0700
>
>> UDP tunnel segmentation code relies on the inner offsets being set for
>> a UDP tunnel GSO packet, but the inner *_complete() functions will
>> set the inner offsets only if 'encapsulation' is set before calling
>> them.  Currently, udp_gro_complete() sets 'encapsulation' only after
>> the inner *_complete() functions are done.  This leaves the inner
>> offsets with invalid values after udp_gro_complete() returns, which
>> in turn makes it impossible to properly segment the packet in case
>> it needs to be forwarded.  The user would see this either as
>> invalid packets being sent or as packet loss.
>>
>> This patch fixes this by setting skb's 'encapsulation' in
>> udp_gro_complete() before calling into the inner complete functions,
>> and by making each possible UDP tunnel gro_complete() callback set the
>> inner_mac_header to the beginning of the tunnel payload.
>>
>> Signed-off-by: Jarno Rajahalme 
>> ---
>> v3: Added setting inner_mac_header from all possible callbacks to cover
>> cases where there is no inner mac header.
>
> Alex and Tom, can you please review this new version since you guys had
> so much feedback for v2?
>

I'm okay with the patch. Clarifying exactly what skb->encapsulation
means is future work.

> Thanks.
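
For readers following along, the ordering problem described in the
commit message can be sketched with a toy model.  Nothing below is a
kernel API: struct toy_gro_skb, TOY_OFFSET_UNSET and the function names
are invented for illustration.  The only behaviour taken from the
description above is that the inner complete step records the inner
offset only when 'encapsulation' is already set.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_OFFSET_UNSET (-1)	/* marker for "inner offset never written" */

/* Hypothetical miniature of the skb state that matters here. */
struct toy_gro_skb {
	bool encapsulation;
	int inner_network_offset;
};

/*
 * Stand-in for an inner *_complete() function: it records the inner
 * offset only if the packet is already marked as encapsulated.
 */
static void toy_inner_complete(struct toy_gro_skb *skb, int nhoff)
{
	if (skb->encapsulation)
		skb->inner_network_offset = nhoff;
}

/* Old ordering: the flag is set only after the inner handler ran. */
int toy_complete_flag_last(int nhoff)
{
	struct toy_gro_skb skb = { false, TOY_OFFSET_UNSET };

	toy_inner_complete(&skb, nhoff);
	skb.encapsulation = true;
	return skb.inner_network_offset;
}

/* Patched ordering: the flag is set before calling the inner handler. */
int toy_complete_flag_first(int nhoff)
{
	struct toy_gro_skb skb = { false, TOY_OFFSET_UNSET };

	skb.encapsulation = true;
	toy_inner_complete(&skb, nhoff);
	return skb.inner_network_offset;
}

/* Aborts via assert() unless only the patched ordering records the offset. */
void toy_check_ordering(void)
{
	assert(toy_complete_flag_last(42) == TOY_OFFSET_UNSET);
	assert(toy_complete_flag_first(42) == 42);
}
```

With the flag set last, the inner offset is left at its unset marker,
which corresponds to the invalid offsets the commit message describes.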


[patch net 2/2] mlxsw: spectrum: Add missing rollback in flood configuration

2016-05-06 Thread Jiri Pirko
From: Ido Schimmel 

When we fail to set the flooding configuration for the broadcast and
unregistered multicast traffic, we should revert the flooding
configuration of the unknown unicast traffic.

Fixes: 0293038e0c36 ("mlxsw: spectrum: Add support for flood control")
Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index e1c74ef..9cd6f47 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -214,7 +214,15 @@ static int __mlxsw_sp_port_flood_set(struct mlxsw_sp_port *mlxsw_sp_port,
mlxsw_reg_sftr_pack(sftr_pl, MLXSW_SP_FLOOD_TABLE_BM, idx_begin,
table_type, range, local_port, set);
err = mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sftr), sftr_pl);
+   if (err)
+   goto err_flood_bm_set;
+   else
+   goto buffer_out;
 
+err_flood_bm_set:
+   mlxsw_reg_sftr_pack(sftr_pl, MLXSW_SP_FLOOD_TABLE_UC, idx_begin,
+   table_type, range, local_port, !set);
+   mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sftr), sftr_pl);
 buffer_out:
kfree(sftr_pl);
return err;
-- 
2.5.5
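
The rollback pattern used by this patch can be sketched outside the
driver as follows.  This is only a model under stated assumptions:
struct toy_port, the TOY_TABLE_* names, the fail_bm_write test hook and
toy_table_write() are hypothetical and do not correspond to mlxsw
register accessors.  As in the patch, the undo step writes the inverse
value, which assumes the table held !set before the call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flood tables: unknown unicast, and broadcast/multicast. */
enum toy_table { TOY_TABLE_UC, TOY_TABLE_BM, TOY_TABLE_COUNT };

struct toy_port {
	bool flood[TOY_TABLE_COUNT];
	bool fail_bm_write;	/* test hook: make the second write fail */
};

/* Stand-in for a register write that can fail. */
static int toy_table_write(struct toy_port *port, enum toy_table table,
			   bool set)
{
	if (table == TOY_TABLE_BM && port->fail_bm_write)
		return -1;
	port->flood[table] = set;
	return 0;
}

/* Configure both tables; if the second write fails, revert the first. */
int toy_flood_set(struct toy_port *port, bool set)
{
	int err;

	err = toy_table_write(port, TOY_TABLE_UC, set);
	if (err)
		return err;

	err = toy_table_write(port, TOY_TABLE_BM, set);
	if (err)
		goto err_flood_bm_set;

	return 0;

err_flood_bm_set:
	/* Undo step one by writing the opposite value (assumes it was !set). */
	toy_table_write(port, TOY_TABLE_UC, !set);
	return err;
}

/* Aborts via assert() if a failed second write leaves step one applied. */
void toy_check_rollback(void)
{
	struct toy_port port = { { false, false }, true };

	assert(toy_flood_set(&port, true) != 0);
	assert(!port.flood[TOY_TABLE_UC]);
	assert(!port.flood[TOY_TABLE_BM]);
}
```

Without the err_flood_bm_set block, the failing call would return an
error while leaving the unicast table changed.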



[patch net 0/2] mlxsw: Couple of fixes

2016-05-06 Thread Jiri Pirko
From: Jiri Pirko 

Ido Schimmel (2):
  mlxsw: spectrum: Fix rollback order in LAG join failure
  mlxsw: spectrum: Add missing rollback in flood configuration

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c   | 4 ++--
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 8 
 2 files changed, 10 insertions(+), 2 deletions(-)

-- 
2.5.5



[patch net 1/2] mlxsw: spectrum: Fix rollback order in LAG join failure

2016-05-06 Thread Jiri Pirko
From: Ido Schimmel 

Make the leave procedure in the error path symmetric to the join
procedure and first remove the port from the collector before
potentially destroying the LAG.

Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 4afbc3e..668b2f4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -2541,11 +2541,11 @@ static int mlxsw_sp_port_lag_join(struct mlxsw_sp_port *mlxsw_sp_port,
lag->ref_count++;
return 0;
 
+err_col_port_enable:
+   mlxsw_sp_lag_col_port_remove(mlxsw_sp_port, lag_id);
 err_col_port_add:
if (!lag->ref_count)
mlxsw_sp_lag_destroy(mlxsw_sp, lag_id);
-err_col_port_enable:
-   mlxsw_sp_lag_col_port_remove(mlxsw_sp_port, lag_id);
return err;
 }
 
-- 
2.5.5
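
The label ordering this patch restores is the usual reverse-order
unwind, sketched below with a toy model.  The names struct toy_lag,
enum toy_fail and toy_lag_join() are hypothetical, and the assumption
that the join path creates the LAG for its first member is inferred
from the ref_count check in the error path, not taken from the driver
source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the LAG state touched by join. */
struct toy_lag {
	bool created;
	bool port_in_collector;
	int ref_count;
};

/* Test hook selecting which join step should fail. */
enum toy_fail { TOY_FAIL_NONE, TOY_FAIL_COL_ADD, TOY_FAIL_COL_ENABLE };

int toy_lag_join(struct toy_lag *lag, enum toy_fail fail)
{
	int err;

	/* Step 1 (assumed): create the LAG when the first port joins. */
	if (!lag->ref_count)
		lag->created = true;

	/* Step 2: add the port to the collector. */
	if (fail == TOY_FAIL_COL_ADD) {
		err = -1;
		goto err_col_port_add;
	}
	lag->port_in_collector = true;

	/* Step 3: enable collection on the port. */
	if (fail == TOY_FAIL_COL_ENABLE) {
		err = -2;
		goto err_col_port_enable;
	}

	lag->ref_count++;
	return 0;

	/* Labels mirror the steps in reverse and fall through. */
err_col_port_enable:
	lag->port_in_collector = false;		/* undo step 2 */
err_col_port_add:
	if (!lag->ref_count)
		lag->created = false;		/* undo step 1 */
	return err;
}

/* Aborts via assert() if a late failure leaves any join state behind. */
void toy_check_unwind(void)
{
	struct toy_lag lag = { false, false, 0 };

	assert(toy_lag_join(&lag, TOY_FAIL_COL_ENABLE) != 0);
	assert(!lag.port_in_collector);
	assert(!lag.created);
	assert(lag.ref_count == 0);
}
```

With the two labels swapped, as before the patch, a failure in the add
step would fall through and try to remove a port that was never added,
while a failure in the enable step would skip destroying the LAG.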



Re: [patch net 0/3] mlxsw: Couple of fixes

2016-05-06 Thread David Miller
From: Jiri Pirko 
Date: Fri, 6 May 2016 22:13:49 +0200

> Fri, May 06, 2016 at 10:00:32PM CEST, da...@davemloft.net wrote:
>>From: Jiri Pirko 
>>Date: Fri,  6 May 2016 11:17:21 +0200
>>
>>> From: Jiri Pirko 
>>> 
>>> Ido Schimmel (2):
>>>   mlxsw: spectrum: Fix rollback order in LAG join failure
>>>   mlxsw: spectrum: Add missing rollback in flood configuration
>>> 
>>> Jiri Pirko (1):
>>>   mlxsw: spectrum: Fix ordering in mlxsw_sp_fini
>>
>>What tree is this for?  Because on 'net' this makes the build fail.
>>
>>drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function ‘mlxsw_sp_fini’:
>>drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2162:2: error: implicit 
>>declaration of function ‘mlxsw_sp_buffers_fini’ 
>>[-Werror=implicit-function-declaration]
>>
> 
> Oh. Ido's patches are for -net, mine is for -net-next. I didn't realize,
> sorry. Do you want me to repost?

Yes, please post them properly.

Thanks.


Re: [patch net 0/3] mlxsw: Couple of fixes

2016-05-06 Thread Jiri Pirko
Fri, May 06, 2016 at 10:00:32PM CEST, da...@davemloft.net wrote:
>From: Jiri Pirko 
>Date: Fri,  6 May 2016 11:17:21 +0200
>
>> From: Jiri Pirko 
>> 
>> Ido Schimmel (2):
>>   mlxsw: spectrum: Fix rollback order in LAG join failure
>>   mlxsw: spectrum: Add missing rollback in flood configuration
>> 
>> Jiri Pirko (1):
>>   mlxsw: spectrum: Fix ordering in mlxsw_sp_fini
>
>What tree is this for?  Because on 'net' this makes the build fail.
>
>drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function ‘mlxsw_sp_fini’:
>drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2162:2: error: implicit 
>declaration of function ‘mlxsw_sp_buffers_fini’ 
>[-Werror=implicit-function-declaration]
>

Oh. Ido's patches are for -net, mine is for -net-next. I didn't realize,
sorry. Do you want me to repost?


Re: [PATCH 2/3] net/mlx5e: make VXLAN support conditional

2016-05-06 Thread Alexander Duyck
On Thu, May 5, 2016 at 11:09 AM, Arnd Bergmann  wrote:
> On Thursday 05 May 2016 19:44:36 Saeed Mahameed wrote:
>> On Wed, May 4, 2016 at 3:31 PM, Arnd Bergmann  wrote:
>> > VXLAN can be disabled at compile-time or it can be a loadable
>> > module while mlx5 is built-in, which leads to a link error:
>> >
>> > drivers/net/built-in.o: In function `mlx5e_create_netdev':
>> > ntb_netdev.c:(.text+0x106de4): undefined reference to `vxlan_get_rx_port'
>> >
>> > This avoids the link error and makes the vxlan code optional,
>> > like the other ethernet drivers do as well.
>> >
>> > Signed-off-by: Arnd Bergmann 
>> > Link: https://patchwork.ozlabs.org/patch/589296/
>> > Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
>> > ---
>> > I sent it originally on Feb 26 2016, but misread Saeed Mahameed's
>> > reply as saying that he'd fix it up himself. The new version
>> > should address the original comment.
>> > ---
>>
>> Hi Arnd,
>>
>> I didn't post a fix up since it is not needed anymore, see
>> b7aade15485a ('vxlan: break dependency with netdev drivers') in
>> net-next.
>>
>> The new issue was introduced by "net/mlx5: Kconfig: Fix
>> MLX5_EN/VXLAN build issue", which was merged from the net tree.
>>
>> Dave shouldn't have merged it into net-next, I explicitly asked him
>> that in the cover letter.  Maybe he missed it.
>>
>> I just checked, and it is sufficient to take only the revert patch:
>> [PATCH 1/3] Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
>> to net-next.
>>
>> Can you please confirm that with only the revert patch, you don't see
>> the issue?
>
> Yes, it works, but now it is different from all the other drivers
> (MLX4, BENET, IXGBE, I40E, FM10K, QLCNIC, and QEDE). If the 'select
> VXLAN' is not the preferred way to handle this, we should change
> the other ones the same way, right?
>
> For reference, I've tried it out on the MLX4 driver, and it does
> seem nicer that way, see below.
>
> Arnd
> ---
> Signed-off-by: Arnd Bergmann 
> in case someone wants to pick up that patch and do the other
> ones as well.
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/Kconfig 
> b/drivers/net/ethernet/mellanox/mlx4/Kconfig
> index 9ca3734ebb6b..88fff4484200 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/Kconfig
> +++ b/drivers/net/ethernet/mellanox/mlx4/Kconfig
> @@ -6,6 +6,7 @@ config MLX4_EN
> tristate "Mellanox Technologies 1/10/40Gbit Ethernet support"
> depends on MAY_USE_DEVLINK
> depends on PCI
> +   select VXLAN
> select MLX4_CORE
> select PTP_1588_CLOCK
> ---help---

This piece is unnecessary and unwanted.  We just recently added the
ability to load the modules without the need for VXLAN.  Let's not take
that in the wrong direction by having the drivers select a module they
don't have to have.

The rest of this code is probably fine.  Now that the dependency has
been broken via b7aade15485a ('vxlan: break dependency with netdev
drivers'), you could probably go through and pull the VXLAN ifdefs
from all the drivers.  I don't think anything explicitly relies on
that module anymore: the only export still hanging around is
vxlan_dev_create, and I don't think any Ethernet drivers are directly
spawning VXLAN interfaces.

> @@ -24,13 +25,6 @@ config MLX4_EN_DCB
>
>   If unsure, set to Y
>
> -config MLX4_EN_VXLAN
> -   bool "VXLAN offloads Support"
> -   default y
> -   depends on MLX4_EN && VXLAN && !(MLX4_EN=y && VXLAN=m)
> -   ---help---
> - Say Y here if you want to use VXLAN offloads in the driver.
> -
>  config MLX4_CORE
> tristate
> depends on PCI
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
> b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index 6f28ac58251c..ad887c425f2d 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -1691,10 +1691,8 @@ int mlx4_en_start_port(struct net_device *dev)
> /* Schedule multicast task to populate multicast list */
> queue_work(mdev->workqueue, &priv->rx_mode_task);
>
> -#ifdef CONFIG_MLX4_EN_VXLAN
> if (priv->mdev->dev->caps.tunnel_offload_mode == 
> MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
> vxlan_get_rx_port(dev);
> -#endif
> priv->port_up = true;
> netif_tx_start_all_queues(dev);
> netif_device_attach(dev);
> @@ -2337,7 +2335,6 @@ static int mlx4_en_get_phys_port_id(struct net_device 
> *dev,
> return 0;
>  }
>
> -#ifdef CONFIG_MLX4_EN_VXLAN
>  static void mlx4_en_add_vxlan_offloads(struct work_struct *work)
>  {
> int ret;
> @@ -2448,7 +2445,6 @@ static netdev_features_t mlx4_en_features_check(struct 
> sk_buff *skb,
>
> return features;
>  }
> -#endif
>
>  static int mlx4_en_set_tx_maxrate(struct net_device *dev, int queue_index, 
> u32 maxrate)
>  {
> @@ -2501,11 +2497,9 @@ static 
