Re: [PATCH 0/6] Constant Time Memory Comparisons Are Important

2017-06-11 Thread Emmanuel Grumbach
On Mon, Jun 12, 2017 at 12:30 AM, Emil Lenngren <emil.lenng...@gmail.com> wrote:
> 2017-06-11 22:48 GMT+02:00 Emmanuel Grumbach <egrumb...@gmail.com>:
>> On Sun, Jun 11, 2017 at 4:36 PM, Kees Cook <keesc...@chromium.org> wrote:
>>>
>>> On Sun, Jun 11, 2017 at 1:13 AM, Kalle Valo <kv...@codeaurora.org> wrote:
>>> > "Jason A. Donenfeld" <ja...@zx2c4.com> writes:
>>> >
>>> >> Whenever you're comparing two MACs, it's important to do this using
>>> >> crypto_memneq instead of memcmp. With memcmp, you leak timing 
>>> >> information,
>>> >> which could then be used to iteratively forge a MAC.
>>> >
>>> > Do you have any pointers where I could learn more about this?
>>>
>>> While not using C specifically, this talks about the problem generally:
>>> https://www.chosenplaintext.ca/articles/beginners-guide-constant-time-cryptography.html
>>>
>>
>> Sorry for the stupid question, but the MAC address is in plaintext in
>> the air anyway or easily accessible via user space tools. I fail to
>> see what it is so secret about a MAC address in that code where that
>> same MAC address is accessible via myriads of ways.
>
> I think you're mixing up Media Access Control (MAC) addresses with
> Message Authentication Code (MAC). The second one is a cryptographic
> signature of a message.

Obviously... Sorry for the noise.


Re: [PATCH 0/6] Constant Time Memory Comparisons Are Important

2017-06-11 Thread Emmanuel Grumbach
On Sun, Jun 11, 2017 at 4:36 PM, Kees Cook  wrote:
>
> On Sun, Jun 11, 2017 at 1:13 AM, Kalle Valo  wrote:
> > "Jason A. Donenfeld"  writes:
> >
> >> Whenever you're comparing two MACs, it's important to do this using
> >> crypto_memneq instead of memcmp. With memcmp, you leak timing information,
> >> which could then be used to iteratively forge a MAC.
> >
> > Do you have any pointers where I could learn more about this?
>
> While not using C specifically, this talks about the problem generally:
> https://www.chosenplaintext.ca/articles/beginners-guide-constant-time-cryptography.html
>

Sorry for the stupid question, but the MAC address is in plaintext in
the air anyway or easily accessible via user space tools. I fail to
see what it is so secret about a MAC address in that code where that
same MAC address is accessible via myriads of ways.


Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded

2016-07-11 Thread Emmanuel Grumbach
On Mon, Jul 11, 2016 at 6:18 PM, Prarit Bhargava <pra...@redhat.com> wrote:
>
> Didn't get any feedback or review comments on this patch.  Resending ...
>
> P.

This change is obviously completely broken. It simply disables the
registration to thermal zone core.

>
> ---8<---
>
> The iwlwifi driver implements a thermal zone and hwmon device, but
> returns -EIO on temperature reads if the firmware isn't loaded.  This
> results in the error
>
> iwlwifi-virtual-0
> Adapter: Virtual device
> ERROR: Can't get value of subfeature temp1_input: I/O error
> temp1:N/A
>
> being output when using sensors from the lm-sensors package.  Since
> the temperature cannot be read unless the ucode is loaded there is no
> reason to add the interface only to have it return an error 100% of
> the time.
>
> This patch moves the firmware check to iwl_mvm_thermal_zone_register() and
> stops the thermal zone from being created if the ucode hasn't been loaded.
>
> Signed-off-by: Prarit Bhargava <pra...@redhat.com>
> Cc: Johannes Berg <johannes.b...@intel.com>
> Cc: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
> Cc: Luca Coelho <luciano.coe...@intel.com>
> Cc: Intel Linux Wireless <linuxw...@intel.com>
> Cc: Kalle Valo <kv...@codeaurora.org>
> Cc: Chaya Rachel Ivgi <chaya.rachel.i...@intel.com>
> Cc: Sara Sharon <sara.sha...@intel.com>
> Cc: linux-wirel...@vger.kernel.org
> Cc: netdev@vger.kernel.org
> ---
>  drivers/net/wireless/intel/iwlwifi/mvm/tt.c |   13 +++--
>  1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c 
> b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> index 58fc7b3c711c..64802659711f 100644
> --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
> @@ -634,11 +634,6 @@ static int iwl_mvm_tzone_get_temp(struct 
> thermal_zone_device *device,
>
> mutex_lock(>mutex);
>
> -   if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
> -   ret = -EIO;
> -   goto out;
> -   }
> -
> ret = iwl_mvm_get_temp(mvm, );
> if (ret)
> goto out;
> @@ -684,11 +679,6 @@ static int iwl_mvm_tzone_set_trip_temp(struct 
> thermal_zone_device *device,
>
> mutex_lock(>mutex);
>
> -   if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) {
> -   ret = -EIO;
> -   goto out;
> -   }
> -
> if (trip < 0 || trip >= IWL_MAX_DTS_TRIPS) {
> ret = -EINVAL;
> goto out;
> @@ -750,6 +740,9 @@ static void iwl_mvm_thermal_zone_register(struct iwl_mvm 
> *mvm)
> return;
> }
>
> +   if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR))
> +   return;
> +
> BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH);
>
> mvm->tz_device.tzone = thermal_zone_device_register(name,
> --
> 1.7.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT] Networking

2016-05-17 Thread Emmanuel Grumbach
On Wed, May 18, 2016 at 4:00 AM, Linus Torvalds
 wrote:
> On Tue, May 17, 2016 at 12:11 PM, David Miller  wrote:
>>
>> Highlights:
>
> Lowlights:
>
>  1) the iwlwifi driver seems to be broken
>
> My laptop that uses the intel 7680 iwlwifi module no longer connects
> to the network. It fails with a "Microcode SW error detected." and
> spews out register state over and over again.

Can we have the register state and the ASSERT / NMI / whatever that
goes along with it?
This clearly means that the firmware is crashing, but I don't know why,
I copied here the lines that I need from another bug with another
device with another firmware,
but the log that we will still explain what I need:

[  800.880402] iwlwifi :02:00.0: Start IWL Error Log Dump:
[  800.880406] iwlwifi :02:00.0: Status: 0x, count: 6
[  800.880409] iwlwifi :02:00.0: Loaded firmware version: 21.311951.0
[  800.880413] iwlwifi :02:00.0: 0x0394 | ADVANCED_SYSASSERT
[  800.880416] iwlwifi :02:00.0: 0x0220 | trm_hw_status0
[  800.880419] iwlwifi :02:00.0: 0x | trm_hw_status1
[  800.880422] iwlwifi :02:00.0: 0x0BD8 | branchlink2
[  800.880425] iwlwifi :02:00.0: 0x00026AC4 | interruptlink1
[  800.880428] iwlwifi :02:00.0: 0x | interruptlink2
[  800.880431] iwlwifi :02:00.0: 0x0001 | data1
[  800.880434] iwlwifi :02:00.0: 0x02039845 | data2
[  800.880437] iwlwifi :02:00.0: 0x0056 | data3
[  800.880440] iwlwifi :02:00.0: 0x8E4184A7 | beacon time
[  800.880443] iwlwifi :02:00.0: 0x30E2CB41 | tsf low
[  800.880446] iwlwifi :02:00.0: 0x0027 | tsf hi
[  800.880449] iwlwifi :02:00.0: 0x | time gp1
[  800.880451] iwlwifi :02:00.0: 0x2F842F8A | time gp2
[  800.880454] iwlwifi :02:00.0: 0x | uCode revision type
[  800.880457] iwlwifi :02:00.0: 0x0015 | uCode version major
[  800.880460] iwlwifi :02:00.0: 0x0004C28F | uCode version minor
[  800.880463] iwlwifi :02:00.0: 0x0201 | hw version
[  800.880466] iwlwifi :02:00.0: 0x00489008 | board version
[  800.880469] iwlwifi :02:00.0: 0x001C | hcmd
[  800.880472] iwlwifi :02:00.0: 0x24022000 | isr0
[  800.880475] iwlwifi :02:00.0: 0x0100 | isr1
[  800.880478] iwlwifi :02:00.0: 0x580A | isr2
[  800.880481] iwlwifi :02:00.0: 0x4041FCC1 | isr3
[  800.880483] iwlwifi :02:00.0: 0x | isr4
[  800.880486] iwlwifi :02:00.0: 0x00800110 | last cmd Id
[  800.880489] iwlwifi :02:00.0: 0x | wait_event
[  800.880492] iwlwifi :02:00.0: 0x02C8 | l2p_control
[  800.880495] iwlwifi :02:00.0: 0x00018030 | l2p_duration
[  800.880498] iwlwifi :02:00.0: 0x00BF | l2p_mhvalid
[  800.880501] iwlwifi :02:00.0: 0x00EF | l2p_addr_match
[  800.880503] iwlwifi :02:00.0: 0x000D | lmpm_pmg_sel
[  800.880506] iwlwifi :02:00.0: 0x30031805 | timestamp
[  800.880509] iwlwifi :02:00.0: 0xE0F0 | flow_handler




>
> The last thing it says before falling over is:
>
>   wlp1s0: authenticate with xx:xx:xx:xx:xx:xx
>   wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 1/3)
>   wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 2/3)
>
> and then it goes all titsup.
>
> I thought that it might be because I had downloaded one of the daily
> firmware versions (it calls itself iwlwifi-7260-17.ucode, but isn't a
> real release afaik - but it has worked fien for me before), but the
> problem persists with the ver-16 ucode too, so that wasn't it.
>
> I haven't bisected it, but there is absolutely nothing odd in my hardware.
>
> I do have a 802.11ac network, which apparently not everybody does,
> judging by previous bug-reports of mine..
>
> Intel iwlwifi people: please check this out.
>
>Linus
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v5] rtnetlink: add new RTM_GETSTATS message to dump link stats

2016-04-19 Thread Emmanuel Grumbach
On Tue, Apr 19, 2016 at 1:09 PM, Johannes Berg
 wrote:
> On Mon, 2016-04-18 at 23:52 -0400, David Miller wrote:
>> From: David Miller 
>> Date: Mon, 18 Apr 2016 21:48:51 -0400 (EDT)
>>
>> > I think the time has probably come to have a new netlink attribute
>> > format that doesn't have this multi-decade old problem.
>>  ...
>>
>> Scratch this whole idea, I think the padding attribute works a lot
>> better and is backwards compatible with every properly written
>> netlink application.
>
> This talk about attribute flags reminded me about something else.
>
> At netconf, we talked about how awkward it can be that one doesn't know
> if an attribute was accepted by the kernel or simply ignored because
> it's not supported (older kernel version).
>
> I considered (and Emmanuel has started to cook up a patch for it)
> adding a flag here meaning "reject if not parsed" (or so).
>
> Now, I realize that with all the existing netlink commands one can't
> actually rely on that (since it requires some infrastructure), but new
> commands in existing families and also entirely new generic netlink
> families would allow let userspace rely on such a thing.
>
> Thoughts?


FWIW:

diff --git a/include/net/netlink.h b/include/net/netlink.h
index 2a5dbcc..2dfd46d 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -663,6 +663,15 @@ static inline int nla_type(const struct nlattr *nla)
 }

 /**
+ * nla_must_parse - returns true if the attribute must be parsed
+ * @nla: netlink attribute
+ */
+static inline bool nla_must_parse(const struct nlattr *nla)
+{
+   return nla->nla_type & NLA_F_NET_MUST_PARSE;
+}
+
+/**
  * nla_data - head of payload
  * @nla: netlink attribute
  */
diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index 1a85940..e81a8d4 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -165,17 +165,21 @@ struct nlattr {

 /*
  * nla_type (16 bits)
- * +---+---+---+
- * | N | O | Attribute Type|
- * +---+---+---+
+ * +---+---+---++
+ * | N | O | M | Attribute Type |
+ * +---+---+---++
  * N := Carries nested attributes
  * O := Payload stored in network byte order
+ * M := Attribute can't be ignored
  *
  * Note: The N and O flag are mutually exclusive.
  */
 #define NLA_F_NESTED   (1 << 15)
 #define NLA_F_NET_BYTEORDER(1 << 14)
-#define NLA_TYPE_MASK  ~(NLA_F_NESTED | NLA_F_NET_BYTEORDER)
+#define NLA_F_NET_MUST_PARSE   (1 << 13)
+#define NLA_TYPE_MASK  ~(NLA_F_NESTED |\
+ NLA_F_NET_BYTEORDER | \
+ NLA_F_NET_MUST_PARSE)

 #define NLA_ALIGNTO4
 #define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
diff --git a/lib/nlattr.c b/lib/nlattr.c
index f5907d2..fa15e6f 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -198,6 +198,10 @@ int nla_parse(struct nlattr **tb, int maxtype,
const struct nlattr *head,
}

tb[type] = (struct nlattr *)nla;
+   } else if (nla_must_parse(nla)) {
+   err = -EINVAL;
+   goto errout;
}
}


Re: [PATCH] wlcore: spi: add wl18xx support

2016-03-30 Thread Emmanuel Grumbach
On Wed, Mar 30, 2016 at 3:58 PM, Eyal Reizer  wrote:
>
> From: Eyal 

Are you trying to hide or something? :)

>
> Add support for using with both wl12xx and wl18xx.
>
> - all wilink family needs special init command for entering wspi mode.
>   extra clock cycles should be sent after the spi init command while the
>   cs pin is high.
> - switch to controling the cs pin from the spi driver for achieveing the
>   above.
> - the selected cs gpio is read from the spi device-tree node using the
>   cs-gpios field and setup as a gpio.
> - See the example below for specifying the cs gpio using the cs-gpios entry
>
>{
> status = "okay";
> pinctrl-names = "default";
> pinctrl-0 = <_pins>;
> cs-gpios = < 5 0>;
> #address-cells = <1>;
> #size-cells = <0>;
> wlcore: wlcore@0 {
> compatible = "ti,wl1835";
> vwlan-supply = <_en_reg>;
> spi-max-frequency = <4800>;
> reg = <0>;  /* chip select 0 on spi0, ie spi0.0 */
> interrupt-parent = <>;
> interrupts = <27 IRQ_TYPE_EDGE_RISING>;
> };
> };
>
> Signed-off-by: Eyal Reizer 
> ---
>  drivers/net/wireless/ti/wlcore/spi.c |  176 
> ++
>  1 file changed, 157 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/net/wireless/ti/wlcore/spi.c 
> b/drivers/net/wireless/ti/wlcore/spi.c
> index 96d9c9d..6c5a369 100644
> --- a/drivers/net/wireless/ti/wlcore/spi.c
> +++ b/drivers/net/wireless/ti/wlcore/spi.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "wlcore.h"
>  #include "wl12xx_80211.h"
> @@ -70,16 +71,30 @@
>  #define WSPI_MAX_CHUNK_SIZE4092
>
>  /*
> - * only support SPI for 12xx - this code should be reworked when 18xx
> - * support is introduced
> + * wl18xx driver aggregation buffer size is (13 * PAGE_SIZE) compared to
> + * (4 * PAGE_SIZE) for wl12xx, so use the larger buffer needed for wl18xx
>   */
> -#define SPI_AGGR_BUFFER_SIZE (4 * PAGE_SIZE)
> +#define SPI_AGGR_BUFFER_SIZE (13 * PAGE_SIZE)
>
>  /* Maximum number of SPI write chunks */
>  #define WSPI_MAX_NUM_OF_CHUNKS \
> ((SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE) + 1)
>
>
> +struct wilink_familiy_data {
> +   char name[8];
> +};
> +
> +const struct wilink_familiy_data *wilink_data;
> +
> +static const struct wilink_familiy_data wl18xx_data = {
> +   .name = "wl18xx",
> +};
> +
> +static const struct wilink_familiy_data wl12xx_data = {
> +   .name = "wl12xx",
> +};
> +
>  struct wl12xx_spi_glue {
> struct device *dev;
> struct platform_device *core;
> @@ -120,6 +135,8 @@ static void wl12xx_spi_init(struct device *child)
> struct spi_transfer t;
> struct spi_message m;
> u8 *cmd = kzalloc(WSPI_INIT_CMD_LEN, GFP_KERNEL);
> +   struct spi_device *spi = (struct spi_device *)glue->dev;
> +   struct spi_master *master = spi->master;
>
> if (!cmd) {
> dev_err(child->parent,
> @@ -127,6 +144,15 @@ static void wl12xx_spi_init(struct device *child)
> return;
> }
>
> +   if (!master->cs_gpios) {
> +   dev_err(child->parent,
> +   "spi chip select pin missing in platform data!\n");
> +   return;
> +   }
> +
> +   /* Drive CS line low */
> +   gpio_direction_output(master->cs_gpios[0], 0);
> +
> memset(, 0, sizeof(t));
> spi_message_init();
>
> @@ -163,6 +189,26 @@ static void wl12xx_spi_init(struct device *child)
> spi_message_add_tail(, );
>
> spi_sync(to_spi_device(glue->dev), );
> +
> +   /* Send extra clocks with CS high. this is required by the wilink
> +* family in order for successfully enter WSPI mode
> +*/
> +   gpio_direction_output(master->cs_gpios[0], 1);
> +
> +   memset(, 0, sizeof(m));
> +   spi_message_init();
> +
> +   cmd[0] = 0xff;
> +   cmd[1] = 0xff;
> +   cmd[2] = 0xff;
> +   cmd[3] = 0xff;
> +   swab32s((u32 *)cmd);
> +
> +   t.tx_buf = cmd;
> +   t.len = 4;
> +   spi_message_add_tail(, );
> +   spi_sync(to_spi_device(glue->dev), );
> +
> kfree(cmd);
>  }
>
> @@ -213,6 +259,16 @@ static int __must_check wl12xx_spi_raw_read(struct 
> device *child, int addr,
> u32 *busy_buf;
> u32 *cmd;
> u32 chunk_len;
> +   struct spi_device *spi = (struct spi_device *)glue->dev;
> +   struct spi_master *master = spi->master;
> +
> +   if (!master->cs_gpios) {
> +   dev_err(child->parent,
> +   "spi chip select pin missing in platform data!\n");
> +   return -EINVAL;
> +   }
> +   /* Drive CS line low */
> +   gpio_direction_output(master->cs_gpios[0], 0);
>
> while (len > 0) {
> chunk_len = min_t(size_t, 

Re: [PATCH] codel: cast the output of MS2TIME to codel_time_t

2016-02-22 Thread Emmanuel Grumbach
On Mon, Feb 22, 2016 at 9:53 PM, David Miller <da...@davemloft.net> wrote:
> From: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
> Date: Mon, 22 Feb 2016 13:49:47 +0200
>
>> This will allow to pass the typecheck in the comparators:
>> codel_time_{after,before}
>>
>> Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
>
> I do not see any existing code where this would matter.  Maybe it was
> not meant to be fed directly?  And if you need to you should add the
> explicit casts yourself.

Well.. This header file is kinda special. As Eric pointed out, I
shouldn't include it since it contains huge inline functions. I only
really need small parts of it for iwlwifi and I copied those into the
driver. In iwlwifi I do need to feed MS2TIME output into the
comparators so I guess I'll do that there. I just thought it'd be nice
to feed this change back in the original implementation of the code.
If another user of the small helpers in this file comes up, it may
make sense to split this file into two header files: one that can be
included, and another one which can't. If that were to happen, I'd be
glad to have this patch in. We're not there yet.

Bottom line, I am perfectly fine if you don't apply this.

>
> I'm not applying this, sorry.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] codel: cast the output of MS2TIME to codel_time_t

2016-02-22 Thread Emmanuel Grumbach
This will allow to pass the typecheck in the comparators:
codel_time_{after,before}

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
 include/net/codel.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/codel.h b/include/net/codel.h
index 267e702..0d92e39 100644
--- a/include/net/codel.h
+++ b/include/net/codel.h
@@ -62,7 +62,7 @@
 typedef u32 codel_time_t;
 typedef s32 codel_tdiff_t;
 #define CODEL_SHIFT 10
-#define MS2TIME(a) ((a * NSEC_PER_MSEC) >> CODEL_SHIFT)
+#define MS2TIME(a) ((codel_time_t)((a * NSEC_PER_MSEC) >> CODEL_SHIFT))
 
 static inline codel_time_t codel_get_time(void)
 {
-- 
2.5.0



[PATCH] codel: add forgotten inline to functions in header file

2016-02-11 Thread Emmanuel Grumbach
Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
 include/net/codel.h | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/include/net/codel.h b/include/net/codel.h
index 267e702..0775c24 100644
--- a/include/net/codel.h
+++ b/include/net/codel.h
@@ -92,18 +92,18 @@ struct codel_skb_cb {
codel_time_t enqueue_time;
 };
 
-static struct codel_skb_cb *get_codel_cb(const struct sk_buff *skb)
+static inline struct codel_skb_cb *get_codel_cb(const struct sk_buff *skb)
 {
qdisc_cb_private_validate(skb, sizeof(struct codel_skb_cb));
return (struct codel_skb_cb *)qdisc_skb_cb(skb)->data;
 }
 
-static codel_time_t codel_get_enqueue_time(const struct sk_buff *skb)
+static inline codel_time_t codel_get_enqueue_time(const struct sk_buff *skb)
 {
return get_codel_cb(skb)->enqueue_time;
 }
 
-static void codel_set_enqueue_time(struct sk_buff *skb)
+static inline void codel_set_enqueue_time(struct sk_buff *skb)
 {
get_codel_cb(skb)->enqueue_time = codel_get_time();
 }
@@ -174,8 +174,8 @@ struct codel_stats {
 
 #define CODEL_DISABLED_THRESHOLD INT_MAX
 
-static void codel_params_init(struct codel_params *params,
- const struct Qdisc *sch)
+static inline void codel_params_init(struct codel_params *params,
+const struct Qdisc *sch)
 {
params->interval = MS2TIME(100);
params->target = MS2TIME(5);
@@ -184,12 +184,12 @@ static void codel_params_init(struct codel_params *params,
params->ecn = false;
 }
 
-static void codel_vars_init(struct codel_vars *vars)
+static inline void codel_vars_init(struct codel_vars *vars)
 {
memset(vars, 0, sizeof(*vars));
 }
 
-static void codel_stats_init(struct codel_stats *stats)
+static inline void codel_stats_init(struct codel_stats *stats)
 {
stats->maxpacket = 0;
 }
@@ -200,7 +200,7 @@ static void codel_stats_init(struct codel_stats *stats)
  *
  * Here, invsqrt is a fixed point number (< 1.0), 32bit mantissa, aka Q0.32
  */
-static void codel_Newton_step(struct codel_vars *vars)
+static inline void codel_Newton_step(struct codel_vars *vars)
 {
u32 invsqrt = ((u32)vars->rec_inv_sqrt) << REC_INV_SQRT_SHIFT;
u32 invsqrt2 = ((u64)invsqrt * invsqrt) >> 32;
@@ -217,19 +217,19 @@ static void codel_Newton_step(struct codel_vars *vars)
  * We maintain in rec_inv_sqrt the reciprocal value of sqrt(count) to avoid
  * both sqrt() and divide operation.
  */
-static codel_time_t codel_control_law(codel_time_t t,
- codel_time_t interval,
- u32 rec_inv_sqrt)
+static inline codel_time_t codel_control_law(codel_time_t t,
+codel_time_t interval,
+u32 rec_inv_sqrt)
 {
return t + reciprocal_scale(interval, rec_inv_sqrt << 
REC_INV_SQRT_SHIFT);
 }
 
-static bool codel_should_drop(const struct sk_buff *skb,
- struct Qdisc *sch,
- struct codel_vars *vars,
- struct codel_params *params,
- struct codel_stats *stats,
- codel_time_t now)
+static inline bool codel_should_drop(const struct sk_buff *skb,
+struct Qdisc *sch,
+struct codel_vars *vars,
+struct codel_params *params,
+struct codel_stats *stats,
+codel_time_t now)
 {
bool ok_to_drop;
 
@@ -265,11 +265,11 @@ static bool codel_should_drop(const struct sk_buff *skb,
 typedef struct sk_buff * (*codel_skb_dequeue_t)(struct codel_vars *vars,
struct Qdisc *sch);
 
-static struct sk_buff *codel_dequeue(struct Qdisc *sch,
-struct codel_params *params,
-struct codel_vars *vars,
-struct codel_stats *stats,
-codel_skb_dequeue_t dequeue_func)
+static inline struct sk_buff *codel_dequeue(struct Qdisc *sch,
+   struct codel_params *params,
+   struct codel_vars *vars,
+   struct codel_stats *stats,
+   codel_skb_dequeue_t dequeue_func)
 {
struct sk_buff *skb = dequeue_func(vars, sch);
codel_time_t now;
-- 
2.5.0



Re: [RFC v2] iwlwifi: pcie: transmit queue auto-sizing

2016-02-08 Thread Emmanuel Grumbach
On Mon, Feb 8, 2016 at 12:00 PM, Michal Kazior  wrote:
> On 5 February 2016 at 17:47, Dave Taht  wrote:
>>> A bursted txop can be as big as 5-10ms. If you consider you want to
>>> queue 5-10ms worth of data for *each* station at any given time you
>>> obviously introduce a lot of lag. If you have 10 stations you might
>>> end up with service period at 10*10ms = 100ms. This gets even worse if
>>> you consider MU-MIMO because you need to do an expensive sounding
>>> procedure before transmitting. So while SU aggregation can probably
>>> still work reasonably well with shorter bursts (1-2ms) MU needs at
>>> least 3ms to get *any* gain when compared to SU (which obviously means
>>> you want more to actually make MU pay off).
>>
>> I am not sure where you get these numbers. Got a spreadsheet?
>
> Here's a nice summary on some of it:
>
>   
> http://chimera.labs.oreilly.com/books/123401739/ch03.html#figure-mac-ampdu
>
> Even if your single A-MPDU can be shorter than a txop you can burst a
> few of them if my understanding is correct.
>
> The overhead associated with MU sounding is something I've been told
> only. Apparently for MU to pay off you need fairly big bursts. This
> implies that the more stations you have to service the less it makes
> sense to attempt MU if you consider latency.
>
>
>> Gradually reducing the maximum sized txop as a function of the number
>> of stations makes sense. If you have 10 stations pending delivery and
>> reduced the max txop to 1ms, you hurt bandwidth at that instant, but
>> by offering more service to more stations, in less time, they will
>> converge on a reasonable share of the bandwidth for each, faster[1].
>> And I'm sure that the person videoconferencing on a link like that
>> would appreciate getting some service inside of a 10ms interval,
>> rather than a 100ms.
>>
>> yes, there's overhead, and that's not the right number, which would
>> vary as to g,n,ac and successors.
>>
>> You will also get more opportunities to use mu-mimo with shorter
>> bursts extant and more stations being regularly serviced.
>>
>> [1] https://www.youtube.com/watch?v=Rb-UnHDw02o at about 13:50
>
> This is my thinking as well, at least for most common use cases.
>
> If you try to optimize for throughput by introducing extra induced
> latency you might end up not being able to use aggregation in practice
> anyway because you won't be able to start up connections and ask for
> enough data - or at least that's what my intuition tells me.
>
> But, like I've mentioned, there's interest in making it possible to
> maximize for throughput (regardless of latency). This surely makes
> sense for synthetic UDP benchmarks. But does it make sense for any
> real-world application? No idea.
>
>
>>> The rule of thumb is the
>>> longer you wait the bigger capacity you can get.
>>
>> This is not strictly true as the "fountain" of packets is regulated by
>> acks on the other side of the link, and ramp up or down as a function
>> of service time and loss.
>
> Yes, if you consider real world cases, i.e. TCP, web traffic, etc.
> then you're correct.
>
>
>>> Apparently there's interest in maximizing throughput but it stands in
>>> direct opposition of keeping the latency down so I've been thinking
>>> how to satisfy both.
>>>
>>> The current approach ath10k is taking (patches in review [1][2]) is to
>>> use mac80211 software queues for per-station queuing, exposing queue
>>> state to firmware (it decides where frames should be dequeued from)
>>> and making it possible to stop/wake per-station tx subqueue with fake
>>> netdev queues. I'm starting to think this is not the right way though
>>> because it's inherently hard to control latency and there's a huge
>>> memory overhead associated with the fake netdev queues.
>>
>> What is this overhead?
>
> E.g. if you want to be able to maximize throughput for 50 MU clients
> you need to be able to queue, in theory, 50*200 (roughly) frames. This
> translates to both huge memory usage and latency *and* renders
> (fq_)codel qdisc rather.. moot.

Ok - now I understand. So yes the conclusion below (make fq_codel
station aware) makes a lot sense.

>
>
>> Applying things  like codel tend to dramatically shorten the amount of
>> skbs extant...
>
>> modern 802.11ac capable hardware has tons more
>> memory...
>
> I don't think it does. QCA988x is able to handle "only" 1424 tx
> descriptors (IOW 1500-byte long MSDUs) in the driver-to-firmware tx
> queue (it's a flat queue). QCA99x0 is able to handle 2500 if asked
> politely.

As I said, our design is not flat which removes for the firmware to
classify the packets by station to be able to build aggregates, but
the downside is the number of clients you can service.

>
> This is still not enough to satisfy the insane "maximize the
> capacity/throughput" expectations though.
>
> You could actually argue it's too much from the bufferbloat problem
> point of view anyway and Emmanuel's patch 

Re: [RFC v2] iwlwifi: pcie: transmit queue auto-sizing

2016-02-08 Thread Emmanuel Grumbach
On Fri, Feb 5, 2016 at 10:44 AM, Michal Kazior <michal.kaz...@tieto.com> wrote:
> On 4 February 2016 at 22:14, Ben Greear <gree...@candelatech.com> wrote:
>> On 02/04/2016 12:56 PM, Grumbach, Emmanuel wrote:
>>> On 02/04/2016 10:46 PM, Ben Greear wrote:
>>>> On 02/04/2016 12:16 PM, Emmanuel Grumbach wrote:
>>>>>
>>>>> As many (all?) WiFi devices, Intel WiFi devices have
>>>>> transmit queues which have 256 transmit descriptors
>>>>> each and each descriptor corresponds to an MPDU.
>>>>> This means that when it is full, the queue contains
>>>>> 256 * ~1500 bytes to be transmitted (if we don't have
>>>>> A-MSDUs). The purpose of those queues is to have enough
>>>>> packets to be ready for transmission so that when the device
>>>>> gets an opportunity to transmit (TxOP), it can take as many
>>>>> packets as the spec allows and aggregate them into one
>>>>> A-MPDU or even several A-MPDUs if we are using bursts.
>>>>
>>>> I guess this is only really usable if you have exactly one
>>>> peer connected (ie, in station mode)?
>>>>
>>>> Otherwise, you could have one slow peer and one fast one,
>>>> and then I suspect this would not work so well?
>>>
>>>
>>> Yes. I guess this one (big) limitation. I guess that what would happen
>>> in this case is that the the latency would constantly jitter. But I also
>
> Hmm.. You'd probably need to track per-station packet sojourn time as
> well and make it possible to stop/wake queues per station.

Clearly here comes the difference between the devices you work on, and
the devices I work on. Intel devices are more client oriented. Our AP
mode doesn't handle many clients etc..

>
>
>>> noticed that I could reduce the transmit queue to 130 descriptors
>>> (instead of 256) and still reach maximal throughput because we can
>>> refill the queues quickly enough.
>>> In iwlwifi, we have plans to have one queue for each peer.
>>> This is under development. Not sure when it'll be ready. It also requires
>>> firmware change obviously.
>>
>> Per-peer queues will probably be nice, especially if we can keep the
>> buffer bloat manageable.
>
> Per-station queues sound tricky if you consider bufferbloat.

iwlwifi's A-MPDU model is different from athXk's I guess. In iwlwifi
(the Intel devices really since it is mostly firmware) the firmware
will try to use a TxOP whenever there is data in the queue. Then, once
we have a chance to transmit, we will look at what we have in the
queue and send the biggest aggregates we can (and bursts if allowed).
But we will not defer packets' transmission to get bigger aggregates.
Testing shows that under ideal conditions, we can have enough packets
in the queue to build big aggregates without creating artificial
latency.

>
> To maximize use of airtime (i.e. txop) you need to send big
> aggregates. Since aggregates are per station-tid to maximize
> multi-station performance (in AP mode) you'll need to queue a lot of
> frames, per each station, depending on the chosen tx rate.

Sure.

>
> A bursted txop can be as big as 5-10ms. If you consider you want to
> queue 5-10ms worth of data for *each* station at any given time you
> obviously introduce a lot of lag. If you have 10 stations you might
> end up with service period at 10*10ms = 100ms. This gets even worse if
> you consider MU-MIMO because you need to do an expensive sounding
> procedure before transmitting. So while SU aggregation can probably
> still work reasonably well with shorter bursts (1-2ms) MU needs at
> least 3ms to get *any* gain when compared to SU (which obviously means
> you want more to actually make MU pay off). The rule of thumb is the
> longer you wait the bigger capacity you can get.

I am not an expert about MU-MIMO, but I'll believe you :)
We can chat about this in a few days :)

Queueing frames under good conditions is fine in a way that it is a
"Good queue" (hey Stephen), you need those queues to maximize the
throughput because of the bursty nature of WiFi and the queue "moves"
quickly since you have high throughput so that the sojourn time in
your queue is relatively small, but when the link conditions gets less
good you need to reduce the queue length because it doesn't really
help you anyway.  This is what my patch aims at fixing.
All this is true when you have a small number of stations...
I understand from your comment that even in ideal conditions you still
need to create a lot of latency to gain TPT. Then there isn't much we
can do without impacting either TPT or latency. Then, there is a real
tradeoff.
I g

[RFC] iwlwifi: pcie: transmit queue auto-sizing

2016-02-04 Thread Emmanuel Grumbach
As many (all?) WiFi devices, Intel WiFi devices have
transmit queues which have 256 transmit descriptors
each and each descriptor corresponds to an MPDU.
This means that when it is full, the queue contains
256 * ~1500 bytes to be transmitted (if we don't have
A-MSDUs). The purpose of those queues is to have enough
packets to be ready for transmission so that when the device
gets an opportunity to transmit (TxOP), it can take as many
packets as the spec allows and aggregate them into one
A-MPDU or even several A-MPDUs if we are using bursts.

The problem is that the packets that are in these queues are
already out of control of the Qdisc and can stay in those
queues for a fairly long time when the link condition is
not good. This leads to the well known bufferbloat problem.

This patch adds a way to tune the size of the transmit queue
so that it won't cause excessive latency. When the link
condition is good, the packets will flow smoothly and the
transmit queue will grow quickly allowing A-MPDUs and
maximal throughput. When the link is not optimal, we will
have retransmissions, lower transmit rates or signal
detection (CCA) which will cause a delay in the packet
transmission. The driver will sense this higher latency
and will reduce the size of the transmit queue.
This means that the packets that continue to arrive
will pile up in the Qdisc rather than in the device
queues. The major advantage of this approach is that
codel can now do its work.

The algorithm is really (too?) simple:
every 5 seconds, starts from a short queue again.
If a packet has been in the queue for less than 10ms,
allow 10 more MPDUs in.
If a packet has been in the queue for more than 20ms,
reduce by 10 the size of the transmit queue.

The implementation is really naive and way too simple:
 * reading jiffies for every Tx / Tx status is not a
   good idead.
 * jiffies are not fine-grained enough on all platforms
 * the constants chosen are really arbitrary and can't be
   tuned.
 * This may be implemented in mac80211 probably and help
   other drivers.
 * etc...

But already this gives nice results. I ran a very simple
experiment: I put the device in a controlled environment
and ran traffic while running default sized ping in the
background. In this configuration, our device quickly
raises its transmission rate to the best rate.
Then, I force the device to use the lowest rate (6Mbps).
Of course, the throughput collapses, but the ping RTT
shoots up.
Using codel helps, but the latency is still high. Codel
with this patch gives much better results:

pfifo_fast:
rtt min/avg/max/mdev = 1932.616/2393.284/2833.407/315.941 ms, pipe 3, ipg/ewma 
2215.707/2446.884 ms

fq_codel + Tx queue auto-sizing:
rtt min/avg/max/mdev = 13.541/32.396/54.791/9.610 ms, ipg/ewma 200.685/32.202 ms

fq_codel without Tx queue auto-sizing:
rtt min/avg/max/mdev = 140.821/257.303/331.889/31.074 ms, pipe 2, ipg/ewma 
258.147/252.847 ms

Clearly, there is more work to do to be able to merge this,
but it seems that the wireless problems mentioned in
https://lwn.net/Articles/616241/ may have a solution.

Cc: Stephen Hemminger <step...@networkplumber.org>
Cc: Dave Taht <dave.t...@bufferbloat.net>
Cc: Jonathan Corbet <cor...@lwn.net>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  6 
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c   | 32 --
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h 
b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index 2f95916..d83eb56 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -192,6 +192,11 @@ struct iwl_cmd_meta {
u32 flags;
 };
 
+struct iwl_txq_auto_size {
+   int min_space;
+   unsigned long reset_ts;
+};
+
 /*
  * Generic queue structure
  *
@@ -293,6 +298,7 @@ struct iwl_txq {
bool block;
unsigned long wd_timeout;
struct sk_buff_head overflow_q;
+   struct iwl_txq_auto_size auto_sz;
 };
 
 static inline dma_addr_t
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c 
b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index 837a7d5..4d1dee6 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -572,6 +572,8 @@ static int iwl_pcie_txq_init(struct iwl_trans *trans, 
struct iwl_txq *txq,
 
spin_lock_init(>lock);
__skb_queue_head_init(>overflow_q);
+   txq->auto_sz.min_space = 240;
+   txq->auto_sz.reset_ts = jiffies;
 
/*
 * Tell nic where to find circular buffer of Tx Frame Descriptors for
@@ -1043,10 +1045,14 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, 
int txq_id, int ssn,
 q->read_ptr != tfd_num;
 q->read_ptr = iwl_queue_inc_wrap(q->read_ptr)) {
struct sk_

[RFC v2] iwlwifi: pcie: transmit queue auto-sizing

2016-02-04 Thread Emmanuel Grumbach
As many (all?) WiFi devices, Intel WiFi devices have
transmit queues which have 256 transmit descriptors
each and each descriptor corresponds to an MPDU.
This means that when it is full, the queue contains
256 * ~1500 bytes to be transmitted (if we don't have
A-MSDUs). The purpose of those queues is to have enough
packets to be ready for transmission so that when the device
gets an opportunity to transmit (TxOP), it can take as many
packets as the spec allows and aggregate them into one
A-MPDU or even several A-MPDUs if we are using bursts.

The problem is that the packets that are in these queues are
already out of control of the Qdisc and can stay in those
queues for a fairly long time when the link condition is
not good. This leads to the well known bufferbloat problem.

This patch adds a way to tune the size of the transmit queue
so that it won't cause excessive latency. When the link
condition is good, the packets will flow smoothly and the
transmit queue will grow quickly allowing A-MPDUs and
maximal throughput. When the link is not optimal, we will
have retransmissions, lower transmit rates or signal
detection (CCA) which will cause a delay in the packet
transmission. The driver will sense this higher latency
and will reduce the size of the transmit queue.
This means that the packets that continue to arrive
will pile up in the Qdisc rather than in the device
queues. The major advantage of this approach is that
codel can now do its work.

The algorithm is really (too?) simple:
every 5 seconds, starts from a short queue again.
If a packet has been in the queue for less than 10ms,
allow 10 more MPDUs in.
If a packet has been in the queue for more than 20ms,
reduce by 10 the size of the transmit queue.

The implementation is really naive and way too simple:
 * reading jiffies for every Tx / Tx status is not a
   good idead.
 * jiffies are not fine-grained enough on all platforms
 * the constants chosen are really arbitrary and can't be
   tuned.
 * This may be implemented in mac80211 probably and help
   other drivers.
 * etc...

But already this gives nice results. I ran a very simple
experiment: I put the device in a controlled environment
and ran traffic while running default sized ping in the
background. In this configuration, our device quickly
raises its transmission rate to the best rate.
Then, I force the device to use the lowest rate (6Mbps).
Of course, the throughput collapses, but the ping RTT
shoots up.
Using codel helps, but the latency is still high. Codel
with this patch gives much better results:

pfifo_fast:
rtt min/avg/max/mdev = 1932.616/2393.284/2833.407/315.941 ms, pipe 3, ipg/ewma 
2215.707/2446.884 ms

fq_codel + Tx queue auto-sizing:
rtt min/avg/max/mdev = 13.541/32.396/54.791/9.610 ms, ipg/ewma 200.685/32.202 ms

fq_codel without Tx queue auto-sizing:
rtt min/avg/max/mdev = 140.821/257.303/331.889/31.074 ms, pipe 2, ipg/ewma 
258.147/252.847 ms

Clearly, there is more work to do to be able to merge this,
but it seems that the wireless problems mentioned in
https://lwn.net/Articles/616241/ may have a solution.

Cc: Stephen Hemminger <step...@networkplumber.org>
Cc: Dave Taht <dave.t...@gmail.com>
Cc: Jonathan Corbet <cor...@lwn.net>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
Fix Dave's email address
---
 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  6 
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c   | 32 --
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h 
b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index 2f95916..d83eb56 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -192,6 +192,11 @@ struct iwl_cmd_meta {
u32 flags;
 };
 
+struct iwl_txq_auto_size {
+   int min_space;
+   unsigned long reset_ts;
+};
+
 /*
  * Generic queue structure
  *
@@ -293,6 +298,7 @@ struct iwl_txq {
bool block;
unsigned long wd_timeout;
struct sk_buff_head overflow_q;
+   struct iwl_txq_auto_size auto_sz;
 };
 
 static inline dma_addr_t
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c 
b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index 837a7d5..4d1dee6 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -572,6 +572,8 @@ static int iwl_pcie_txq_init(struct iwl_trans *trans, 
struct iwl_txq *txq,
 
spin_lock_init(>lock);
__skb_queue_head_init(>overflow_q);
+   txq->auto_sz.min_space = 240;
+   txq->auto_sz.reset_ts = jiffies;
 
/*
 * Tell nic where to find circular buffer of Tx Frame Descriptors for
@@ -1043,10 +1045,14 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, 
int txq_id, int ssn,
 q->read_ptr != tfd_num;
 q->read_ptr = iwl_queue_inc_wrap(q->read_ptr)) {
 

[RFC RESEND] iwlwifi: pcie: transmit queue auto-sizing

2016-02-04 Thread Emmanuel Grumbach
As many (all?) WiFi devices, Intel WiFi devices have
transmit queues which have 256 transmit descriptors
each and each descriptor corresponds to an MPDU.
This means that when it is full, the queue contains
256 * ~1500 bytes to be transmitted (if we don't have
A-MSDUs). The purpose of those queues is to have enough
packets to be ready for transmission so that when the device
gets an opportunity to transmit (TxOP), it can take as many
packets as the spec allows and aggregate them into one
A-MPDU or even several A-MPDUs if we are using bursts.

The problem is that the packets that are in these queues are
already out of control of the Qdisc and can stay in those
queues for a fairly long time when the link condition is
not good. This leads to the well known bufferbloat problem.

This patch adds a way to tune the size of the transmit queue
so that it won't cause excessive latency. When the link
condition is good, the packets will flow smoothly and the
transmit queue will grow quickly allowing A-MPDUs and
maximal throughput. When the link is not optimal, we will
have retransmissions, lower transmit rates or signal
detection (CCA) which will cause a delay in the packet
transmission. The driver will sense this higher latency
and will reduce the size of the transmit queue.
This means that the packets that continue to arrive
will pile up in the Qdisc rather than in the device
queues. The major advantage of this approach is that
codel can now do its work.

The algorithm is really (too?) simple:
every 5 seconds, starts from a short queue again.
If a packet has been in the queue for less than 10ms,
allow 10 more MPDUs in.
If a packet has been in the queue for more than 20ms,
reduce by 10 the size of the transmit queue.

The implementation is really naive and way too simple:
 * reading jiffies for every Tx / Tx status is not a
   good idead.
 * jiffies are not fine-grained enough on all platforms
 * the constants chosen are really arbitrary and can't be
   tuned.
 * This may be implemented in mac80211 probably and help
   other drivers.
 * etc...

But already this gives nice results. I ran a very simple
experiment: I put the device in a controlled environment
and ran traffic while running default sized ping in the
background. In this configuration, our device quickly
raises its transmission rate to the best rate.
Then, I force the device to use the lowest rate (6Mbps).
Of course, the throughput collapses, but the ping RTT
shoots up.
Using codel helps, but the latency is still high. Codel
with this patch gives much better results:

pfifo_fast:
rtt min/avg/max/mdev = 1932.616/2393.284/2833.407/315.941 ms, pipe 3, ipg/ewma 
2215.707/2446.884 ms

fq_codel + Tx queue auto-sizing:
rtt min/avg/max/mdev = 13.541/32.396/54.791/9.610 ms, ipg/ewma 200.685/32.202 ms

fq_codel without Tx queue auto-sizing:
rtt min/avg/max/mdev = 140.821/257.303/331.889/31.074 ms, pipe 2, ipg/ewma 
258.147/252.847 ms

Clearly, there is more work to do to be able to merge this,
but it seems that the wireless problems mentioned in
https://lwn.net/Articles/616241/ may have a solution.

Cc: Stephen Hemminger <step...@networkplumber.org>
Cc: Dave Taht <dave.t...@gmail.com>
Cc: Jonathan Corbet <cor...@lwn.net>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
Fix Dave's email address
---
 drivers/net/wireless/intel/iwlwifi/pcie/internal.h |  6 
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c   | 32 --
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h 
b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index 2f95916..d83eb56 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -192,6 +192,11 @@ struct iwl_cmd_meta {
u32 flags;
 };
 
+struct iwl_txq_auto_size {
+   int min_space;
+   unsigned long reset_ts;
+};
+
 /*
  * Generic queue structure
  *
@@ -293,6 +298,7 @@ struct iwl_txq {
bool block;
unsigned long wd_timeout;
struct sk_buff_head overflow_q;
+   struct iwl_txq_auto_size auto_sz;
 };
 
 static inline dma_addr_t
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c 
b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index 837a7d5..4d1dee6 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -572,6 +572,8 @@ static int iwl_pcie_txq_init(struct iwl_trans *trans, 
struct iwl_txq *txq,
 
spin_lock_init(>lock);
__skb_queue_head_init(>overflow_q);
+   txq->auto_sz.min_space = 240;
+   txq->auto_sz.reset_ts = jiffies;
 
/*
 * Tell nic where to find circular buffer of Tx Frame Descriptors for
@@ -1043,10 +1045,14 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, 
int txq_id, int ssn,
 q->read_ptr != tfd_num;
 q->read_ptr = iwl_queue_inc_wrap(q->read_ptr)) {
 

Re: [PATCH v2 10/27] iwlwifi: move under intel vendor directory

2015-11-18 Thread Emmanuel Grumbach
Hi Kalle,

On Wed, Nov 18, 2015 at 4:45 PM, Kalle Valo  wrote:
>
> Part of reorganising wireless drivers directory and Kconfig.
>

I am sorry but I'd prefer to wait with this. We have a big machinery
of scripts / builds and alike that will break. I did give a heads up a
while ago to the people in charge of these scripts that this change is
on the way, but they are not done yet. I will get back to them and ask
them when we can be ready for such a change and then it'll come from
my tree. I promise to try to make it happen ASAP.
The above is relevant for iwlwifi only. As far as I am concerned, you
can move the iwlwifi and ipw.

Thank you.

> Signed-off-by: Kalle Valo 
> ---
>  MAINTAINERS|2 +-
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] net: tso: add support for IPv6

2015-10-26 Thread Emmanuel Grumbach
Adding IPv6 for the TSO helper API is trivial:
* Don't play with the id (which doesn't exist in IPv6)
* Correctly update the payload_len (don't include the
  length of the IP header itself)

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
v3: use vlan_get_protocol and call it once in tso_start
store the result in tso_t
---
 include/net/tso.h |  1 +
 net/core/tso.c| 18 +-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/net/tso.h b/include/net/tso.h
index 47e5444..b7be852 100644
--- a/include/net/tso.h
+++ b/include/net/tso.h
@@ -8,6 +8,7 @@ struct tso_t {
void *data;
size_t size;
u16 ip_id;
+   bool ipv6;
u32 tcp_seq;
 };
 
diff --git a/net/core/tso.c b/net/core/tso.c
index 630b30b..5dca7ce 100644
--- a/net/core/tso.c
+++ b/net/core/tso.c
@@ -1,4 +1,5 @@
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -14,18 +15,24 @@ EXPORT_SYMBOL(tso_count_descs);
 void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso,
   int size, bool is_last)
 {
-   struct iphdr *iph;
struct tcphdr *tcph;
int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
int mac_hdr_len = skb_network_offset(skb);
 
memcpy(hdr, skb->data, hdr_len);
-   iph = (struct iphdr *)(hdr + mac_hdr_len);
-   iph->id = htons(tso->ip_id);
-   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   if (!tso->ipv6) {
+   struct iphdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->id = htons(tso->ip_id);
+   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   tso->ip_id++;
+   } else {
+   struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->payload_len = htons(size + tcp_hdrlen(skb));
+   }
tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb));
put_unaligned_be32(tso->tcp_seq, >seq);
-   tso->ip_id++;
 
if (!is_last) {
/* Clear all special flags for not last packet */
@@ -61,6 +68,7 @@ void tso_start(struct sk_buff *skb, struct tso_t *tso)
tso->ip_id = ntohs(ip_hdr(skb)->id);
tso->tcp_seq = ntohl(tcp_hdr(skb)->seq);
tso->next_frag_idx = 0;
+   tso->ipv6 = vlan_get_protocol(skb) == htons(ETH_P_IPV6);
 
/* Build first data */
tso->size = skb_headlen(skb) - hdr_len;
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core

2015-10-26 Thread Emmanuel Grumbach
Hi Eric,

>
> When the op_mode sends an skb whose payload is bigger than
> MSS, PCIe will create an A-MSDU out of it. PCIe assumes
> that the skb that is coming from the op_mode can fit in one
> A-MSDU. It is the op_mode's responsibility to make sure
> that this guarantee holds.
>
> Additional headers need to be built for the subframes.
> The TSO core code takes care of the IP / TCP headers and
> the driver takes care of the 802.11 subframe headers.
>
> These headers are stored on a per-cpu page that is re-used
> for all the packets handled on that same CPU. Each skb
> holds a reference to that page and releases the page when
> it is reclaimed. When the page gets full, it is released
> and a new one is allocated.
>
> Since any SKB that doesn't go through the fast-xmit path
> of mac80211 will be segmented, we can assume here that the
> packet is not WEP / TKIP and has a proper SNAP header.
>
> Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>

Assuming your review queue works as a FIFO and you reviewed the TSO
helper patch, I can assume you ACK this one? :)
Or at least, don't NACK it :)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] net: tso: add support for IPv6

2015-10-25 Thread Emmanuel Grumbach
Adding IPv6 for the TSO helper API is trivial:
* Don't play with the id (which doesn't exist in IPv6)
* Correctly update the payload_len (don't include the
  length of the IP header itself)

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
 net/core/tso.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/net/core/tso.c b/net/core/tso.c
index 630b30b..ece4605 100644
--- a/net/core/tso.c
+++ b/net/core/tso.c
@@ -14,18 +14,25 @@ EXPORT_SYMBOL(tso_count_descs);
 void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso,
   int size, bool is_last)
 {
-   struct iphdr *iph;
struct tcphdr *tcph;
int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
int mac_hdr_len = skb_network_offset(skb);
 
memcpy(hdr, skb->data, hdr_len);
-   iph = (struct iphdr *)(hdr + mac_hdr_len);
-   iph->id = htons(tso->ip_id);
-   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   if (skb->protocol == htons(ETH_P_IP)) {
+   struct iphdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->id = htons(tso->ip_id);
+   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   tso->ip_id++;
+   }
+   if (skb->protocol == htons(ETH_P_IPV6)) {
+   struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->payload_len = htons(size + tcp_hdrlen(skb));
+   }
tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb));
put_unaligned_be32(tso->tcp_seq, >seq);
-   tso->ip_id++;
 
if (!is_last) {
/* Clear all special flags for not last packet */
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] net: tso: add support for IPv6

2015-10-25 Thread Emmanuel Grumbach
Adding IPv6 for the TSO helper API is trivial:
* Don't play with the id (which doesn't exist in IPv6)
* Correctly update the payload_len (don't include the
  length of the IP header itself)

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
v2: add else if

NOTE: instead of checking the skb->protocol, I can add a bool
in tso_t. This might be better in terms of cache
handling. Let me know if you prefer that option
---
 net/core/tso.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/core/tso.c b/net/core/tso.c
index 630b30b..bf71bdc5 100644
--- a/net/core/tso.c
+++ b/net/core/tso.c
@@ -14,18 +14,24 @@ EXPORT_SYMBOL(tso_count_descs);
 void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso,
   int size, bool is_last)
 {
-   struct iphdr *iph;
struct tcphdr *tcph;
int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
int mac_hdr_len = skb_network_offset(skb);
 
memcpy(hdr, skb->data, hdr_len);
-   iph = (struct iphdr *)(hdr + mac_hdr_len);
-   iph->id = htons(tso->ip_id);
-   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   if (skb->protocol == htons(ETH_P_IP)) {
+   struct iphdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->id = htons(tso->ip_id);
+   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   tso->ip_id++;
+   } else if (skb->protocol == htons(ETH_P_IPV6)) {
+   struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->payload_len = htons(size + tcp_hdrlen(skb));
+   }
tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb));
put_unaligned_be32(tso->tcp_seq, >seq);
-   tso->ip_id++;
 
if (!is_last) {
/* Clear all special flags for not last packet */
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v4 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core

2015-10-22 Thread Emmanuel Grumbach
When the op_mode sends an skb whose payload is bigger than
MSS, PCIe will create an A-MSDU out of it. PCIe assumes
that the skb that is coming from the op_mode can fit in one
A-MSDU. It is the op_mode's responsibility to make sure
that this guarantee holds.

Additional headers need to be built for the subframes.
The TSO core code takes care of the IP / TCP headers and
the driver takes care of the 802.11 subframe headers.

These headers are stored on a per-cpu page that is re-used
for all the packets handled on that same CPU. Each skb
holds a reference to that page and releases the page when
it is reclaimed. When the page gets full, it is released
and a new one is allocated.

Since any SKB that doesn't go through the fast-xmit path
of mac80211 will be segmented, we can assume here that the
packet is not WEP / TKIP and has a proper SNAP header.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
v4: * Create a helper to free the skb's ref to the hdr page
* add comment on that clarifies that the headers fit in a page
* return -ENOMEM if the per_cpu variable allocation failed
* don't zero the per_cpu struct after allocation
---
 drivers/net/wireless/iwlwifi/iwl-devtrace-data.h |  16 ++
 drivers/net/wireless/iwlwifi/iwl-trans.h |   6 +-
 drivers/net/wireless/iwlwifi/pcie/internal.h |   7 +
 drivers/net/wireless/iwlwifi/pcie/trans.c|  15 ++
 drivers/net/wireless/iwlwifi/pcie/tx.c   | 287 ++-
 5 files changed, 326 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h 
b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
index 71a78ce..59d9edf 100644
--- a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
+++ b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
@@ -51,6 +51,22 @@ TRACE_EVENT(iwlwifi_dev_tx_data,
TP_printk("[%s] TX frame data", __get_str(dev))
 );
 
+TRACE_EVENT(iwlwifi_dev_tx_tso_chunk,
+   TP_PROTO(const struct device *dev,
+u8 *data_src, size_t data_len),
+   TP_ARGS(dev, data_src, data_len),
+   TP_STRUCT__entry(
+   DEV_ENTRY
+
+   __dynamic_array(u8, data, data_len)
+   ),
+   TP_fast_assign(
+   DEV_ASSIGN;
+   memcpy(__get_dynamic_array(data), data_src, data_len);
+   ),
+   TP_printk("[%s] TX frame data", __get_str(dev))
+);
+
 TRACE_EVENT(iwlwifi_dev_rx_data,
TP_PROTO(const struct device *dev,
 const struct iwl_trans *trans,
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h 
b/drivers/net/wireless/iwlwifi/iwl-trans.h
index 0ceff69..6919243 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans.h
+++ b/drivers/net/wireless/iwlwifi/iwl-trans.h
@@ -379,7 +379,11 @@ static inline void iwl_free_rxb(struct iwl_rx_cmd_buffer 
*r)
 }
 
 #define MAX_NO_RECLAIM_CMDS6
-
+/*
+ * The first entry in driver_data array in ieee80211_tx_info
+ * that can be used by the transport.
+ */
+#define IWL_FIRST_DRIVER_DATA 2
 #define IWL_MASK(lo, hi) ((1 << (hi)) | ((1 << (hi)) - (1 << (lo
 
 /*
diff --git a/drivers/net/wireless/iwlwifi/pcie/internal.h 
b/drivers/net/wireless/iwlwifi/pcie/internal.h
index be168d1..7da5643 100644
--- a/drivers/net/wireless/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/iwlwifi/pcie/internal.h
@@ -295,6 +295,11 @@ iwl_pcie_get_scratchbuf_dma(struct iwl_txq *txq, int idx)
   sizeof(struct iwl_pcie_txq_scratch_buf) * idx;
 }
 
+struct iwl_tso_hdr_page {
+   struct page *page;
+   u8 *pos;
+};
+
 /**
  * struct iwl_trans_pcie - PCIe transport specific data
  * @rxq: all the RX queue data
@@ -332,6 +337,8 @@ struct iwl_trans_pcie {
struct net_device napi_dev;
struct napi_struct napi;
 
+   struct __percpu iwl_tso_hdr_page *tso_hdr_page;
+
/* INT ICT Table */
__le32 *ict_tbl;
dma_addr_t ict_tbl_dma;
diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c 
b/drivers/net/wireless/iwlwifi/pcie/trans.c
index a275318..f03c3e1 100644
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -1601,6 +1601,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans 
*trans,
 void iwl_trans_pcie_free(struct iwl_trans *trans)
 {
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
+   int i;
 
 #ifdef CPTCFG_IWLWIFI_PLATFORM_DATA
/* Make sure the device is on before calling pci functions again.
@@ -1631,6 +1632,15 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
 
iwl_pcie_free_fw_monitor(trans);
 
+   for_each_possible_cpu(i) {
+   struct iwl_tso_hdr_page *p =
+   per_cpu_ptr(trans_pcie->tso_hdr_page, i);
+
+   if (p->page)
+   __free_pages(p->page, 0);
+   }
+
+   free_percpu(trans_pcie->tso_hdr_page);
iwl_trans_free(trans);
 }
 
@@ -2839,6 +2849,11 @@ stru

[RFC v4 2/2] iwlwifi: mvm: send large SKBs to the transport

2015-10-22 Thread Emmanuel Grumbach
Now that PCIe knows how to create A-MSDUs, use this
capability and prepare SKBs that are large enough to
build an A-MSDU.
Advertise TSO support towards the network stack and
segment the packet with gso_size set to be the maximal
A-MSDU length (after having taken the headers to be added
into account) to make sure that the skb that is passed
down to the transport are not longer than the maximal
A-MSDU allowed.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
v4: Don't use the original skb after it has been consumed
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 143 --
 1 file changed, 135 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index be9d7e4..d325de4 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "iwl-trans.h"
 #include "iwl-eeprom-parse.h"
@@ -181,7 +182,8 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff 
*skb,
 
tx_cmd->tx_flags = cpu_to_le32(tx_flags);
/* Total # bytes to be transmitted */
-   tx_cmd->len = cpu_to_le16((u16)skb->len);
+   tx_cmd->len = cpu_to_le16((u16)skb->len +
+   (uintptr_t)info->driver_data[0]);
tx_cmd->next_frame_len = 0;
tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE);
tx_cmd->sta_id = sta_id;
@@ -356,7 +358,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff 
*skb,
 
memset(>status, 0, sizeof(info->status));
 
-   info->driver_data[0] = NULL;
info->driver_data[1] = dev_cmd;
 
return dev_cmd;
@@ -379,6 +380,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
  info->hw_queue != info->control.vif->cab_queue)))
return -1;
 
+   /* This holds the amsdu headers length */
+   info->driver_data[0] = (void *)(uintptr_t)0;
+
/*
 * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used
 * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel
@@ -435,29 +439,148 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
return 0;
 }
 
-static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso,
+static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb,
  struct ieee80211_sta *sta,
  struct sk_buff_head *mpdus_skb)
 {
+   struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+   struct ieee80211_hdr *hdr = (void *)skb->data;
+   unsigned int mss = skb_shinfo(skb)->gso_size;
struct sk_buff *tmp, *next;
-   char cb[sizeof(skb_gso->cb)];
+   char cb[sizeof(skb->cb)];
+   unsigned int num_subframes, tcp_payload_len, subf_len;
+   u16 amsdu_add, snap_ip_tcp, pad, i = 0;
+   /* Not used in IPv6 */
+   u16 ip_base_id = ntohs(ip_hdr(skb)->id);
+
+   snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) +
+   tcp_hdrlen(skb);
+
+   if (!sta->max_amsdu_len ||
+   !ieee80211_is_data_qos(hdr->frame_control)) {
+   num_subframes = 1;
+   /* TODO: for the compiler... */
+   pad = 0;
+   goto segment;
+   }
+
+   /*
+* Limit A-MSDU in A-MPDU to 4095 bytes when VHT is not
+* supported. This is a spec requirement (IEEE 802.11-2015
+* section 8.7.3 NOTE 3).
+*/
+
+   /* TODO: for now, disable A-MSDU inside AMPDU */
+   if (info->flags & IEEE80211_TX_CTL_AMPDU) {
+   num_subframes = 1;
+   /* TODO: for the compiler... */
+   pad = 0;
+   goto segment;
+   }
+
+   /* Sub frame header + SNAP + IP header + TCP header + MSS */
+   subf_len = sizeof(struct ethhdr) + snap_ip_tcp + mss;
+   pad = (4 - subf_len) & 0x3;
+
+   /*
+* If we have N subframes in the A-MSDU, then the A-MSDU's size is
+* N * subf_len + (N - 1) * pad.
+*/
+   num_subframes = (sta->max_amsdu_len + pad) / (subf_len + pad);
+   if (num_subframes > 1) {
+   u8 *qc = ieee80211_get_qos_ctl((void *)skb->data);
+
+   *qc |= IEEE80211_QOS_CTL_A_MSDU_PRESENT;
+   }
+
+   tcp_payload_len = skb_tail_pointer(skb) - skb_transport_header(skb) -
+   tcp_hdrlen(skb) + skb->data_len;
 
-   memcpy(cb, skb_gso->cb, sizeof(cb));
-   next = skb_gso_segment(skb_gso, 0);
-   if (IS_ERR(next))
+   /*
+* Make sure we have enough TBs for the A-MSDU:
+*  2 for each subframe
+*  1 more for each fragment
+*  1 more for the potential data in the header
+*/

[PATCH 2/2] iwlwifi: mvm: send large SKBs to the transport

2015-10-22 Thread Emmanuel Grumbach
Now that PCIe knows how to create A-MSDUs, use this
capability and prepare SKBs that are large enough to
build an A-MSDU.
Advertise TSO support towards the network stack and
segment the packet with gso_size set to be the maximal
A-MSDU length (after having taken the headers to be added
into account) to make sure that the skb that is passed
down to the transport are not longer than the maximal
A-MSDU allowed.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
No change since RFCv4
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 143 --
 1 file changed, 135 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index be9d7e4..d325de4 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "iwl-trans.h"
 #include "iwl-eeprom-parse.h"
@@ -181,7 +182,8 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff 
*skb,
 
tx_cmd->tx_flags = cpu_to_le32(tx_flags);
/* Total # bytes to be transmitted */
-   tx_cmd->len = cpu_to_le16((u16)skb->len);
+   tx_cmd->len = cpu_to_le16((u16)skb->len +
+   (uintptr_t)info->driver_data[0]);
tx_cmd->next_frame_len = 0;
tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE);
tx_cmd->sta_id = sta_id;
@@ -356,7 +358,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff 
*skb,
 
memset(>status, 0, sizeof(info->status));
 
-   info->driver_data[0] = NULL;
info->driver_data[1] = dev_cmd;
 
return dev_cmd;
@@ -379,6 +380,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
  info->hw_queue != info->control.vif->cab_queue)))
return -1;
 
+   /* This holds the amsdu headers length */
+   info->driver_data[0] = (void *)(uintptr_t)0;
+
/*
 * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used
 * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel
@@ -435,29 +439,148 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
return 0;
 }
 
-static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso,
+static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb,
  struct ieee80211_sta *sta,
  struct sk_buff_head *mpdus_skb)
 {
+   struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+   struct ieee80211_hdr *hdr = (void *)skb->data;
+   unsigned int mss = skb_shinfo(skb)->gso_size;
struct sk_buff *tmp, *next;
-   char cb[sizeof(skb_gso->cb)];
+   char cb[sizeof(skb->cb)];
+   unsigned int num_subframes, tcp_payload_len, subf_len;
+   u16 amsdu_add, snap_ip_tcp, pad, i = 0;
+   /* Not used in IPv6 */
+   u16 ip_base_id = ntohs(ip_hdr(skb)->id);
+
+   snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) +
+   tcp_hdrlen(skb);
+
+   if (!sta->max_amsdu_len ||
+   !ieee80211_is_data_qos(hdr->frame_control)) {
+   num_subframes = 1;
+   /* TODO: for the compiler... */
+   pad = 0;
+   goto segment;
+   }
+
+   /*
+* Limit A-MSDU in A-MPDU to 4095 bytes when VHT is not
+* supported. This is a spec requirement (IEEE 802.11-2015
+* section 8.7.3 NOTE 3).
+*/
+
+   /* TODO: for now, disable A-MSDU inside AMPDU */
+   if (info->flags & IEEE80211_TX_CTL_AMPDU) {
+   num_subframes = 1;
+   /* TODO: for the compiler... */
+   pad = 0;
+   goto segment;
+   }
+
+   /* Sub frame header + SNAP + IP header + TCP header + MSS */
+   subf_len = sizeof(struct ethhdr) + snap_ip_tcp + mss;
+   pad = (4 - subf_len) & 0x3;
+
+   /*
+* If we have N subframes in the A-MSDU, then the A-MSDU's size is
+* N * subf_len + (N - 1) * pad.
+*/
+   num_subframes = (sta->max_amsdu_len + pad) / (subf_len + pad);
+   if (num_subframes > 1) {
+   u8 *qc = ieee80211_get_qos_ctl((void *)skb->data);
+
+   *qc |= IEEE80211_QOS_CTL_A_MSDU_PRESENT;
+   }
+
+   tcp_payload_len = skb_tail_pointer(skb) - skb_transport_header(skb) -
+   tcp_hdrlen(skb) + skb->data_len;
 
-   memcpy(cb, skb_gso->cb, sizeof(cb));
-   next = skb_gso_segment(skb_gso, 0);
-   if (IS_ERR(next))
+   /*
+* Make sure we have enough TBs for the A-MSDU:
+*  2 for each subframe
+*  1 more for each fragment
+*  1 more for the potential data in the header
+*/
+   num_subframes =
+   min_t(unsigned 

[PATCH 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core

2015-10-22 Thread Emmanuel Grumbach
When the op_mode sends an skb whose payload is bigger than
MSS, PCIe will create an A-MSDU out of it. PCIe assumes
that the skb that is coming from the op_mode can fit in one
A-MSDU. It is the op_mode's responsibility to make sure
that this guarantee holds.

Additional headers need to be built for the subframes.
The TSO core code takes care of the IP / TCP headers and
the driver takes care of the 802.11 subframe headers.

These headers are stored on a per-cpu page that is re-used
for all the packets handled on that same CPU. Each skb
holds a reference to that page and releases the page when
it is reclaimed. When the page gets full, it is released
and a new one is allocated.

Since any SKB that doesn't go through the fast-xmit path
of mac80211 will be segmented, we can assume here that the
packet is not WEP / TKIP and has a proper SNAP header.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
Changes since RFCv4, addressing Eliad's comments:
 * free trans_pcie->tso_hdr_page in the failure path of iwl_trans_pcie_alloc
 * s/alloc_pages(gfp, 0)/alloc_page(gfp)
 * make sure that get_page_hdr's allocation didn't fail
 * use data_left instead of mss for the subframe header's length field
 * free csum_skb in the failure paths
 * identation
---
 drivers/net/wireless/iwlwifi/iwl-devtrace-data.h |  16 ++
 drivers/net/wireless/iwlwifi/iwl-trans.h |   6 +-
 drivers/net/wireless/iwlwifi/pcie/internal.h |   7 +
 drivers/net/wireless/iwlwifi/pcie/trans.c|  16 ++
 drivers/net/wireless/iwlwifi/pcie/tx.c   | 295 ++-
 5 files changed, 334 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h 
b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
index 71a78ce..59d9edf 100644
--- a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
+++ b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
@@ -51,6 +51,22 @@ TRACE_EVENT(iwlwifi_dev_tx_data,
TP_printk("[%s] TX frame data", __get_str(dev))
 );
 
+TRACE_EVENT(iwlwifi_dev_tx_tso_chunk,
+   TP_PROTO(const struct device *dev,
+u8 *data_src, size_t data_len),
+   TP_ARGS(dev, data_src, data_len),
+   TP_STRUCT__entry(
+   DEV_ENTRY
+
+   __dynamic_array(u8, data, data_len)
+   ),
+   TP_fast_assign(
+   DEV_ASSIGN;
+   memcpy(__get_dynamic_array(data), data_src, data_len);
+   ),
+   TP_printk("[%s] TX frame data", __get_str(dev))
+);
+
 TRACE_EVENT(iwlwifi_dev_rx_data,
TP_PROTO(const struct device *dev,
 const struct iwl_trans *trans,
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h 
b/drivers/net/wireless/iwlwifi/iwl-trans.h
index 0ceff69..6919243 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans.h
+++ b/drivers/net/wireless/iwlwifi/iwl-trans.h
@@ -379,7 +379,11 @@ static inline void iwl_free_rxb(struct iwl_rx_cmd_buffer 
*r)
 }
 
 #define MAX_NO_RECLAIM_CMDS6
-
+/*
+ * The first entry in driver_data array in ieee80211_tx_info
+ * that can be used by the transport.
+ */
+#define IWL_FIRST_DRIVER_DATA 2
 #define IWL_MASK(lo, hi) ((1 << (hi)) | ((1 << (hi)) - (1 << (lo
 
 /*
diff --git a/drivers/net/wireless/iwlwifi/pcie/internal.h 
b/drivers/net/wireless/iwlwifi/pcie/internal.h
index be168d1..7da5643 100644
--- a/drivers/net/wireless/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/iwlwifi/pcie/internal.h
@@ -295,6 +295,11 @@ iwl_pcie_get_scratchbuf_dma(struct iwl_txq *txq, int idx)
   sizeof(struct iwl_pcie_txq_scratch_buf) * idx;
 }
 
+struct iwl_tso_hdr_page {
+   struct page *page;
+   u8 *pos;
+};
+
 /**
  * struct iwl_trans_pcie - PCIe transport specific data
  * @rxq: all the RX queue data
@@ -332,6 +337,8 @@ struct iwl_trans_pcie {
struct net_device napi_dev;
struct napi_struct napi;
 
+   struct __percpu iwl_tso_hdr_page *tso_hdr_page;
+
/* INT ICT Table */
__le32 *ict_tbl;
dma_addr_t ict_tbl_dma;
diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c 
b/drivers/net/wireless/iwlwifi/pcie/trans.c
index a275318..93e488f 100644
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -1601,6 +1601,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans 
*trans,
 void iwl_trans_pcie_free(struct iwl_trans *trans)
 {
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
+   int i;
 
 #ifdef CPTCFG_IWLWIFI_PLATFORM_DATA
/* Make sure the device is on before calling pci functions again.
@@ -1631,6 +1632,15 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
 
iwl_pcie_free_fw_monitor(trans);
 
+   for_each_possible_cpu(i) {
+   struct iwl_tso_hdr_page *p =
+   per_cpu_ptr(trans_pcie->tso_hdr_page, i);
+
+   if (p->page)
+   __free_pages(p->page, 0);
+   }
+
+   free_

[RFC v3 2/2] iwlwifi: mvm: send large SKBs to the transport

2015-10-21 Thread Emmanuel Grumbach
Now that PCIe knows how to create A-MSDUs, use this
capability and prepare SKBs that are large enough to
build an A-MSDU.
Advertise TSO support towards the network stack and
segment the packet with gso_size set to be the maximal
A-MSDU length (after having taken the headers to be added
into account) to make sure that the skb that is passed
down to the transport are not longer than the maximal
A-MSDU allowed.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 143 --
 1 file changed, 135 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index be9d7e4..4f23149 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -65,6 +65,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "iwl-trans.h"
 #include "iwl-eeprom-parse.h"
@@ -181,7 +182,8 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff 
*skb,
 
tx_cmd->tx_flags = cpu_to_le32(tx_flags);
/* Total # bytes to be transmitted */
-   tx_cmd->len = cpu_to_le16((u16)skb->len);
+   tx_cmd->len = cpu_to_le16((u16)skb->len +
+   (uintptr_t)info->driver_data[0]);
tx_cmd->next_frame_len = 0;
tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE);
tx_cmd->sta_id = sta_id;
@@ -356,7 +358,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff 
*skb,
 
memset(>status, 0, sizeof(info->status));
 
-   info->driver_data[0] = NULL;
info->driver_data[1] = dev_cmd;
 
return dev_cmd;
@@ -379,6 +380,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
  info->hw_queue != info->control.vif->cab_queue)))
return -1;
 
+   /* This holds the amsdu headers length */
+   info->driver_data[0] = (void *)(uintptr_t)0;
+
/*
 * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used
 * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel
@@ -435,29 +439,148 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
return 0;
 }
 
-static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso,
+static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb,
  struct ieee80211_sta *sta,
  struct sk_buff_head *mpdus_skb)
 {
+   struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+   struct ieee80211_hdr *hdr = (void *)skb->data;
+   unsigned int mss = skb_shinfo(skb)->gso_size;
struct sk_buff *tmp, *next;
-   char cb[sizeof(skb_gso->cb)];
+   char cb[sizeof(skb->cb)];
+   unsigned int num_subframes, tcp_payload_len, subf_len;
+   u16 amsdu_add, snap_ip_tcp, pad, i = 0;
+
+   snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) +
+   tcp_hdrlen(skb);
+
+   if (!sta->max_amsdu_len ||
+   !ieee80211_is_data_qos(hdr->frame_control)) {
+   num_subframes = 1;
+   /* TODO: for the compiler... */
+   pad = 0;
+   goto segment;
+   }
+
+   /*
+* Limit A-MSDU in A-MPDU to 4095 bytes when VHT is not
+* supported. This is a spec requirement (IEEE 802.11-2015
+* section 8.7.3 NOTE 3).
+*/
+
+   /* TODO: for now, disable A-MSDU inside AMPDU */
+   if (info->flags & IEEE80211_TX_CTL_AMPDU) {
+   num_subframes = 1;
+   /* TODO: for the compiler... */
+   pad = 0;
+   goto segment;
+   }
+
+   /* Sub frame header + SNAP + IP header + TCP header + MSS */
+   subf_len = sizeof(struct ethhdr) + snap_ip_tcp + mss;
+   pad = (4 - subf_len) & 0x3;
+
+   /*
+* If we have N subframes in the A-MSDU, then the A-MSDU's size is
+* N * subf_len + (N - 1) * pad.
+*/
+   num_subframes = (sta->max_amsdu_len + pad) / (subf_len + pad);
+   if (num_subframes > 1) {
+   u8 *qc = ieee80211_get_qos_ctl((void *)skb->data);
+
+   *qc |= IEEE80211_QOS_CTL_A_MSDU_PRESENT;
+   }
+
+   tcp_payload_len = skb_tail_pointer(skb) - skb_transport_header(skb) -
+   tcp_hdrlen(skb) + skb->data_len;
+
+   /*
+* Make sure we have enough TBs for the A-MSDU:
+*  2 for each subframe
+*  1 more for each fragment
+*  1 more for the potential data in the header
+*/
+   num_subframes =
+   min_t(unsigned int, num_subframes,
+ (mvm->trans->max_skb_frags - 1 -
+  skb_shinfo(skb)->nr_frags) / 2);
+
+   /* This skb fits in one single A-MSDU */
+   if (num_

[RFC v3 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core

2015-10-21 Thread Emmanuel Grumbach
When the op_mode sends an skb whose payload is bigger than
MSS, PCIe will create an A-MSDU out of it. PCIe assumes
that the skb that is coming from the op_mode can fit in one
A-MSDU. It is the op_mode's responsibility to make sure
that this guarantee holds.

Additional headers need to be built for the subframes.
The TSO core code takes care of the IP / TCP headers and
the driver takes care of the 802.11 subframe headers.

These headers are stored on a per-cpu page that is re-used
for all the packets handled on that same CPU. Each skb
holds a reference to that page and releases the page when
it is reclaimed. When the page gets full, it is released
and a new one is allocated.

Since any SKB that doesn't go through the fast-xmit path
of mac80211 will be segmented, we can assume here that the
packet is not WEP / TKIP and has a proper SNAP header.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com>
---
 drivers/net/wireless/iwlwifi/iwl-devtrace-data.h |  16 ++
 drivers/net/wireless/iwlwifi/iwl-trans.h |   6 +-
 drivers/net/wireless/iwlwifi/pcie/internal.h |   7 +
 drivers/net/wireless/iwlwifi/pcie/trans.c|  20 +-
 drivers/net/wireless/iwlwifi/pcie/tx.c   | 286 ++-
 5 files changed, 329 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h 
b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
index 71a78ce..59d9edf 100644
--- a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
+++ b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h
@@ -51,6 +51,22 @@ TRACE_EVENT(iwlwifi_dev_tx_data,
TP_printk("[%s] TX frame data", __get_str(dev))
 );
 
+TRACE_EVENT(iwlwifi_dev_tx_tso_chunk,
+   TP_PROTO(const struct device *dev,
+u8 *data_src, size_t data_len),
+   TP_ARGS(dev, data_src, data_len),
+   TP_STRUCT__entry(
+   DEV_ENTRY
+
+   __dynamic_array(u8, data, data_len)
+   ),
+   TP_fast_assign(
+   DEV_ASSIGN;
+   memcpy(__get_dynamic_array(data), data_src, data_len);
+   ),
+   TP_printk("[%s] TX frame data", __get_str(dev))
+);
+
 TRACE_EVENT(iwlwifi_dev_rx_data,
TP_PROTO(const struct device *dev,
 const struct iwl_trans *trans,
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h 
b/drivers/net/wireless/iwlwifi/iwl-trans.h
index 0ceff69..6919243 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans.h
+++ b/drivers/net/wireless/iwlwifi/iwl-trans.h
@@ -379,7 +379,11 @@ static inline void iwl_free_rxb(struct iwl_rx_cmd_buffer 
*r)
 }
 
 #define MAX_NO_RECLAIM_CMDS6
-
+/*
+ * The first entry in driver_data array in ieee80211_tx_info
+ * that can be used by the transport.
+ */
+#define IWL_FIRST_DRIVER_DATA 2
 #define IWL_MASK(lo, hi) ((1 << (hi)) | ((1 << (hi)) - (1 << (lo
 
 /*
diff --git a/drivers/net/wireless/iwlwifi/pcie/internal.h 
b/drivers/net/wireless/iwlwifi/pcie/internal.h
index be168d1..7da5643 100644
--- a/drivers/net/wireless/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/iwlwifi/pcie/internal.h
@@ -295,6 +295,11 @@ iwl_pcie_get_scratchbuf_dma(struct iwl_txq *txq, int idx)
   sizeof(struct iwl_pcie_txq_scratch_buf) * idx;
 }
 
+struct iwl_tso_hdr_page {
+   struct page *page;
+   u8 *pos;
+};
+
 /**
  * struct iwl_trans_pcie - PCIe transport specific data
  * @rxq: all the RX queue data
@@ -332,6 +337,8 @@ struct iwl_trans_pcie {
struct net_device napi_dev;
struct napi_struct napi;
 
+   struct __percpu iwl_tso_hdr_page *tso_hdr_page;
+
/* INT ICT Table */
__le32 *ict_tbl;
dma_addr_t ict_tbl_dma;
diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c 
b/drivers/net/wireless/iwlwifi/pcie/trans.c
index a275318..5bd678b 100644
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -1601,6 +1601,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans 
*trans,
 void iwl_trans_pcie_free(struct iwl_trans *trans)
 {
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
+   int i;
 
 #ifdef CPTCFG_IWLWIFI_PLATFORM_DATA
/* Make sure the device is on before calling pci functions again.
@@ -1631,6 +1632,15 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
 
iwl_pcie_free_fw_monitor(trans);
 
+   for_each_possible_cpu(i) {
+   struct iwl_tso_hdr_page *p =
+   per_cpu_ptr(trans_pcie->tso_hdr_page, i);
+
+   if (p->page)
+   __free_pages(p->page, 0);
+   }
+
+   free_percpu(trans_pcie->tso_hdr_page);
iwl_trans_free(trans);
 }
 
@@ -2822,7 +2832,7 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev 
*pdev,
struct iwl_trans_pcie *trans_pcie;
struct iwl_trans *trans;
u16 pci_cmd;
-   int ret;
+   int i, ret;
 
trans = iwl_trans_alloc(sizeof(struct iwl_trans_pcie),

[RFC v3 0/2] Add TSO for iwlwifi

2015-10-21 Thread Emmanuel Grumbach
Here is a v3 of the series that has been sent uite a while
ago. V3 is completly different from v2 and (tries to)
address Eric's comments.
I have to admit that the code looks much easier now :)
The purpose is still to be able to create an A-MSDU from
a large send.

iwlwifi is split into 2 different layers:
the op_mode (iwlmvm.ko) and the transport (iwlwifi.ko).

The transport uses the the TSO core to split the payload
and builds an A-MSDU out of it. It needs additional memory
to store the new headers (n - 1) SNAP / IP / TCP headers
that weren't in the original skb and the 802.11 subframe
headers with padding. I allocate a per-cpu page to hold
these headers. When the page is full, I get a new one.
This is the most efficient way I could think about because
it avoids to allocate a (rather big: ~140 byte) data
structure for each Tx descriptor (and we have a lot...).
This is also efficient in terms of data locality.
Allocating / freeing a page is supposed to be very fast.

Since the transport doesn't handle sequence numbers and
other per 802.11 packet related stuff, it needs to get
only one 802.11 at a time. The problem here is that based
on link condition and peer caps, the size of the maximal
802.11 packet may vary: a same netdev can send data to
different peers (think about 2 stations associated to an
Access Point) and then, skbs coming from the same netdev
need to be split into chunks of different sizes so that the
transport will be able to create A-MSDUs. This will happen
if one station supports 11K A-MSDU, while the other one
supports 4K A-MSDU only. This requirement disallows the
use of gso_max_{segs,size} since they can't differentiate
between different peers.

Of course, I could make the transport aware of all the
sequence numbers and friends, but that seemed too
intrusive into the current architecture.
Instead, I prefered to use skb_gso_segment and let it do
the work. When I get an skb to be sent, I know the peer
and its caps. I can then determine what is the maximal
length of A-MSDU for this peer and skb_gso_segment this
skb with gso_size set to that maximal A-MSDU length.
Note that the maximal A-MSDU length can be up to 11K
nowadays, so we are far from the limit of the 2 bytes
of gso_size.
After I did that, I only need to fixup slightly the id in
the IP header.

There are still a few minor TODOs here and there, but
I think (and hope) we are getting close :)

Emmanuel Grumbach (2):
  iwlwifi: pcie: allow to build an A-MSDU using TSO core
  iwlwifi: mvm: send large SKBs to the transport

 drivers/net/wireless/iwlwifi/iwl-devtrace-data.h |  16 ++
 drivers/net/wireless/iwlwifi/iwl-trans.h |   6 +-
 drivers/net/wireless/iwlwifi/mvm/tx.c| 143 +++-
 drivers/net/wireless/iwlwifi/pcie/internal.h |   7 +
 drivers/net/wireless/iwlwifi/pcie/trans.c|  20 +-
 drivers/net/wireless/iwlwifi/pcie/tx.c   | 286 ++-
 6 files changed, 464 insertions(+), 14 deletions(-)

-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v3 1/3] iwlwifi: mvm: add real TSO implementation

2015-08-20 Thread Emmanuel Grumbach
The segmentation is done completely in software. The
driver creates several MPDUs out of a single large send.
Each MPDU is a newly allocated SKB.
A page is allocated to create the headers that need to be
duplicated (SNAP / IP / TCP). The WiFi header is in the
header of the newly created SKBs.

type=feature

Change-Id: I238ffa79cacc5bbdacdfbf3e9673c8d4f02b462a
Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 513 +++---
 1 file changed, 481 insertions(+), 32 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index 90f0ea1..a63686c 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -65,6 +65,7 @@
 #include linux/ieee80211.h
 #include linux/etherdevice.h
 #include net/tcp.h
+#include net/ip.h
 
 #include iwl-trans.h
 #include iwl-eeprom-parse.h
@@ -435,32 +436,471 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
return 0;
 }
 
+/*
+ * Update the IP / TCP headers and recompute the IP header CSUM +
+ * pseudo header CSUM.
+ */
+static void iwl_update_ip_tcph(void *iph, struct tcphdr *tcph, bool ipv6,
+  unsigned int len, unsigned int tcp_seq_offset,
+  u16 num_segment)
+{
+   be32_add_cpu(tcph-seq, tcp_seq_offset);
+
+   if (ipv6) {
+   struct ipv6hdr *iphv6 = iph;
+
+   iphv6-payload_len = cpu_to_be16(len + tcph-doff * 4);
+
+   /* Compute CSUM on the the pseudo-header */
+   tcph-check = ~csum_ipv6_magic(iphv6-saddr, iphv6-daddr,
+  len + tcph-doff * 4,
+  IPPROTO_TCP, 0);
+   } else {
+   struct iphdr *iphv4 = iph;
+
+   iphv4-tot_len =
+   cpu_to_be16(len + tcph-doff * 4 + iphv4-ihl * 4);
+   be16_add_cpu(iphv4-id, num_segment);
+   ip_send_check(iphv4);
+
+   /* Compute CSUM on the the pseudo-header */
+   tcph-check = ~csum_tcpudp_magic(iphv4-saddr, iphv4-daddr,
+len + tcph-doff * 4,
+IPPROTO_TCP, 0);
+   }
+}
+
+/**
+ * struct iwl_lso_splitter - state of the split.
+ * @linear_payload_len: The length of the payload inside the header of the
+ * original GSO skb.
+ * @gso_frag_num: The fragment number from which to take the data in the
+ * original GSO skb.
+ * @gso_payload_len: The length of the payload in the original GSO skb.
+ * @gso_payload_pos: The incrementing position in the payload of the original
+ * GSO skb.
+ * @gso_offset_in_page: The offset in the page of gso_frag_num.
+ * @gso_current_frag_size: The size of gso_frag_num.
+ * @gso_offset_in_frag: The offset in the gso_frag_num.
+ * @frag_in_mpdu: The index of the frag inside the new (split) MPDU.
+ * @mss: The maximal segment size.
+ * @si: Points to the the shared info of the original GSO skb.
+ * @ieee80211_hdr *hdr: Points to the WiFi header.
+ * @gso_nr_frags: The number of frags in the original GSO skb.
+ * @wifi_hdr_iv_len: The length of the WiFi header including IV.
+ * @tcp_fin: True if TCP_FIN is set in the original GSO skb.
+ * @tcp_push: True if TCP_PSH is set in the original GSO skb.
+ */
+struct iwl_lso_splitter {
+   unsigned int linear_payload_len;
+   unsigned int gso_frag_num;
+   unsigned int gso_payload_len;
+   unsigned int gso_payload_pos;
+   unsigned int gso_offset_in_page;
+   unsigned int gso_current_frag_size;
+   unsigned int gso_offset_in_frag;
+   unsigned int frag_in_mpdu;
+   unsigned int mss;
+   struct skb_shared_info *si;
+   struct ieee80211_hdr *hdr;
+   u8 gso_nr_frags;
+   u8 wifi_hdr_iv_len;
+   bool tcp_fin;
+   bool tcp_push;
+};
+
+/*
+ * Adds a TCP segment from skb_gso to skb. All the state is taken from
+ * and fed back to p. This function takes care about the payload only.
+ * This MSDU might already have msdu_sz bytes of payload that come from
+ * the original GSO skb's header.
+ */
+static unsigned int
+iwl_add_tcp_segment(struct iwl_mvm *mvm, struct sk_buff *skb_gso,
+   struct sk_buff *skb, struct iwl_lso_splitter *p,
+   unsigned int msdu_sz)
+{
+   while (msdu_sz  p-mss) {
+   unsigned int frag_sz =
+   min_t(unsigned int, p-gso_current_frag_size,
+ p-mss - msdu_sz);
+
+   if (p-frag_in_mpdu = mvm-trans-max_skb_frags)
+   return msdu_sz;
+
+   skb_add_rx_frag(skb, p-frag_in_mpdu,
+   skb_frag_page(p-si-frags[p-gso_frag_num]),
+   p-gso_offset_in_page, frag_sz, 0);
+
+   /* We just added one frag to the mpdu

[RFC v3 2/3] iwlwifi: mvm: allow to create A-MSDUs from a large send

2015-08-20 Thread Emmanuel Grumbach
Now that we can get a big chunk of data from the network
stack, we can create an A-MSDU out of it. The purpose is to
get a throughput improvement since sending one single A-MSDU
is more efficient than sending several MSDUs at least under
ideal link conditions.

type=feature

Change-Id: I5ea1b1132a57542187cd4c34c5299dbf44fe8b01
Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com
---
 drivers/net/wireless/iwlwifi/mvm/mac80211.c |   3 +-
 drivers/net/wireless/iwlwifi/mvm/sta.c  |   4 +-
 drivers/net/wireless/iwlwifi/mvm/sta.h  |   6 +-
 drivers/net/wireless/iwlwifi/mvm/tx.c   | 159 ++--
 4 files changed, 160 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c 
b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
index 3dd4e97..dd15e04 100644
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -925,7 +925,8 @@ static int iwl_mvm_mac_ampdu_action(struct ieee80211_hw *hw,
ret = iwl_mvm_sta_tx_agg_flush(mvm, vif, sta, tid);
break;
case IEEE80211_AMPDU_TX_OPERATIONAL:
-   ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid, buf_size);
+   ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid,
+ buf_size, amsdu);
break;
default:
WARN_ON_ONCE(1);
diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.c 
b/drivers/net/wireless/iwlwifi/mvm/sta.c
index df216cd..606fc09 100644
--- a/drivers/net/wireless/iwlwifi/mvm/sta.c
+++ b/drivers/net/wireless/iwlwifi/mvm/sta.c
@@ -976,7 +976,8 @@ int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct 
ieee80211_vif *vif,
 }
 
 int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
-   struct ieee80211_sta *sta, u16 tid, u8 buf_size)
+   struct ieee80211_sta *sta, u16 tid, u8 buf_size,
+   bool amsdu)
 {
struct iwl_mvm_sta *mvmsta = iwl_mvm_sta_from_mac80211(sta);
struct iwl_mvm_tid_data *tid_data = mvmsta-tid_data[tid];
@@ -995,6 +996,7 @@ int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct 
ieee80211_vif *vif,
queue = tid_data-txq_id;
tid_data-state = IWL_AGG_ON;
mvmsta-agg_tids |= BIT(tid);
+   tid_data-amsdu_in_ampdu_allowed = amsdu;
tid_data-ssn = 0x;
spin_unlock_bh(mvmsta-lock);
 
diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.h 
b/drivers/net/wireless/iwlwifi/mvm/sta.h
index eedb215..26d1e31 100644
--- a/drivers/net/wireless/iwlwifi/mvm/sta.h
+++ b/drivers/net/wireless/iwlwifi/mvm/sta.h
@@ -258,6 +258,8 @@ enum iwl_mvm_agg_state {
  * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA).
  * @reduced_tpc: Reduced tx power. Holds the data between the
  * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA).
+ * @amsdu_in_ampdu_allowed: true if A-MSDU in A-MPDU is allowed. Relevant only
+ * if state is %IWL_AGG_ON.
  * @state: state of the BA agreement establishment / tear down.
  * @txq_id: Tx queue used by the BA session
  * @ssn: the first packet to be sent in AGG HW queue in Tx AGG start flow, or
@@ -272,6 +274,7 @@ struct iwl_mvm_tid_data {
/* The rest is Tx AGG related */
u32 rate_n_flags;
u8 reduced_tpc;
+   bool amsdu_in_ampdu_allowed;
enum iwl_mvm_agg_state state;
u16 txq_id;
u16 ssn;
@@ -387,7 +390,8 @@ int iwl_mvm_sta_rx_agg(struct iwl_mvm *mvm, struct 
ieee80211_sta *sta,
 int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
struct ieee80211_sta *sta, u16 tid, u16 *ssn);
 int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
-   struct ieee80211_sta *sta, u16 tid, u8 buf_size);
+   struct ieee80211_sta *sta, u16 tid, u8 buf_size,
+   bool amsdu);
 int iwl_mvm_sta_tx_agg_stop(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
struct ieee80211_sta *sta, u16 tid);
 int iwl_mvm_sta_tx_agg_flush(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index a63686c..5046833 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -488,8 +488,10 @@ static void iwl_update_ip_tcph(void *iph, struct tcphdr 
*tcph, bool ipv6,
  * @ieee80211_hdr *hdr: Points to the WiFi header.
  * @gso_nr_frags: The number of frags in the original GSO skb.
  * @wifi_hdr_iv_len: The length of the WiFi header including IV.
+ * @amsdu_pad: Number of bytes for the A-MSDU subframe
  * @tcp_fin: True if TCP_FIN is set in the original GSO skb.
  * @tcp_push: True if TCP_PSH is set in the original GSO skb.
+ * @amsdu: True if we are building an A-MSDU
  */
 struct iwl_lso_splitter {
unsigned int

[RFC v3 0/3] add TSO / A-MSDU TX for iwlwifi

2015-08-20 Thread Emmanuel Grumbach
An A-MSDU is a 802.11 frame that comprises several L3
segments:
Each subframe has a ETH / SNAP / IP / TCP header (in case 
TCP is used of course). Each subframe has no 802.11
limitation on its length (only the MSS limitation), but
the whole A-MSDU has a size limit both in number of
subframes, and in bytes. In the best conditions, the
A-MSDU is unlimited in number of subframes, and can be up
to 11K byte long. An A-MSDU has only one WiFi MAC address,
and hence, each A-MSDU can be sent to only one WiFi peer.

There a 2 ways to get A-MSDU:
 * xmit_more
 * LSO
 * LSO + skb_gso_segment

The problem with xmit_more is that in AP mode, the Qdisc
can have packets for several L2 clients which makes is
harder to use since as stated above, an A-MSDU can be sent
to one single WiFi peer.

The problem with LSO is that the LSO packet is very likely
to be way bigger than the maximal A-MSDU length. This means
that a single LSO packet will generate several WiFi packets.
From an arch point of view, having different WiFi packets
sitting in one single skb is a problem, so we need to create
several skbs out of one single LSO packet. This makes the
usage of currently existing helpers (net/core/tso.c) very
hard.

Using LSO and skb_gso_segment is a bit wasteful since we
don't need an skb for each MSS, so that we would have to
reassemble the segs into one single packet. This is
sub-efficient.

Out of the two aforementioned ways, I chose to work with
LSO and implemented the segmentation in the driver itself.
While this code can technically be made driver agnostic,
the trend in the industry seems to show that more and more
vendors split the buffers in the firmware, so that I don't
expect to see any new devices coming up with the same
behavior as ours. I did take a look on currently existing
drivers and they already do the work in firmware.
If someone in linux-wireless thinks that this code can
serve his purposes, please speak up. All you'd need is to
support TX_CSUM and SG.

The additional headers needed (SNAP / IP / TCP) are copied
and updated on a page which is allocated in xmit. This is
supposed to very cheap. I add skb frags from that page to 
add the skb. Since I can have several 802.11 packets from a
single LSO packet, I allocated skb's on the way if needed.

I am quite a newbie in skb handling, so I guess that this
code can be improved. I have tested it decently using iperf,
but this doesn't mean that there are no issues using other
applications. We are enabling pktgen on TCP (using patches
that were sent a year ago or so) to test the different
layouts of the skb (payload partition amongst the header
and the different frags).

Changes since v2:
*
   + I fixed the whitespace problem spotted by Sergei.
   + I changed the D'tor check for truesize accounting.
 Since the D'tor seems to be set to tcp_wfree if it
 exists, I just the D'tor not being NULL.

Emmanuel Grumbach (3):
  iwlwifi: mvm: add real TSO implementation
  iwlwifi: mvm: allow to create A-MSDUs from a large send
  iwlwifi: mvm: transfer the truesize to the last TSO segment

 drivers/net/wireless/iwlwifi/mvm/mac80211.c |   3 +-
 drivers/net/wireless/iwlwifi/mvm/sta.c  |   4 +-
 drivers/net/wireless/iwlwifi/mvm/sta.h  |   6 +-
 drivers/net/wireless/iwlwifi/mvm/tx.c   | 669 ++--
 4 files changed, 647 insertions(+), 35 deletions(-)

-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v3 3/3] iwlwifi: mvm: transfer the truesize to the last TSO segment

2015-08-20 Thread Emmanuel Grumbach
This allows to release the backpressure on the socket only
when the last segment is released.
Now the truesize looks like this:
if the truesize of the original skb is 65420, all the
segments will have a truesize of 704 (skb itself) and the
last one will have 65420.

Change-Id: I3c894cf2afc0aedfe7b2a5b992ba41653ff79c0e
Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index 5046833..2aeb5fd 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -764,7 +764,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb_gso,
bool ipv6 = skb_shinfo(skb_gso)-gso_type  SKB_GSO_TCPV6;
struct iwl_lso_splitter s = {};
struct page *hdr_page;
-   unsigned int mpdu_sz;
+   unsigned int mpdu_sz, sum_truesize = 0;
u8 *hdr_page_pos, *qc, tid;
int i, ret;
 
@@ -898,6 +898,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb_gso,
mpdu_sz, tcp_hdrlen(skb_gso));
 
__skb_queue_tail(mpdus_skb, skb_gso);
+   sum_truesize += skb_gso-truesize;
 
/* mss bytes have been consumed from the data */
s.gso_payload_pos = s.mss;
@@ -1034,6 +1035,20 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb_gso,
}
 
__skb_queue_tail(mpdus_skb, skb);
+   sum_truesize += skb-truesize;
+   }
+
+   /* Release the backpressure on the socket only when
+* the last segment is released.
+*/
+   if (skb_gso-destructor) {
+   struct sk_buff *tail = mpdus_skb-prev;
+
+   swap(tail-truesize, skb_gso-truesize);
+   swap(tail-destructor, skb_gso-destructor);
+   swap(tail-sk, skb_gso-sk);
+   atomic_add(sum_truesize - skb_gso-truesize,
+  skb_gso-sk-sk_wmem_alloc);
}
 
ret = 0;
-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v2 1/3] iwlwifi: mvm: add real TSO implementation

2015-08-19 Thread Emmanuel Grumbach
The segmentation is done completely in software. The
driver creates several MPDUs out of a single large send.
Each MPDU is a newly allocated SKB.
A page is allocated to create the headers that need to be
duplicated (SNAP / IP / TCP). The WiFi header is in the
header of the newly created SKBs.

type=feature

Change-Id: I238ffa79cacc5bbdacdfbf3e9673c8d4f02b462a
Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 513 +++---
 1 file changed, 481 insertions(+), 32 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index 90f0ea1..a63686c 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -65,6 +65,7 @@
 #include linux/ieee80211.h
 #include linux/etherdevice.h
 #include net/tcp.h
+#include net/ip.h
 
 #include iwl-trans.h
 #include iwl-eeprom-parse.h
@@ -435,32 +436,471 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct 
sk_buff *skb)
return 0;
 }
 
+/*
+ * Update the IP / TCP headers and recompute the IP header CSUM +
+ * pseudo header CSUM.
+ */
+static void iwl_update_ip_tcph(void *iph, struct tcphdr *tcph, bool ipv6,
+  unsigned int len, unsigned int tcp_seq_offset,
+  u16 num_segment)
+{
+   be32_add_cpu(tcph-seq, tcp_seq_offset);
+
+   if (ipv6) {
+   struct ipv6hdr *iphv6 = iph;
+
+   iphv6-payload_len = cpu_to_be16(len + tcph-doff * 4);
+
+   /* Compute CSUM on the the pseudo-header */
+   tcph-check = ~csum_ipv6_magic(iphv6-saddr, iphv6-daddr,
+  len + tcph-doff * 4,
+  IPPROTO_TCP, 0);
+   } else {
+   struct iphdr *iphv4 = iph;
+
+   iphv4-tot_len =
+   cpu_to_be16(len + tcph-doff * 4 + iphv4-ihl * 4);
+   be16_add_cpu(iphv4-id, num_segment);
+   ip_send_check(iphv4);
+
+   /* Compute CSUM on the the pseudo-header */
+   tcph-check = ~csum_tcpudp_magic(iphv4-saddr, iphv4-daddr,
+len + tcph-doff * 4,
+IPPROTO_TCP, 0);
+   }
+}
+
+/**
+ * struct iwl_lso_splitter - state of the split.
+ * @linear_payload_len: The length of the payload inside the header of the
+ * original GSO skb.
+ * @gso_frag_num: The fragment number from which to take the data in the
+ * original GSO skb.
+ * @gso_payload_len: The length of the payload in the original GSO skb.
+ * @gso_payload_pos: The incrementing position in the payload of the original
+ * GSO skb.
+ * @gso_offset_in_page: The offset in the page of gso_frag_num.
+ * @gso_current_frag_size: The size of gso_frag_num.
+ * @gso_offset_in_frag: The offset in the gso_frag_num.
+ * @frag_in_mpdu: The index of the frag inside the new (split) MPDU.
+ * @mss: The maximal segment size.
+ * @si: Points to the the shared info of the original GSO skb.
+ * @ieee80211_hdr *hdr: Points to the WiFi header.
+ * @gso_nr_frags: The number of frags in the original GSO skb.
+ * @wifi_hdr_iv_len: The length of the WiFi header including IV.
+ * @tcp_fin: True if TCP_FIN is set in the original GSO skb.
+ * @tcp_push: True if TCP_PSH is set in the original GSO skb.
+ */
+struct iwl_lso_splitter {
+   unsigned int linear_payload_len;
+   unsigned int gso_frag_num;
+   unsigned int gso_payload_len;
+   unsigned int gso_payload_pos;
+   unsigned int gso_offset_in_page;
+   unsigned int gso_current_frag_size;
+   unsigned int gso_offset_in_frag;
+   unsigned int frag_in_mpdu;
+   unsigned int mss;
+   struct skb_shared_info *si;
+   struct ieee80211_hdr *hdr;
+   u8 gso_nr_frags;
+   u8 wifi_hdr_iv_len;
+   bool tcp_fin;
+   bool tcp_push;
+};
+
+/*
+ * Adds a TCP segment from skb_gso to skb. All the state is taken from
+ * and fed back to p. This function takes care about the payload only.
+ * This MSDU might already have msdu_sz bytes of payload that come from
+ * the original GSO skb's header.
+ */
+static unsigned int
+iwl_add_tcp_segment(struct iwl_mvm *mvm, struct sk_buff *skb_gso,
+   struct sk_buff *skb, struct iwl_lso_splitter *p,
+   unsigned int msdu_sz)
+{
+   while (msdu_sz  p-mss) {
+   unsigned int frag_sz =
+   min_t(unsigned int, p-gso_current_frag_size,
+ p-mss - msdu_sz);
+
+   if (p-frag_in_mpdu = mvm-trans-max_skb_frags)
+   return msdu_sz;
+
+   skb_add_rx_frag(skb, p-frag_in_mpdu,
+   skb_frag_page(p-si-frags[p-gso_frag_num]),
+   p-gso_offset_in_page, frag_sz, 0);
+
+   /* We just added one frag to the mpdu

[RFC v2 0/3] add TSO / A-MSDU TX for iwlwifi

2015-08-19 Thread Emmanuel Grumbach
We enable TSO to get a lot of data at once to build A-MSDUs.
Our hardware doesn't have (yet) TCP CSUM offload, so we do
it manually. TSO won't be enabled on hardware that don't
support CSUM offload in release code, computing TCP CSUM
in the driver is just a way to start coding the flows.
This is why the CSUM offload implementation in the driver
in so bad in terms of efficiency. I preferred to have the
flows as close as they will be when the hardware will be
able to the CSUM than to try to seek efficiency.
The hardware that will have CSUM offload will still require
the driver to split the skb in software including the
IP / TCP header copy and update etc...

We could have enabled A-MSDU based on xmit-more, but the
rationale of using LSO is that when using pfifo-fast,
the Qdisc gets one packet and dequeues is straight away
which limits the possibility to get a lot of packets at
once. (Am I right here?).

A note about A-MSDUs for non-wireless people:
*
An A-MSDU is a aggregated frame. It is one big 802.11
packet that contains several subframes. Each subframe
is a TCP segment. One A-MSDU is represented by one single
skb which means that we need to copy / duplicate the TCP
/ IP / SNAP headers in one single skb. This is why those
headers are copied to a separate page: that page is added
multiple times to the skb with different offsets. Each
subframes needs at least 2 frags: 1 for the headers, 1 (or
more) for the payload.

I am quite a newbie in skb handling, so I guess that this
code can be improved. I have tested it decently using iperf,
but this doesn't mean that there are no issues using other
applications. We are enabling pktgen on TCP (using patches
that were sent a year ago or so) to test the different
layouts of the skb (payload partition amongst the header
and the different frags).

I'll be very happy to get comments on that code, this is
why I am sending it to netdev as well since the TSO experts
are there :)

Emmanuel Grumbach (3):
  iwlwifi: mvm: add real TSO implementation
  iwlwifi: mvm: allow to create A-MSDUs from a large send
  iwlwifi: mvm: transfer the truesize to the last TSO segment

 drivers/net/wireless/iwlwifi/mvm/mac80211.c |   3 +-
 drivers/net/wireless/iwlwifi/mvm/sta.c  |   4 +-
 drivers/net/wireless/iwlwifi/mvm/sta.h  |   6 +-
 drivers/net/wireless/iwlwifi/mvm/tx.c   | 669 ++--
 4 files changed, 647 insertions(+), 35 deletions(-)

-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v2 3/3] iwlwifi: mvm: transfer the truesize to the last TSO segment

2015-08-19 Thread Emmanuel Grumbach
This allows to release the backpressure on the socket only
when the last segment is released.
Now the truesize looks like this:
if the truesize of the original skb is 65420, all the
segments will have a truesize of 704 (skb itself) and the
last one will have 65420.

Change-Id: I3c894cf2afc0aedfe7b2a5b992ba41653ff79c0e
Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com
---
 drivers/net/wireless/iwlwifi/mvm/tx.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index 5046833..046e50d 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -764,7 +764,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb_gso,
bool ipv6 = skb_shinfo(skb_gso)-gso_type  SKB_GSO_TCPV6;
struct iwl_lso_splitter s = {};
struct page *hdr_page;
-   unsigned int mpdu_sz;
+   unsigned int mpdu_sz, sum_truesize = 0;
u8 *hdr_page_pos, *qc, tid;
int i, ret;
 
@@ -898,6 +898,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb_gso,
mpdu_sz, tcp_hdrlen(skb_gso));
 
__skb_queue_tail(mpdus_skb, skb_gso);
+   sum_truesize += skb_gso-truesize;
 
/* mss bytes have been consumed from the data */
s.gso_payload_pos = s.mss;
@@ -1034,6 +1035,20 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct 
sk_buff *skb_gso,
}
 
__skb_queue_tail(mpdus_skb, skb);
+   sum_truesize += skb-truesize;
+   }
+
+   /* Release the backpressure on the socket only when
+* the last segment is released.
+*/
+   if (skb_gso-destructor == sock_wfree) {
+   struct sk_buff *tail = mpdus_skb-prev;
+
+   swap(tail-truesize, skb_gso-truesize);
+   swap(tail-destructor, skb_gso-destructor);
+   swap(tail-sk, skb_gso-sk);
+atomic_add(sum_truesize - skb_gso-truesize,
+   skb_gso-sk-sk_wmem_alloc);
}
 
ret = 0;
-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v2 2/3] iwlwifi: mvm: allow to create A-MSDUs from a large send

2015-08-19 Thread Emmanuel Grumbach
Now that we can get a big chunk of data from the network
stack, we can create an A-MSDU out of it. The purpose is to
get a throughput improvement since sending one single A-MSDU
is more efficient than sending several MSDUs at least under
ideal link conditions.

type=feature

Change-Id: I5ea1b1132a57542187cd4c34c5299dbf44fe8b01
Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com
---
 drivers/net/wireless/iwlwifi/mvm/mac80211.c |   3 +-
 drivers/net/wireless/iwlwifi/mvm/sta.c  |   4 +-
 drivers/net/wireless/iwlwifi/mvm/sta.h  |   6 +-
 drivers/net/wireless/iwlwifi/mvm/tx.c   | 159 ++--
 4 files changed, 160 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c 
b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
index 3dd4e97..dd15e04 100644
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -925,7 +925,8 @@ static int iwl_mvm_mac_ampdu_action(struct ieee80211_hw *hw,
ret = iwl_mvm_sta_tx_agg_flush(mvm, vif, sta, tid);
break;
case IEEE80211_AMPDU_TX_OPERATIONAL:
-   ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid, buf_size);
+   ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid,
+ buf_size, amsdu);
break;
default:
WARN_ON_ONCE(1);
diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.c 
b/drivers/net/wireless/iwlwifi/mvm/sta.c
index df216cd..606fc09 100644
--- a/drivers/net/wireless/iwlwifi/mvm/sta.c
+++ b/drivers/net/wireless/iwlwifi/mvm/sta.c
@@ -976,7 +976,8 @@ int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct 
ieee80211_vif *vif,
 }
 
 int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
-   struct ieee80211_sta *sta, u16 tid, u8 buf_size)
+   struct ieee80211_sta *sta, u16 tid, u8 buf_size,
+   bool amsdu)
 {
struct iwl_mvm_sta *mvmsta = iwl_mvm_sta_from_mac80211(sta);
struct iwl_mvm_tid_data *tid_data = mvmsta-tid_data[tid];
@@ -995,6 +996,7 @@ int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct 
ieee80211_vif *vif,
queue = tid_data-txq_id;
tid_data-state = IWL_AGG_ON;
mvmsta-agg_tids |= BIT(tid);
+   tid_data-amsdu_in_ampdu_allowed = amsdu;
tid_data-ssn = 0x;
spin_unlock_bh(mvmsta-lock);
 
diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.h 
b/drivers/net/wireless/iwlwifi/mvm/sta.h
index eedb215..26d1e31 100644
--- a/drivers/net/wireless/iwlwifi/mvm/sta.h
+++ b/drivers/net/wireless/iwlwifi/mvm/sta.h
@@ -258,6 +258,8 @@ enum iwl_mvm_agg_state {
  * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA).
  * @reduced_tpc: Reduced tx power. Holds the data between the
  * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA).
+ * @amsdu_in_ampdu_allowed: true if A-MSDU in A-MPDU is allowed. Relevant only
+ * if state is %IWL_AGG_ON.
  * @state: state of the BA agreement establishment / tear down.
  * @txq_id: Tx queue used by the BA session
  * @ssn: the first packet to be sent in AGG HW queue in Tx AGG start flow, or
@@ -272,6 +274,7 @@ struct iwl_mvm_tid_data {
/* The rest is Tx AGG related */
u32 rate_n_flags;
u8 reduced_tpc;
+   bool amsdu_in_ampdu_allowed;
enum iwl_mvm_agg_state state;
u16 txq_id;
u16 ssn;
@@ -387,7 +390,8 @@ int iwl_mvm_sta_rx_agg(struct iwl_mvm *mvm, struct 
ieee80211_sta *sta,
 int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
struct ieee80211_sta *sta, u16 tid, u16 *ssn);
 int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
-   struct ieee80211_sta *sta, u16 tid, u8 buf_size);
+   struct ieee80211_sta *sta, u16 tid, u8 buf_size,
+   bool amsdu);
 int iwl_mvm_sta_tx_agg_stop(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
struct ieee80211_sta *sta, u16 tid);
 int iwl_mvm_sta_tx_agg_flush(struct iwl_mvm *mvm, struct ieee80211_vif *vif,
diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c 
b/drivers/net/wireless/iwlwifi/mvm/tx.c
index a63686c..5046833 100644
--- a/drivers/net/wireless/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/iwlwifi/mvm/tx.c
@@ -488,8 +488,10 @@ static void iwl_update_ip_tcph(void *iph, struct tcphdr 
*tcph, bool ipv6,
  * @ieee80211_hdr *hdr: Points to the WiFi header.
  * @gso_nr_frags: The number of frags in the original GSO skb.
  * @wifi_hdr_iv_len: The length of the WiFi header including IV.
+ * @amsdu_pad: Number of bytes for the A-MSDU subframe
  * @tcp_fin: True if TCP_FIN is set in the original GSO skb.
  * @tcp_push: True if TCP_PSH is set in the original GSO skb.
+ * @amsdu: True if we are building an A-MSDU
  */
 struct iwl_lso_splitter {
unsigned int

Re: [PATCH 1/2] iwlwifi: convert hex_dump_to_buffer() to %*ph

2015-08-04 Thread Emmanuel Grumbach
On Tue, Aug 4, 2015 at 1:47 PM, Andy Shevchenko
andriy.shevche...@linux.intel.com wrote:
 On Thu, 2015-07-16 at 15:42 +0300, Andy Shevchenko wrote:
 There is no need to use hex_dump_to_buffer() in the cases like this:

   hexdump_to_buffer(buf, len, 16, 1, outbuf, outlen, false);   /* 
 len = 16
 */
   sprintf(%s\n, outbuf);

 since it maybe easily converted to simple:

   sprintf(%*ph\n, len, buf);

 Note: it seems in one case the output is groupped by 2 bytes and
 looks like a
 typo. Thus, patch changes that to plain byte stream.


 Any comments on this, anyone?

 The idea is to minimize usage of hexdump_to_buffer(), i.e. not using it
 when it's an obvious overkill.


Sorry for the delay, since the buffer is small (less than 16 bytes, it
should be fine).
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] iwlwifi: convert hex_dump_to_buffer() to %*ph

2015-08-04 Thread Emmanuel Grumbach
On Tue, Aug 4, 2015 at 2:48 PM, Emmanuel Grumbach egrumb...@gmail.com wrote:
 On Tue, Aug 4, 2015 at 1:47 PM, Andy Shevchenko
 andriy.shevche...@linux.intel.com wrote:
 On Thu, 2015-07-16 at 15:42 +0300, Andy Shevchenko wrote:
 There is no need to use hex_dump_to_buffer() in the cases like this:

   hexdump_to_buffer(buf, len, 16, 1, outbuf, outlen, false);   /* 
 len = 16
 */
   sprintf(%s\n, outbuf);

 since it maybe easily converted to simple:

   sprintf(%*ph\n, len, buf);

 Note: it seems in one case the output is groupped by 2 bytes and
 looks like a
 typo. Thus, patch changes that to plain byte stream.


 Any comments on this, anyone?

 The idea is to minimize usage of hexdump_to_buffer(), i.e. not using it
 when it's an obvious overkill.


 Sorry for the delay, since the buffer is small (less than 16 bytes, it
 should be fine).

I applied it on our internal tree. Thank you.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] net/wireless: enable wiphy device to suspend/resume asynchronously

2015-07-29 Thread Emmanuel Grumbach
On Thu, Jul 30, 2015 at 8:18 AM, Fu, Zhonghui
zhonghui...@linux.intel.com wrote:
 Enable wiphy device to suspend/resume asynchronously. This can improve
 system suspend/resume speed.


How will that impact the timing with respect to the suspend call
coming from the bus?
I think that a few drivers rely on the suspend call of the wiphy
device happening before the suspend call to the bus device.
Not sure though.

 Signed-off-by: Zhonghui Fu zhonghui...@linux.intel.com
 ---
  net/wireless/core.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

 diff --git a/net/wireless/core.c b/net/wireless/core.c
 index 2a0bbd2..bc5e68f 100644
 --- a/net/wireless/core.c
 +++ b/net/wireless/core.c
 @@ -416,6 +416,7 @@ use_default_name:
 device_initialize(rdev-wiphy.dev);
 rdev-wiphy.dev.class = ieee80211_class;
 rdev-wiphy.dev.platform_data = rdev;
 +   device_enable_async_suspend(rdev-wiphy.dev);

 INIT_LIST_HEAD(rdev-destroy_list);
 spin_lock_init(rdev-destroy_list_lock);
 -- 1.7.1

 --
 To unsubscribe from this list: send the line unsubscribe linux-wireless in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] iwlwifi: Deinline iwl_{read,write}{8,32}

2015-07-15 Thread Emmanuel Grumbach
On Tue, Jul 14, 2015 at 3:41 PM, Denys Vlasenko
vda.li...@googlemail.com wrote:
 On Tue, Jul 14, 2015 at 2:38 PM, Sergei Shtylyov
 sergei.shtyl...@cogentembedded.com wrote:
 +#define IWL_READ_WRITE(static_inline) \
 +static_inline void iwl_write8(struct iwl_trans *trans, u32 ofs, u8 val) \
 +{ \
 +   trace_iwlwifi_dev_iowrite8(trans-dev, ofs, val); \
 +   iwl_trans_write8(trans, ofs, val); \
 +} \
 [...]

 +#if !defined(CONFIG_IWLWIFI_DEVICE_TRACING)
 +IWL_READ_WRITE(static inline)

Not static_inline?

 Yes. Here we are putting two words, static inline, in front of every
 function definition.
 --

I'll try to come up with a patch that is easier for me to read, but I
am really busy right now. Ping me in a week if you have heard from me
earlier.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mac80211: Use kstrdup to simplify code

2015-07-08 Thread Emmanuel Grumbach
The subject is wrong. You are not patch mac80211, but bcrmsmac.

On Wed, Jul 8, 2015 at 10:32 PM, Christophe JAILLET
christophe.jail...@wanadoo.fr wrote:
 Replace a kmalloc+strcpy by an equivalent kstrdup in order to improve
 readability.

 Signed-off-by: Christophe JAILLET christophe.jail...@wanadoo.fr
 ---
  drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

 diff --git a/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c 
 b/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
 index 4813506..8a6c077 100644
 --- a/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
 +++ b/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
 @@ -1476,9 +1476,7 @@ struct brcms_timer *brcms_init_timer(struct brcms_info 
 *wl,
 wl-timers = t;

  #ifdef DEBUG
 -   t-name = kmalloc(strlen(name) + 1, GFP_ATOMIC);
 -   if (t-name)
 -   strcpy(t-name, name);
 +   t-name = kstrdup(name, GFP_ATOMIC);
  #endif

 return t;
 --
 2.1.4

 --
 To unsubscribe from this list: send the line unsubscribe linux-wireless in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html