Re: [PATCH 0/6] Constant Time Memory Comparisons Are Important
On Mon, Jun 12, 2017 at 12:30 AM, Emil Lenngren <emil.lenng...@gmail.com> wrote: > 2017-06-11 22:48 GMT+02:00 Emmanuel Grumbach <egrumb...@gmail.com>: >> On Sun, Jun 11, 2017 at 4:36 PM, Kees Cook <keesc...@chromium.org> wrote: >>> >>> On Sun, Jun 11, 2017 at 1:13 AM, Kalle Valo <kv...@codeaurora.org> wrote: >>> > "Jason A. Donenfeld" <ja...@zx2c4.com> writes: >>> > >>> >> Whenever you're comparing two MACs, it's important to do this using >>> >> crypto_memneq instead of memcmp. With memcmp, you leak timing >>> >> information, >>> >> which could then be used to iteratively forge a MAC. >>> > >>> > Do you have any pointers where I could learn more about this? >>> >>> While not using C specifically, this talks about the problem generally: >>> https://www.chosenplaintext.ca/articles/beginners-guide-constant-time-cryptography.html >>> >> >> Sorry for the stupid question, but the MAC address is in plaintext in >> the air anyway or easily accessible via user space tools. I fail to >> see what it is so secret about a MAC address in that code where that >> same MAC address is accessible via myriads of ways. > > I think you're mixing up Media Access Control (MAC) addresses with > Message Authentication Code (MAC). The second one is a cryptographic > signature of a message. Obviously... Sorry for the noise.
Re: [PATCH 0/6] Constant Time Memory Comparisons Are Important
On Sun, Jun 11, 2017 at 4:36 PM, Kees Cookwrote: > > On Sun, Jun 11, 2017 at 1:13 AM, Kalle Valo wrote: > > "Jason A. Donenfeld" writes: > > > >> Whenever you're comparing two MACs, it's important to do this using > >> crypto_memneq instead of memcmp. With memcmp, you leak timing information, > >> which could then be used to iteratively forge a MAC. > > > > Do you have any pointers where I could learn more about this? > > While not using C specifically, this talks about the problem generally: > https://www.chosenplaintext.ca/articles/beginners-guide-constant-time-cryptography.html > Sorry for the stupid question, but the MAC address is in plaintext in the air anyway or easily accessible via user space tools. I fail to see what it is so secret about a MAC address in that code where that same MAC address is accessible via myriads of ways.
Re: [PATCH RESEND] iwlwifi, Do not implement thermal zone unless ucode is loaded
On Mon, Jul 11, 2016 at 6:18 PM, Prarit Bhargava <pra...@redhat.com> wrote: > > Didn't get any feedback or review comments on this patch. Resending ... > > P. This change is obviously completely broken. It simply disables the registration to thermal zone core. > > ---8<--- > > The iwlwifi driver implements a thermal zone and hwmon device, but > returns -EIO on temperature reads if the firmware isn't loaded. This > results in the error > > iwlwifi-virtual-0 > Adapter: Virtual device > ERROR: Can't get value of subfeature temp1_input: I/O error > temp1:N/A > > being output when using sensors from the lm-sensors package. Since > the temperature cannot be read unless the ucode is loaded there is no > reason to add the interface only to have it return an error 100% of > the time. > > This patch moves the firmware check to iwl_mvm_thermal_zone_register() and > stops the thermal zone from being created if the ucode hasn't been loaded. > > Signed-off-by: Prarit Bhargava <pra...@redhat.com> > Cc: Johannes Berg <johannes.b...@intel.com> > Cc: Emmanuel Grumbach <emmanuel.grumb...@intel.com> > Cc: Luca Coelho <luciano.coe...@intel.com> > Cc: Intel Linux Wireless <linuxw...@intel.com> > Cc: Kalle Valo <kv...@codeaurora.org> > Cc: Chaya Rachel Ivgi <chaya.rachel.i...@intel.com> > Cc: Sara Sharon <sara.sha...@intel.com> > Cc: linux-wirel...@vger.kernel.org > Cc: netdev@vger.kernel.org > --- > drivers/net/wireless/intel/iwlwifi/mvm/tt.c | 13 +++-- > 1 file changed, 3 insertions(+), 10 deletions(-) > > diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > index 58fc7b3c711c..64802659711f 100644 > --- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c > @@ -634,11 +634,6 @@ static int iwl_mvm_tzone_get_temp(struct > thermal_zone_device *device, > > mutex_lock(>mutex); > > - if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) { > - ret = -EIO; > - goto out; > - } > - > ret = iwl_mvm_get_temp(mvm, ); > if (ret) > goto out; > @@ -684,11 +679,6 @@ static int iwl_mvm_tzone_set_trip_temp(struct > thermal_zone_device *device, > > mutex_lock(>mutex); > > - if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) { > - ret = -EIO; > - goto out; > - } > - > if (trip < 0 || trip >= IWL_MAX_DTS_TRIPS) { > ret = -EINVAL; > goto out; > @@ -750,6 +740,9 @@ static void iwl_mvm_thermal_zone_register(struct iwl_mvm > *mvm) > return; > } > > + if (!mvm->ucode_loaded || !(mvm->cur_ucode == IWL_UCODE_REGULAR)) > + return; > + > BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH); > > mvm->tz_device.tzone = thermal_zone_device_register(name, > -- > 1.7.9.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT] Networking
On Wed, May 18, 2016 at 4:00 AM, Linus Torvaldswrote: > On Tue, May 17, 2016 at 12:11 PM, David Miller wrote: >> >> Highlights: > > Lowlights: > > 1) the iwlwifi driver seems to be broken > > My laptop that uses the intel 7680 iwlwifi module no longer connects > to the network. It fails with a "Microcode SW error detected." and > spews out register state over and over again. Can we have the register state and the ASSERT / NMI / whatever that goes along with it? This clearly means that the firmware is crashing, but I don't know why, I copied here the lines that I need from another bug with another device with another firmware, but the log that we will still explain what I need: [ 800.880402] iwlwifi :02:00.0: Start IWL Error Log Dump: [ 800.880406] iwlwifi :02:00.0: Status: 0x, count: 6 [ 800.880409] iwlwifi :02:00.0: Loaded firmware version: 21.311951.0 [ 800.880413] iwlwifi :02:00.0: 0x0394 | ADVANCED_SYSASSERT [ 800.880416] iwlwifi :02:00.0: 0x0220 | trm_hw_status0 [ 800.880419] iwlwifi :02:00.0: 0x | trm_hw_status1 [ 800.880422] iwlwifi :02:00.0: 0x0BD8 | branchlink2 [ 800.880425] iwlwifi :02:00.0: 0x00026AC4 | interruptlink1 [ 800.880428] iwlwifi :02:00.0: 0x | interruptlink2 [ 800.880431] iwlwifi :02:00.0: 0x0001 | data1 [ 800.880434] iwlwifi :02:00.0: 0x02039845 | data2 [ 800.880437] iwlwifi :02:00.0: 0x0056 | data3 [ 800.880440] iwlwifi :02:00.0: 0x8E4184A7 | beacon time [ 800.880443] iwlwifi :02:00.0: 0x30E2CB41 | tsf low [ 800.880446] iwlwifi :02:00.0: 0x0027 | tsf hi [ 800.880449] iwlwifi :02:00.0: 0x | time gp1 [ 800.880451] iwlwifi :02:00.0: 0x2F842F8A | time gp2 [ 800.880454] iwlwifi :02:00.0: 0x | uCode revision type [ 800.880457] iwlwifi :02:00.0: 0x0015 | uCode version major [ 800.880460] iwlwifi :02:00.0: 0x0004C28F | uCode version minor [ 800.880463] iwlwifi :02:00.0: 0x0201 | hw version [ 800.880466] iwlwifi :02:00.0: 0x00489008 | board version [ 800.880469] iwlwifi :02:00.0: 0x001C | hcmd [ 800.880472] iwlwifi :02:00.0: 0x24022000 | isr0 [ 800.880475] iwlwifi :02:00.0: 0x0100 | isr1 [ 800.880478] iwlwifi :02:00.0: 0x580A | isr2 [ 800.880481] iwlwifi :02:00.0: 0x4041FCC1 | isr3 [ 800.880483] iwlwifi :02:00.0: 0x | isr4 [ 800.880486] iwlwifi :02:00.0: 0x00800110 | last cmd Id [ 800.880489] iwlwifi :02:00.0: 0x | wait_event [ 800.880492] iwlwifi :02:00.0: 0x02C8 | l2p_control [ 800.880495] iwlwifi :02:00.0: 0x00018030 | l2p_duration [ 800.880498] iwlwifi :02:00.0: 0x00BF | l2p_mhvalid [ 800.880501] iwlwifi :02:00.0: 0x00EF | l2p_addr_match [ 800.880503] iwlwifi :02:00.0: 0x000D | lmpm_pmg_sel [ 800.880506] iwlwifi :02:00.0: 0x30031805 | timestamp [ 800.880509] iwlwifi :02:00.0: 0xE0F0 | flow_handler > > The last thing it says before falling over is: > > wlp1s0: authenticate with xx:xx:xx:xx:xx:xx > wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 1/3) > wlp1s0: send auth to xx:xx:xx:xx:xx:xx (try 2/3) > > and then it goes all titsup. > > I thought that it might be because I had downloaded one of the daily > firmware versions (it calls itself iwlwifi-7260-17.ucode, but isn't a > real release afaik - but it has worked fien for me before), but the > problem persists with the ver-16 ucode too, so that wasn't it. > > I haven't bisected it, but there is absolutely nothing odd in my hardware. > > I do have a 802.11ac network, which apparently not everybody does, > judging by previous bug-reports of mine.. > > Intel iwlwifi people: please check this out. > >Linus > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next v5] rtnetlink: add new RTM_GETSTATS message to dump link stats
On Tue, Apr 19, 2016 at 1:09 PM, Johannes Bergwrote: > On Mon, 2016-04-18 at 23:52 -0400, David Miller wrote: >> From: David Miller >> Date: Mon, 18 Apr 2016 21:48:51 -0400 (EDT) >> >> > I think the time has probably come to have a new netlink attribute >> > format that doesn't have this multi-decade old problem. >> ... >> >> Scratch this whole idea, I think the padding attribute works a lot >> better and is backwards compatible with every properly written >> netlink application. > > This talk about attribute flags reminded me about something else. > > At netconf, we talked about how awkward it can be that one doesn't know > if an attribute was accepted by the kernel or simply ignored because > it's not supported (older kernel version). > > I considered (and Emmanuel has started to cook up a patch for it) > adding a flag here meaning "reject if not parsed" (or so). > > Now, I realize that with all the existing netlink commands one can't > actually rely on that (since it requires some infrastructure), but new > commands in existing families and also entirely new generic netlink > families would allow let userspace rely on such a thing. > > Thoughts? FWIW: diff --git a/include/net/netlink.h b/include/net/netlink.h index 2a5dbcc..2dfd46d 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -663,6 +663,15 @@ static inline int nla_type(const struct nlattr *nla) } /** + * nla_must_parse - returns true if the attribute must be parsed + * @nla: netlink attribute + */ +static inline bool nla_must_parse(const struct nlattr *nla) +{ + return nla->nla_type & NLA_F_NET_MUST_PARSE; +} + +/** * nla_data - head of payload * @nla: netlink attribute */ diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h index 1a85940..e81a8d4 100644 --- a/include/uapi/linux/netlink.h +++ b/include/uapi/linux/netlink.h @@ -165,17 +165,21 @@ struct nlattr { /* * nla_type (16 bits) - * +---+---+---+ - * | N | O | Attribute Type| - * +---+---+---+ + * +---+---+---++ + * | N | O | M | Attribute Type | + * +---+---+---++ * N := Carries nested attributes * O := Payload stored in network byte order + * M := Attribute can't be ignored * * Note: The N and O flag are mutually exclusive. */ #define NLA_F_NESTED (1 << 15) #define NLA_F_NET_BYTEORDER(1 << 14) -#define NLA_TYPE_MASK ~(NLA_F_NESTED | NLA_F_NET_BYTEORDER) +#define NLA_F_NET_MUST_PARSE (1 << 13) +#define NLA_TYPE_MASK ~(NLA_F_NESTED |\ + NLA_F_NET_BYTEORDER | \ + NLA_F_NET_MUST_PARSE) #define NLA_ALIGNTO4 #define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)) diff --git a/lib/nlattr.c b/lib/nlattr.c index f5907d2..fa15e6f 100644 --- a/lib/nlattr.c +++ b/lib/nlattr.c @@ -198,6 +198,10 @@ int nla_parse(struct nlattr **tb, int maxtype, const struct nlattr *head, } tb[type] = (struct nlattr *)nla; + } else if (nla_must_parse(nla)) { + err = -EINVAL; + goto errout; } }
Re: [PATCH] wlcore: spi: add wl18xx support
On Wed, Mar 30, 2016 at 3:58 PM, Eyal Reizerwrote: > > From: Eyal Are you trying to hide or something? :) > > Add support for using with both wl12xx and wl18xx. > > - all wilink family needs special init command for entering wspi mode. > extra clock cycles should be sent after the spi init command while the > cs pin is high. > - switch to controling the cs pin from the spi driver for achieveing the > above. > - the selected cs gpio is read from the spi device-tree node using the > cs-gpios field and setup as a gpio. > - See the example below for specifying the cs gpio using the cs-gpios entry > >{ > status = "okay"; > pinctrl-names = "default"; > pinctrl-0 = <_pins>; > cs-gpios = < 5 0>; > #address-cells = <1>; > #size-cells = <0>; > wlcore: wlcore@0 { > compatible = "ti,wl1835"; > vwlan-supply = <_en_reg>; > spi-max-frequency = <4800>; > reg = <0>; /* chip select 0 on spi0, ie spi0.0 */ > interrupt-parent = <>; > interrupts = <27 IRQ_TYPE_EDGE_RISING>; > }; > }; > > Signed-off-by: Eyal Reizer > --- > drivers/net/wireless/ti/wlcore/spi.c | 176 > ++ > 1 file changed, 157 insertions(+), 19 deletions(-) > > diff --git a/drivers/net/wireless/ti/wlcore/spi.c > b/drivers/net/wireless/ti/wlcore/spi.c > index 96d9c9d..6c5a369 100644 > --- a/drivers/net/wireless/ti/wlcore/spi.c > +++ b/drivers/net/wireless/ti/wlcore/spi.c > @@ -32,6 +32,7 @@ > #include > #include > #include > +#include > > #include "wlcore.h" > #include "wl12xx_80211.h" > @@ -70,16 +71,30 @@ > #define WSPI_MAX_CHUNK_SIZE4092 > > /* > - * only support SPI for 12xx - this code should be reworked when 18xx > - * support is introduced > + * wl18xx driver aggregation buffer size is (13 * PAGE_SIZE) compared to > + * (4 * PAGE_SIZE) for wl12xx, so use the larger buffer needed for wl18xx > */ > -#define SPI_AGGR_BUFFER_SIZE (4 * PAGE_SIZE) > +#define SPI_AGGR_BUFFER_SIZE (13 * PAGE_SIZE) > > /* Maximum number of SPI write chunks */ > #define WSPI_MAX_NUM_OF_CHUNKS \ > ((SPI_AGGR_BUFFER_SIZE / WSPI_MAX_CHUNK_SIZE) + 1) > > > +struct wilink_familiy_data { > + char name[8]; > +}; > + > +const struct wilink_familiy_data *wilink_data; > + > +static const struct wilink_familiy_data wl18xx_data = { > + .name = "wl18xx", > +}; > + > +static const struct wilink_familiy_data wl12xx_data = { > + .name = "wl12xx", > +}; > + > struct wl12xx_spi_glue { > struct device *dev; > struct platform_device *core; > @@ -120,6 +135,8 @@ static void wl12xx_spi_init(struct device *child) > struct spi_transfer t; > struct spi_message m; > u8 *cmd = kzalloc(WSPI_INIT_CMD_LEN, GFP_KERNEL); > + struct spi_device *spi = (struct spi_device *)glue->dev; > + struct spi_master *master = spi->master; > > if (!cmd) { > dev_err(child->parent, > @@ -127,6 +144,15 @@ static void wl12xx_spi_init(struct device *child) > return; > } > > + if (!master->cs_gpios) { > + dev_err(child->parent, > + "spi chip select pin missing in platform data!\n"); > + return; > + } > + > + /* Drive CS line low */ > + gpio_direction_output(master->cs_gpios[0], 0); > + > memset(, 0, sizeof(t)); > spi_message_init(); > > @@ -163,6 +189,26 @@ static void wl12xx_spi_init(struct device *child) > spi_message_add_tail(, ); > > spi_sync(to_spi_device(glue->dev), ); > + > + /* Send extra clocks with CS high. this is required by the wilink > +* family in order for successfully enter WSPI mode > +*/ > + gpio_direction_output(master->cs_gpios[0], 1); > + > + memset(, 0, sizeof(m)); > + spi_message_init(); > + > + cmd[0] = 0xff; > + cmd[1] = 0xff; > + cmd[2] = 0xff; > + cmd[3] = 0xff; > + swab32s((u32 *)cmd); > + > + t.tx_buf = cmd; > + t.len = 4; > + spi_message_add_tail(, ); > + spi_sync(to_spi_device(glue->dev), ); > + > kfree(cmd); > } > > @@ -213,6 +259,16 @@ static int __must_check wl12xx_spi_raw_read(struct > device *child, int addr, > u32 *busy_buf; > u32 *cmd; > u32 chunk_len; > + struct spi_device *spi = (struct spi_device *)glue->dev; > + struct spi_master *master = spi->master; > + > + if (!master->cs_gpios) { > + dev_err(child->parent, > + "spi chip select pin missing in platform data!\n"); > + return -EINVAL; > + } > + /* Drive CS line low */ > + gpio_direction_output(master->cs_gpios[0], 0); > > while (len > 0) { > chunk_len = min_t(size_t,
Re: [PATCH] codel: cast the output of MS2TIME to codel_time_t
On Mon, Feb 22, 2016 at 9:53 PM, David Miller <da...@davemloft.net> wrote: > From: Emmanuel Grumbach <emmanuel.grumb...@intel.com> > Date: Mon, 22 Feb 2016 13:49:47 +0200 > >> This will allow to pass the typecheck in the comparators: >> codel_time_{after,before} >> >> Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> > > I do not see any existing code where this would matter. Maybe it was > not meant to be fed directly? And if you need to you should add the > explicit casts yourself. Well.. This header file is kinda special. As Eric pointed out, I shouldn't include it since it contains huge inline functions. I only really need small parts of it for iwlwifi and I copied those into the driver. In iwlwifi I do need to feed MS2TIME output into the comparators so I guess I'll do that there. I just thought it'd be nice to feed this change back in the original implementation of the code. If another user of the small helpers in this file comes up, it may make sense to split this file into two header files: one that can be included, and another one which can't. If that were to happen, I'd be glad to have this patch in. We're not there yet. Bottom line, I am perfectly fine if you don't apply this. > > I'm not applying this, sorry. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] codel: cast the output of MS2TIME to codel_time_t
This will allow to pass the typecheck in the comparators: codel_time_{after,before} Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- include/net/codel.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/net/codel.h b/include/net/codel.h index 267e702..0d92e39 100644 --- a/include/net/codel.h +++ b/include/net/codel.h @@ -62,7 +62,7 @@ typedef u32 codel_time_t; typedef s32 codel_tdiff_t; #define CODEL_SHIFT 10 -#define MS2TIME(a) ((a * NSEC_PER_MSEC) >> CODEL_SHIFT) +#define MS2TIME(a) ((codel_time_t)((a * NSEC_PER_MSEC) >> CODEL_SHIFT)) static inline codel_time_t codel_get_time(void) { -- 2.5.0
[PATCH] codel: add forgotten inline to functions in header file
Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- include/net/codel.h | 44 ++-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/include/net/codel.h b/include/net/codel.h index 267e702..0775c24 100644 --- a/include/net/codel.h +++ b/include/net/codel.h @@ -92,18 +92,18 @@ struct codel_skb_cb { codel_time_t enqueue_time; }; -static struct codel_skb_cb *get_codel_cb(const struct sk_buff *skb) +static inline struct codel_skb_cb *get_codel_cb(const struct sk_buff *skb) { qdisc_cb_private_validate(skb, sizeof(struct codel_skb_cb)); return (struct codel_skb_cb *)qdisc_skb_cb(skb)->data; } -static codel_time_t codel_get_enqueue_time(const struct sk_buff *skb) +static inline codel_time_t codel_get_enqueue_time(const struct sk_buff *skb) { return get_codel_cb(skb)->enqueue_time; } -static void codel_set_enqueue_time(struct sk_buff *skb) +static inline void codel_set_enqueue_time(struct sk_buff *skb) { get_codel_cb(skb)->enqueue_time = codel_get_time(); } @@ -174,8 +174,8 @@ struct codel_stats { #define CODEL_DISABLED_THRESHOLD INT_MAX -static void codel_params_init(struct codel_params *params, - const struct Qdisc *sch) +static inline void codel_params_init(struct codel_params *params, +const struct Qdisc *sch) { params->interval = MS2TIME(100); params->target = MS2TIME(5); @@ -184,12 +184,12 @@ static void codel_params_init(struct codel_params *params, params->ecn = false; } -static void codel_vars_init(struct codel_vars *vars) +static inline void codel_vars_init(struct codel_vars *vars) { memset(vars, 0, sizeof(*vars)); } -static void codel_stats_init(struct codel_stats *stats) +static inline void codel_stats_init(struct codel_stats *stats) { stats->maxpacket = 0; } @@ -200,7 +200,7 @@ static void codel_stats_init(struct codel_stats *stats) * * Here, invsqrt is a fixed point number (< 1.0), 32bit mantissa, aka Q0.32 */ -static void codel_Newton_step(struct codel_vars *vars) +static inline void codel_Newton_step(struct codel_vars *vars) { u32 invsqrt = ((u32)vars->rec_inv_sqrt) << REC_INV_SQRT_SHIFT; u32 invsqrt2 = ((u64)invsqrt * invsqrt) >> 32; @@ -217,19 +217,19 @@ static void codel_Newton_step(struct codel_vars *vars) * We maintain in rec_inv_sqrt the reciprocal value of sqrt(count) to avoid * both sqrt() and divide operation. */ -static codel_time_t codel_control_law(codel_time_t t, - codel_time_t interval, - u32 rec_inv_sqrt) +static inline codel_time_t codel_control_law(codel_time_t t, +codel_time_t interval, +u32 rec_inv_sqrt) { return t + reciprocal_scale(interval, rec_inv_sqrt << REC_INV_SQRT_SHIFT); } -static bool codel_should_drop(const struct sk_buff *skb, - struct Qdisc *sch, - struct codel_vars *vars, - struct codel_params *params, - struct codel_stats *stats, - codel_time_t now) +static inline bool codel_should_drop(const struct sk_buff *skb, +struct Qdisc *sch, +struct codel_vars *vars, +struct codel_params *params, +struct codel_stats *stats, +codel_time_t now) { bool ok_to_drop; @@ -265,11 +265,11 @@ static bool codel_should_drop(const struct sk_buff *skb, typedef struct sk_buff * (*codel_skb_dequeue_t)(struct codel_vars *vars, struct Qdisc *sch); -static struct sk_buff *codel_dequeue(struct Qdisc *sch, -struct codel_params *params, -struct codel_vars *vars, -struct codel_stats *stats, -codel_skb_dequeue_t dequeue_func) +static inline struct sk_buff *codel_dequeue(struct Qdisc *sch, + struct codel_params *params, + struct codel_vars *vars, + struct codel_stats *stats, + codel_skb_dequeue_t dequeue_func) { struct sk_buff *skb = dequeue_func(vars, sch); codel_time_t now; -- 2.5.0
Re: [RFC v2] iwlwifi: pcie: transmit queue auto-sizing
On Mon, Feb 8, 2016 at 12:00 PM, Michal Kaziorwrote: > On 5 February 2016 at 17:47, Dave Taht wrote: >>> A bursted txop can be as big as 5-10ms. If you consider you want to >>> queue 5-10ms worth of data for *each* station at any given time you >>> obviously introduce a lot of lag. If you have 10 stations you might >>> end up with service period at 10*10ms = 100ms. This gets even worse if >>> you consider MU-MIMO because you need to do an expensive sounding >>> procedure before transmitting. So while SU aggregation can probably >>> still work reasonably well with shorter bursts (1-2ms) MU needs at >>> least 3ms to get *any* gain when compared to SU (which obviously means >>> you want more to actually make MU pay off). >> >> I am not sure where you get these numbers. Got a spreadsheet? > > Here's a nice summary on some of it: > > > http://chimera.labs.oreilly.com/books/123401739/ch03.html#figure-mac-ampdu > > Even if your single A-MPDU can be shorter than a txop you can burst a > few of them if my understanding is correct. > > The overhead associated with MU sounding is something I've been told > only. Apparently for MU to pay off you need fairly big bursts. This > implies that the more stations you have to service the less it makes > sense to attempt MU if you consider latency. > > >> Gradually reducing the maximum sized txop as a function of the number >> of stations makes sense. If you have 10 stations pending delivery and >> reduced the max txop to 1ms, you hurt bandwidth at that instant, but >> by offering more service to more stations, in less time, they will >> converge on a reasonable share of the bandwidth for each, faster[1]. >> And I'm sure that the person videoconferencing on a link like that >> would appreciate getting some service inside of a 10ms interval, >> rather than a 100ms. >> >> yes, there's overhead, and that's not the right number, which would >> vary as to g,n,ac and successors. >> >> You will also get more opportunities to use mu-mimo with shorter >> bursts extant and more stations being regularly serviced. >> >> [1] https://www.youtube.com/watch?v=Rb-UnHDw02o at about 13:50 > > This is my thinking as well, at least for most common use cases. > > If you try to optimize for throughput by introducing extra induced > latency you might end up not being able to use aggregation in practice > anyway because you won't be able to start up connections and ask for > enough data - or at least that's what my intuition tells me. > > But, like I've mentioned, there's interest in making it possible to > maximize for throughput (regardless of latency). This surely makes > sense for synthetic UDP benchmarks. But does it make sense for any > real-world application? No idea. > > >>> The rule of thumb is the >>> longer you wait the bigger capacity you can get. >> >> This is not strictly true as the "fountain" of packets is regulated by >> acks on the other side of the link, and ramp up or down as a function >> of service time and loss. > > Yes, if you consider real world cases, i.e. TCP, web traffic, etc. > then you're correct. > > >>> Apparently there's interest in maximizing throughput but it stands in >>> direct opposition of keeping the latency down so I've been thinking >>> how to satisfy both. >>> >>> The current approach ath10k is taking (patches in review [1][2]) is to >>> use mac80211 software queues for per-station queuing, exposing queue >>> state to firmware (it decides where frames should be dequeued from) >>> and making it possible to stop/wake per-station tx subqueue with fake >>> netdev queues. I'm starting to think this is not the right way though >>> because it's inherently hard to control latency and there's a huge >>> memory overhead associated with the fake netdev queues. >> >> What is this overhead? > > E.g. if you want to be able to maximize throughput for 50 MU clients > you need to be able to queue, in theory, 50*200 (roughly) frames. This > translates to both huge memory usage and latency *and* renders > (fq_)codel qdisc rather.. moot. Ok - now I understand. So yes the conclusion below (make fq_codel station aware) makes a lot sense. > > >> Applying things like codel tend to dramatically shorten the amount of >> skbs extant... > >> modern 802.11ac capable hardware has tons more >> memory... > > I don't think it does. QCA988x is able to handle "only" 1424 tx > descriptors (IOW 1500-byte long MSDUs) in the driver-to-firmware tx > queue (it's a flat queue). QCA99x0 is able to handle 2500 if asked > politely. As I said, our design is not flat which removes for the firmware to classify the packets by station to be able to build aggregates, but the downside is the number of clients you can service. > > This is still not enough to satisfy the insane "maximize the > capacity/throughput" expectations though. > > You could actually argue it's too much from the bufferbloat problem > point of view anyway and Emmanuel's patch
Re: [RFC v2] iwlwifi: pcie: transmit queue auto-sizing
On Fri, Feb 5, 2016 at 10:44 AM, Michal Kazior <michal.kaz...@tieto.com> wrote: > On 4 February 2016 at 22:14, Ben Greear <gree...@candelatech.com> wrote: >> On 02/04/2016 12:56 PM, Grumbach, Emmanuel wrote: >>> On 02/04/2016 10:46 PM, Ben Greear wrote: >>>> On 02/04/2016 12:16 PM, Emmanuel Grumbach wrote: >>>>> >>>>> As many (all?) WiFi devices, Intel WiFi devices have >>>>> transmit queues which have 256 transmit descriptors >>>>> each and each descriptor corresponds to an MPDU. >>>>> This means that when it is full, the queue contains >>>>> 256 * ~1500 bytes to be transmitted (if we don't have >>>>> A-MSDUs). The purpose of those queues is to have enough >>>>> packets to be ready for transmission so that when the device >>>>> gets an opportunity to transmit (TxOP), it can take as many >>>>> packets as the spec allows and aggregate them into one >>>>> A-MPDU or even several A-MPDUs if we are using bursts. >>>> >>>> I guess this is only really usable if you have exactly one >>>> peer connected (ie, in station mode)? >>>> >>>> Otherwise, you could have one slow peer and one fast one, >>>> and then I suspect this would not work so well? >>> >>> >>> Yes. I guess this one (big) limitation. I guess that what would happen >>> in this case is that the the latency would constantly jitter. But I also > > Hmm.. You'd probably need to track per-station packet sojourn time as > well and make it possible to stop/wake queues per station. Clearly here comes the difference between the devices you work on, and the devices I work on. Intel devices are more client oriented. Our AP mode doesn't handle many clients etc.. > > >>> noticed that I could reduce the transmit queue to 130 descriptors >>> (instead of 256) and still reach maximal throughput because we can >>> refill the queues quickly enough. >>> In iwlwifi, we have plans to have one queue for each peer. >>> This is under development. Not sure when it'll be ready. It also requires >>> firmware change obviously. >> >> Per-peer queues will probably be nice, especially if we can keep the >> buffer bloat manageable. > > Per-station queues sound tricky if you consider bufferbloat. iwlwifi's A-MPDU model is different from athXk's I guess. In iwlwifi (the Intel devices really since it is mostly firmware) the firmware will try to use a TxOP whenever there is data in the queue. Then, once we have a chance to transmit, we will look at what we have in the queue and send the biggest aggregates we can (and bursts if allowed). But we will not defer packets' transmission to get bigger aggregates. Testing shows that under ideal conditions, we can have enough packets in the queue to build big aggregates without creating artificial latency. > > To maximize use of airtime (i.e. txop) you need to send big > aggregates. Since aggregates are per station-tid to maximize > multi-station performance (in AP mode) you'll need to queue a lot of > frames, per each station, depending on the chosen tx rate. Sure. > > A bursted txop can be as big as 5-10ms. If you consider you want to > queue 5-10ms worth of data for *each* station at any given time you > obviously introduce a lot of lag. If you have 10 stations you might > end up with service period at 10*10ms = 100ms. This gets even worse if > you consider MU-MIMO because you need to do an expensive sounding > procedure before transmitting. So while SU aggregation can probably > still work reasonably well with shorter bursts (1-2ms) MU needs at > least 3ms to get *any* gain when compared to SU (which obviously means > you want more to actually make MU pay off). The rule of thumb is the > longer you wait the bigger capacity you can get. I am not an expert about MU-MIMO, but I'll believe you :) We can chat about this in a few days :) Queueing frames under good conditions is fine in a way that it is a "Good queue" (hey Stephen), you need those queues to maximize the throughput because of the bursty nature of WiFi and the queue "moves" quickly since you have high throughput so that the sojourn time in your queue is relatively small, but when the link conditions gets less good you need to reduce the queue length because it doesn't really help you anyway. This is what my patch aims at fixing. All this is true when you have a small number of stations... I understand from your comment that even in ideal conditions you still need to create a lot of latency to gain TPT. Then there isn't much we can do without impacting either TPT or latency. Then, there is a real tradeoff. I g
[RFC] iwlwifi: pcie: transmit queue auto-sizing
As many (all?) WiFi devices, Intel WiFi devices have transmit queues which have 256 transmit descriptors each and each descriptor corresponds to an MPDU. This means that when it is full, the queue contains 256 * ~1500 bytes to be transmitted (if we don't have A-MSDUs). The purpose of those queues is to have enough packets to be ready for transmission so that when the device gets an opportunity to transmit (TxOP), it can take as many packets as the spec allows and aggregate them into one A-MPDU or even several A-MPDUs if we are using bursts. The problem is that the packets that are in these queues are already out of control of the Qdisc and can stay in those queues for a fairly long time when the link condition is not good. This leads to the well known bufferbloat problem. This patch adds a way to tune the size of the transmit queue so that it won't cause excessive latency. When the link condition is good, the packets will flow smoothly and the transmit queue will grow quickly allowing A-MPDUs and maximal throughput. When the link is not optimal, we will have retransmissions, lower transmit rates or signal detection (CCA) which will cause a delay in the packet transmission. The driver will sense this higher latency and will reduce the size of the transmit queue. This means that the packets that continue to arrive will pile up in the Qdisc rather than in the device queues. The major advantage of this approach is that codel can now do its work. The algorithm is really (too?) simple: every 5 seconds, starts from a short queue again. If a packet has been in the queue for less than 10ms, allow 10 more MPDUs in. If a packet has been in the queue for more than 20ms, reduce by 10 the size of the transmit queue. The implementation is really naive and way too simple: * reading jiffies for every Tx / Tx status is not a good idead. * jiffies are not fine-grained enough on all platforms * the constants chosen are really arbitrary and can't be tuned. * This may be implemented in mac80211 probably and help other drivers. * etc... But already this gives nice results. I ran a very simple experiment: I put the device in a controlled environment and ran traffic while running default sized ping in the background. In this configuration, our device quickly raises its transmission rate to the best rate. Then, I force the device to use the lowest rate (6Mbps). Of course, the throughput collapses, but the ping RTT shoots up. Using codel helps, but the latency is still high. Codel with this patch gives much better results: pfifo_fast: rtt min/avg/max/mdev = 1932.616/2393.284/2833.407/315.941 ms, pipe 3, ipg/ewma 2215.707/2446.884 ms fq_codel + Tx queue auto-sizing: rtt min/avg/max/mdev = 13.541/32.396/54.791/9.610 ms, ipg/ewma 200.685/32.202 ms fq_codel without Tx queue auto-sizing: rtt min/avg/max/mdev = 140.821/257.303/331.889/31.074 ms, pipe 2, ipg/ewma 258.147/252.847 ms Clearly, there is more work to do to be able to merge this, but it seems that the wireless problems mentioned in https://lwn.net/Articles/616241/ may have a solution. Cc: Stephen Hemminger <step...@networkplumber.org> Cc: Dave Taht <dave.t...@bufferbloat.net> Cc: Jonathan Corbet <cor...@lwn.net> Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 6 drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 32 -- 2 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index 2f95916..d83eb56 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -192,6 +192,11 @@ struct iwl_cmd_meta { u32 flags; }; +struct iwl_txq_auto_size { + int min_space; + unsigned long reset_ts; +}; + /* * Generic queue structure * @@ -293,6 +298,7 @@ struct iwl_txq { bool block; unsigned long wd_timeout; struct sk_buff_head overflow_q; + struct iwl_txq_auto_size auto_sz; }; static inline dma_addr_t diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c index 837a7d5..4d1dee6 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c @@ -572,6 +572,8 @@ static int iwl_pcie_txq_init(struct iwl_trans *trans, struct iwl_txq *txq, spin_lock_init(>lock); __skb_queue_head_init(>overflow_q); + txq->auto_sz.min_space = 240; + txq->auto_sz.reset_ts = jiffies; /* * Tell nic where to find circular buffer of Tx Frame Descriptors for @@ -1043,10 +1045,14 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, int txq_id, int ssn, q->read_ptr != tfd_num; q->read_ptr = iwl_queue_inc_wrap(q->read_ptr)) { struct sk_
[RFC v2] iwlwifi: pcie: transmit queue auto-sizing
As many (all?) WiFi devices, Intel WiFi devices have transmit queues which have 256 transmit descriptors each and each descriptor corresponds to an MPDU. This means that when it is full, the queue contains 256 * ~1500 bytes to be transmitted (if we don't have A-MSDUs). The purpose of those queues is to have enough packets to be ready for transmission so that when the device gets an opportunity to transmit (TxOP), it can take as many packets as the spec allows and aggregate them into one A-MPDU or even several A-MPDUs if we are using bursts. The problem is that the packets that are in these queues are already out of control of the Qdisc and can stay in those queues for a fairly long time when the link condition is not good. This leads to the well known bufferbloat problem. This patch adds a way to tune the size of the transmit queue so that it won't cause excessive latency. When the link condition is good, the packets will flow smoothly and the transmit queue will grow quickly allowing A-MPDUs and maximal throughput. When the link is not optimal, we will have retransmissions, lower transmit rates or signal detection (CCA) which will cause a delay in the packet transmission. The driver will sense this higher latency and will reduce the size of the transmit queue. This means that the packets that continue to arrive will pile up in the Qdisc rather than in the device queues. The major advantage of this approach is that codel can now do its work. The algorithm is really (too?) simple: every 5 seconds, starts from a short queue again. If a packet has been in the queue for less than 10ms, allow 10 more MPDUs in. If a packet has been in the queue for more than 20ms, reduce by 10 the size of the transmit queue. The implementation is really naive and way too simple: * reading jiffies for every Tx / Tx status is not a good idead. * jiffies are not fine-grained enough on all platforms * the constants chosen are really arbitrary and can't be tuned. * This may be implemented in mac80211 probably and help other drivers. * etc... But already this gives nice results. I ran a very simple experiment: I put the device in a controlled environment and ran traffic while running default sized ping in the background. In this configuration, our device quickly raises its transmission rate to the best rate. Then, I force the device to use the lowest rate (6Mbps). Of course, the throughput collapses, but the ping RTT shoots up. Using codel helps, but the latency is still high. Codel with this patch gives much better results: pfifo_fast: rtt min/avg/max/mdev = 1932.616/2393.284/2833.407/315.941 ms, pipe 3, ipg/ewma 2215.707/2446.884 ms fq_codel + Tx queue auto-sizing: rtt min/avg/max/mdev = 13.541/32.396/54.791/9.610 ms, ipg/ewma 200.685/32.202 ms fq_codel without Tx queue auto-sizing: rtt min/avg/max/mdev = 140.821/257.303/331.889/31.074 ms, pipe 2, ipg/ewma 258.147/252.847 ms Clearly, there is more work to do to be able to merge this, but it seems that the wireless problems mentioned in https://lwn.net/Articles/616241/ may have a solution. Cc: Stephen Hemminger <step...@networkplumber.org> Cc: Dave Taht <dave.t...@gmail.com> Cc: Jonathan Corbet <cor...@lwn.net> Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- Fix Dave's email address --- drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 6 drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 32 -- 2 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index 2f95916..d83eb56 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -192,6 +192,11 @@ struct iwl_cmd_meta { u32 flags; }; +struct iwl_txq_auto_size { + int min_space; + unsigned long reset_ts; +}; + /* * Generic queue structure * @@ -293,6 +298,7 @@ struct iwl_txq { bool block; unsigned long wd_timeout; struct sk_buff_head overflow_q; + struct iwl_txq_auto_size auto_sz; }; static inline dma_addr_t diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c index 837a7d5..4d1dee6 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c @@ -572,6 +572,8 @@ static int iwl_pcie_txq_init(struct iwl_trans *trans, struct iwl_txq *txq, spin_lock_init(>lock); __skb_queue_head_init(>overflow_q); + txq->auto_sz.min_space = 240; + txq->auto_sz.reset_ts = jiffies; /* * Tell nic where to find circular buffer of Tx Frame Descriptors for @@ -1043,10 +1045,14 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, int txq_id, int ssn, q->read_ptr != tfd_num; q->read_ptr = iwl_queue_inc_wrap(q->read_ptr)) {
[RFC RESEND] iwlwifi: pcie: transmit queue auto-sizing
As many (all?) WiFi devices, Intel WiFi devices have transmit queues which have 256 transmit descriptors each and each descriptor corresponds to an MPDU. This means that when it is full, the queue contains 256 * ~1500 bytes to be transmitted (if we don't have A-MSDUs). The purpose of those queues is to have enough packets to be ready for transmission so that when the device gets an opportunity to transmit (TxOP), it can take as many packets as the spec allows and aggregate them into one A-MPDU or even several A-MPDUs if we are using bursts. The problem is that the packets that are in these queues are already out of control of the Qdisc and can stay in those queues for a fairly long time when the link condition is not good. This leads to the well known bufferbloat problem. This patch adds a way to tune the size of the transmit queue so that it won't cause excessive latency. When the link condition is good, the packets will flow smoothly and the transmit queue will grow quickly allowing A-MPDUs and maximal throughput. When the link is not optimal, we will have retransmissions, lower transmit rates or signal detection (CCA) which will cause a delay in the packet transmission. The driver will sense this higher latency and will reduce the size of the transmit queue. This means that the packets that continue to arrive will pile up in the Qdisc rather than in the device queues. The major advantage of this approach is that codel can now do its work. The algorithm is really (too?) simple: every 5 seconds, starts from a short queue again. If a packet has been in the queue for less than 10ms, allow 10 more MPDUs in. If a packet has been in the queue for more than 20ms, reduce by 10 the size of the transmit queue. The implementation is really naive and way too simple: * reading jiffies for every Tx / Tx status is not a good idead. * jiffies are not fine-grained enough on all platforms * the constants chosen are really arbitrary and can't be tuned. * This may be implemented in mac80211 probably and help other drivers. * etc... But already this gives nice results. I ran a very simple experiment: I put the device in a controlled environment and ran traffic while running default sized ping in the background. In this configuration, our device quickly raises its transmission rate to the best rate. Then, I force the device to use the lowest rate (6Mbps). Of course, the throughput collapses, but the ping RTT shoots up. Using codel helps, but the latency is still high. Codel with this patch gives much better results: pfifo_fast: rtt min/avg/max/mdev = 1932.616/2393.284/2833.407/315.941 ms, pipe 3, ipg/ewma 2215.707/2446.884 ms fq_codel + Tx queue auto-sizing: rtt min/avg/max/mdev = 13.541/32.396/54.791/9.610 ms, ipg/ewma 200.685/32.202 ms fq_codel without Tx queue auto-sizing: rtt min/avg/max/mdev = 140.821/257.303/331.889/31.074 ms, pipe 2, ipg/ewma 258.147/252.847 ms Clearly, there is more work to do to be able to merge this, but it seems that the wireless problems mentioned in https://lwn.net/Articles/616241/ may have a solution. Cc: Stephen Hemminger <step...@networkplumber.org> Cc: Dave Taht <dave.t...@gmail.com> Cc: Jonathan Corbet <cor...@lwn.net> Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- Fix Dave's email address --- drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 6 drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 32 -- 2 files changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index 2f95916..d83eb56 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -192,6 +192,11 @@ struct iwl_cmd_meta { u32 flags; }; +struct iwl_txq_auto_size { + int min_space; + unsigned long reset_ts; +}; + /* * Generic queue structure * @@ -293,6 +298,7 @@ struct iwl_txq { bool block; unsigned long wd_timeout; struct sk_buff_head overflow_q; + struct iwl_txq_auto_size auto_sz; }; static inline dma_addr_t diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c index 837a7d5..4d1dee6 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c @@ -572,6 +572,8 @@ static int iwl_pcie_txq_init(struct iwl_trans *trans, struct iwl_txq *txq, spin_lock_init(>lock); __skb_queue_head_init(>overflow_q); + txq->auto_sz.min_space = 240; + txq->auto_sz.reset_ts = jiffies; /* * Tell nic where to find circular buffer of Tx Frame Descriptors for @@ -1043,10 +1045,14 @@ void iwl_trans_pcie_reclaim(struct iwl_trans *trans, int txq_id, int ssn, q->read_ptr != tfd_num; q->read_ptr = iwl_queue_inc_wrap(q->read_ptr)) {
Re: [PATCH v2 10/27] iwlwifi: move under intel vendor directory
Hi Kalle, On Wed, Nov 18, 2015 at 4:45 PM, Kalle Valowrote: > > Part of reorganising wireless drivers directory and Kconfig. > I am sorry but I'd prefer to wait with this. We have a big machinery of scripts / builds and alike that will break. I did give a heads up a while ago to the people in charge of these scripts that this change is on the way, but they are not done yet. I will get back to them and ask them when we can be ready for such a change and then it'll come from my tree. I promise to try to make it happen ASAP. The above is relevant for iwlwifi only. As far as I am concerned, you can move the iwlwifi and ipw. Thank you. > Signed-off-by: Kalle Valo > --- > MAINTAINERS|2 +- -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3] net: tso: add support for IPv6
Adding IPv6 for the TSO helper API is trivial: * Don't play with the id (which doesn't exist in IPv6) * Correctly update the payload_len (don't include the length of the IP header itself) Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- v3: use vlan_get_protocol and call it once in tso_start store the result in tso_t --- include/net/tso.h | 1 + net/core/tso.c| 18 +- 2 files changed, 14 insertions(+), 5 deletions(-) diff --git a/include/net/tso.h b/include/net/tso.h index 47e5444..b7be852 100644 --- a/include/net/tso.h +++ b/include/net/tso.h @@ -8,6 +8,7 @@ struct tso_t { void *data; size_t size; u16 ip_id; + bool ipv6; u32 tcp_seq; }; diff --git a/net/core/tso.c b/net/core/tso.c index 630b30b..5dca7ce 100644 --- a/net/core/tso.c +++ b/net/core/tso.c @@ -1,4 +1,5 @@ #include +#include #include #include #include @@ -14,18 +15,24 @@ EXPORT_SYMBOL(tso_count_descs); void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso, int size, bool is_last) { - struct iphdr *iph; struct tcphdr *tcph; int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb); int mac_hdr_len = skb_network_offset(skb); memcpy(hdr, skb->data, hdr_len); - iph = (struct iphdr *)(hdr + mac_hdr_len); - iph->id = htons(tso->ip_id); - iph->tot_len = htons(size + hdr_len - mac_hdr_len); + if (!tso->ipv6) { + struct iphdr *iph = (void *)(hdr + mac_hdr_len); + + iph->id = htons(tso->ip_id); + iph->tot_len = htons(size + hdr_len - mac_hdr_len); + tso->ip_id++; + } else { + struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len); + + iph->payload_len = htons(size + tcp_hdrlen(skb)); + } tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb)); put_unaligned_be32(tso->tcp_seq, >seq); - tso->ip_id++; if (!is_last) { /* Clear all special flags for not last packet */ @@ -61,6 +68,7 @@ void tso_start(struct sk_buff *skb, struct tso_t *tso) tso->ip_id = ntohs(ip_hdr(skb)->id); tso->tcp_seq = ntohl(tcp_hdr(skb)->seq); tso->next_frag_idx = 0; + tso->ipv6 = vlan_get_protocol(skb) == htons(ETH_P_IPV6); /* Build first data */ tso->size = skb_headlen(skb) - hdr_len; -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core
Hi Eric, > > When the op_mode sends an skb whose payload is bigger than > MSS, PCIe will create an A-MSDU out of it. PCIe assumes > that the skb that is coming from the op_mode can fit in one > A-MSDU. It is the op_mode's responsibility to make sure > that this guarantee holds. > > Additional headers need to be built for the subframes. > The TSO core code takes care of the IP / TCP headers and > the driver takes care of the 802.11 subframe headers. > > These headers are stored on a per-cpu page that is re-used > for all the packets handled on that same CPU. Each skb > holds a reference to that page and releases the page when > it is reclaimed. When the page gets full, it is released > and a new one is allocated. > > Since any SKB that doesn't go through the fast-xmit path > of mac80211 will be segmented, we can assume here that the > packet is not WEP / TKIP and has a proper SNAP header. > > Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> Assuming your review queue works as a FIFO and you reviewed the TSO helper patch, I can assume you ACK this one? :) Or at least, don't NACK it :) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net: tso: add support for IPv6
Adding IPv6 for the TSO helper API is trivial: * Don't play with the id (which doesn't exist in IPv6) * Correctly update the payload_len (don't include the length of the IP header itself) Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- net/core/tso.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/net/core/tso.c b/net/core/tso.c index 630b30b..ece4605 100644 --- a/net/core/tso.c +++ b/net/core/tso.c @@ -14,18 +14,25 @@ EXPORT_SYMBOL(tso_count_descs); void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso, int size, bool is_last) { - struct iphdr *iph; struct tcphdr *tcph; int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb); int mac_hdr_len = skb_network_offset(skb); memcpy(hdr, skb->data, hdr_len); - iph = (struct iphdr *)(hdr + mac_hdr_len); - iph->id = htons(tso->ip_id); - iph->tot_len = htons(size + hdr_len - mac_hdr_len); + if (skb->protocol == htons(ETH_P_IP)) { + struct iphdr *iph = (void *)(hdr + mac_hdr_len); + + iph->id = htons(tso->ip_id); + iph->tot_len = htons(size + hdr_len - mac_hdr_len); + tso->ip_id++; + } + if (skb->protocol == htons(ETH_P_IPV6)) { + struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len); + + iph->payload_len = htons(size + tcp_hdrlen(skb)); + } tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb)); put_unaligned_be32(tso->tcp_seq, >seq); - tso->ip_id++; if (!is_last) { /* Clear all special flags for not last packet */ -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] net: tso: add support for IPv6
Adding IPv6 for the TSO helper API is trivial: * Don't play with the id (which doesn't exist in IPv6) * Correctly update the payload_len (don't include the length of the IP header itself) Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- v2: add else if NOTE: instead of checking the skb->protocol, I can add a bool in tso_t. This might be better in terms of cache handling. Let me know if you prefer that option --- net/core/tso.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/net/core/tso.c b/net/core/tso.c index 630b30b..bf71bdc5 100644 --- a/net/core/tso.c +++ b/net/core/tso.c @@ -14,18 +14,24 @@ EXPORT_SYMBOL(tso_count_descs); void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso, int size, bool is_last) { - struct iphdr *iph; struct tcphdr *tcph; int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb); int mac_hdr_len = skb_network_offset(skb); memcpy(hdr, skb->data, hdr_len); - iph = (struct iphdr *)(hdr + mac_hdr_len); - iph->id = htons(tso->ip_id); - iph->tot_len = htons(size + hdr_len - mac_hdr_len); + if (skb->protocol == htons(ETH_P_IP)) { + struct iphdr *iph = (void *)(hdr + mac_hdr_len); + + iph->id = htons(tso->ip_id); + iph->tot_len = htons(size + hdr_len - mac_hdr_len); + tso->ip_id++; + } else if (skb->protocol == htons(ETH_P_IPV6)) { + struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len); + + iph->payload_len = htons(size + tcp_hdrlen(skb)); + } tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb)); put_unaligned_be32(tso->tcp_seq, >seq); - tso->ip_id++; if (!is_last) { /* Clear all special flags for not last packet */ -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v4 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core
When the op_mode sends an skb whose payload is bigger than MSS, PCIe will create an A-MSDU out of it. PCIe assumes that the skb that is coming from the op_mode can fit in one A-MSDU. It is the op_mode's responsibility to make sure that this guarantee holds. Additional headers need to be built for the subframes. The TSO core code takes care of the IP / TCP headers and the driver takes care of the 802.11 subframe headers. These headers are stored on a per-cpu page that is re-used for all the packets handled on that same CPU. Each skb holds a reference to that page and releases the page when it is reclaimed. When the page gets full, it is released and a new one is allocated. Since any SKB that doesn't go through the fast-xmit path of mac80211 will be segmented, we can assume here that the packet is not WEP / TKIP and has a proper SNAP header. Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- v4: * Create a helper to free the skb's ref to the hdr page * add comment on that clarifies that the headers fit in a page * return -ENOMEM if the per_cpu variable allocation failed * don't zero the per_cpu struct after allocation --- drivers/net/wireless/iwlwifi/iwl-devtrace-data.h | 16 ++ drivers/net/wireless/iwlwifi/iwl-trans.h | 6 +- drivers/net/wireless/iwlwifi/pcie/internal.h | 7 + drivers/net/wireless/iwlwifi/pcie/trans.c| 15 ++ drivers/net/wireless/iwlwifi/pcie/tx.c | 287 ++- 5 files changed, 326 insertions(+), 5 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h index 71a78ce..59d9edf 100644 --- a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h +++ b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h @@ -51,6 +51,22 @@ TRACE_EVENT(iwlwifi_dev_tx_data, TP_printk("[%s] TX frame data", __get_str(dev)) ); +TRACE_EVENT(iwlwifi_dev_tx_tso_chunk, + TP_PROTO(const struct device *dev, +u8 *data_src, size_t data_len), + TP_ARGS(dev, data_src, data_len), + TP_STRUCT__entry( + DEV_ENTRY + + __dynamic_array(u8, data, data_len) + ), + TP_fast_assign( + DEV_ASSIGN; + memcpy(__get_dynamic_array(data), data_src, data_len); + ), + TP_printk("[%s] TX frame data", __get_str(dev)) +); + TRACE_EVENT(iwlwifi_dev_rx_data, TP_PROTO(const struct device *dev, const struct iwl_trans *trans, diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h b/drivers/net/wireless/iwlwifi/iwl-trans.h index 0ceff69..6919243 100644 --- a/drivers/net/wireless/iwlwifi/iwl-trans.h +++ b/drivers/net/wireless/iwlwifi/iwl-trans.h @@ -379,7 +379,11 @@ static inline void iwl_free_rxb(struct iwl_rx_cmd_buffer *r) } #define MAX_NO_RECLAIM_CMDS6 - +/* + * The first entry in driver_data array in ieee80211_tx_info + * that can be used by the transport. + */ +#define IWL_FIRST_DRIVER_DATA 2 #define IWL_MASK(lo, hi) ((1 << (hi)) | ((1 << (hi)) - (1 << (lo /* diff --git a/drivers/net/wireless/iwlwifi/pcie/internal.h b/drivers/net/wireless/iwlwifi/pcie/internal.h index be168d1..7da5643 100644 --- a/drivers/net/wireless/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/iwlwifi/pcie/internal.h @@ -295,6 +295,11 @@ iwl_pcie_get_scratchbuf_dma(struct iwl_txq *txq, int idx) sizeof(struct iwl_pcie_txq_scratch_buf) * idx; } +struct iwl_tso_hdr_page { + struct page *page; + u8 *pos; +}; + /** * struct iwl_trans_pcie - PCIe transport specific data * @rxq: all the RX queue data @@ -332,6 +337,8 @@ struct iwl_trans_pcie { struct net_device napi_dev; struct napi_struct napi; + struct __percpu iwl_tso_hdr_page *tso_hdr_page; + /* INT ICT Table */ __le32 *ict_tbl; dma_addr_t ict_tbl_dma; diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c b/drivers/net/wireless/iwlwifi/pcie/trans.c index a275318..f03c3e1 100644 --- a/drivers/net/wireless/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/iwlwifi/pcie/trans.c @@ -1601,6 +1601,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans *trans, void iwl_trans_pcie_free(struct iwl_trans *trans) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + int i; #ifdef CPTCFG_IWLWIFI_PLATFORM_DATA /* Make sure the device is on before calling pci functions again. @@ -1631,6 +1632,15 @@ void iwl_trans_pcie_free(struct iwl_trans *trans) iwl_pcie_free_fw_monitor(trans); + for_each_possible_cpu(i) { + struct iwl_tso_hdr_page *p = + per_cpu_ptr(trans_pcie->tso_hdr_page, i); + + if (p->page) + __free_pages(p->page, 0); + } + + free_percpu(trans_pcie->tso_hdr_page); iwl_trans_free(trans); } @@ -2839,6 +2849,11 @@ stru
[RFC v4 2/2] iwlwifi: mvm: send large SKBs to the transport
Now that PCIe knows how to create A-MSDUs, use this capability and prepare SKBs that are large enough to build an A-MSDU. Advertise TSO support towards the network stack and segment the packet with gso_size set to be the maximal A-MSDU length (after having taken the headers to be added into account) to make sure that the skb that is passed down to the transport are not longer than the maximal A-MSDU allowed. Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- v4: Don't use the original skb after it has been consumed --- drivers/net/wireless/iwlwifi/mvm/tx.c | 143 -- 1 file changed, 135 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index be9d7e4..d325de4 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -65,6 +65,7 @@ #include #include #include +#include #include "iwl-trans.h" #include "iwl-eeprom-parse.h" @@ -181,7 +182,8 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, tx_cmd->tx_flags = cpu_to_le32(tx_flags); /* Total # bytes to be transmitted */ - tx_cmd->len = cpu_to_le16((u16)skb->len); + tx_cmd->len = cpu_to_le16((u16)skb->len + + (uintptr_t)info->driver_data[0]); tx_cmd->next_frame_len = 0; tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE); tx_cmd->sta_id = sta_id; @@ -356,7 +358,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, memset(>status, 0, sizeof(info->status)); - info->driver_data[0] = NULL; info->driver_data[1] = dev_cmd; return dev_cmd; @@ -379,6 +380,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) info->hw_queue != info->control.vif->cab_queue))) return -1; + /* This holds the amsdu headers length */ + info->driver_data[0] = (void *)(uintptr_t)0; + /* * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel @@ -435,29 +439,148 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) return 0; } -static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, +static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb, struct ieee80211_sta *sta, struct sk_buff_head *mpdus_skb) { + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); + struct ieee80211_hdr *hdr = (void *)skb->data; + unsigned int mss = skb_shinfo(skb)->gso_size; struct sk_buff *tmp, *next; - char cb[sizeof(skb_gso->cb)]; + char cb[sizeof(skb->cb)]; + unsigned int num_subframes, tcp_payload_len, subf_len; + u16 amsdu_add, snap_ip_tcp, pad, i = 0; + /* Not used in IPv6 */ + u16 ip_base_id = ntohs(ip_hdr(skb)->id); + + snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) + + tcp_hdrlen(skb); + + if (!sta->max_amsdu_len || + !ieee80211_is_data_qos(hdr->frame_control)) { + num_subframes = 1; + /* TODO: for the compiler... */ + pad = 0; + goto segment; + } + + /* +* Limit A-MSDU in A-MPDU to 4095 bytes when VHT is not +* supported. This is a spec requirement (IEEE 802.11-2015 +* section 8.7.3 NOTE 3). +*/ + + /* TODO: for now, disable A-MSDU inside AMPDU */ + if (info->flags & IEEE80211_TX_CTL_AMPDU) { + num_subframes = 1; + /* TODO: for the compiler... */ + pad = 0; + goto segment; + } + + /* Sub frame header + SNAP + IP header + TCP header + MSS */ + subf_len = sizeof(struct ethhdr) + snap_ip_tcp + mss; + pad = (4 - subf_len) & 0x3; + + /* +* If we have N subframes in the A-MSDU, then the A-MSDU's size is +* N * subf_len + (N - 1) * pad. +*/ + num_subframes = (sta->max_amsdu_len + pad) / (subf_len + pad); + if (num_subframes > 1) { + u8 *qc = ieee80211_get_qos_ctl((void *)skb->data); + + *qc |= IEEE80211_QOS_CTL_A_MSDU_PRESENT; + } + + tcp_payload_len = skb_tail_pointer(skb) - skb_transport_header(skb) - + tcp_hdrlen(skb) + skb->data_len; - memcpy(cb, skb_gso->cb, sizeof(cb)); - next = skb_gso_segment(skb_gso, 0); - if (IS_ERR(next)) + /* +* Make sure we have enough TBs for the A-MSDU: +* 2 for each subframe +* 1 more for each fragment +* 1 more for the potential data in the header +*/
[PATCH 2/2] iwlwifi: mvm: send large SKBs to the transport
Now that PCIe knows how to create A-MSDUs, use this capability and prepare SKBs that are large enough to build an A-MSDU. Advertise TSO support towards the network stack and segment the packet with gso_size set to be the maximal A-MSDU length (after having taken the headers to be added into account) to make sure that the skb that is passed down to the transport are not longer than the maximal A-MSDU allowed. Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- No change since RFCv4 --- drivers/net/wireless/iwlwifi/mvm/tx.c | 143 -- 1 file changed, 135 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index be9d7e4..d325de4 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -65,6 +65,7 @@ #include #include #include +#include #include "iwl-trans.h" #include "iwl-eeprom-parse.h" @@ -181,7 +182,8 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, tx_cmd->tx_flags = cpu_to_le32(tx_flags); /* Total # bytes to be transmitted */ - tx_cmd->len = cpu_to_le16((u16)skb->len); + tx_cmd->len = cpu_to_le16((u16)skb->len + + (uintptr_t)info->driver_data[0]); tx_cmd->next_frame_len = 0; tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE); tx_cmd->sta_id = sta_id; @@ -356,7 +358,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, memset(>status, 0, sizeof(info->status)); - info->driver_data[0] = NULL; info->driver_data[1] = dev_cmd; return dev_cmd; @@ -379,6 +380,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) info->hw_queue != info->control.vif->cab_queue))) return -1; + /* This holds the amsdu headers length */ + info->driver_data[0] = (void *)(uintptr_t)0; + /* * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel @@ -435,29 +439,148 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) return 0; } -static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, +static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb, struct ieee80211_sta *sta, struct sk_buff_head *mpdus_skb) { + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); + struct ieee80211_hdr *hdr = (void *)skb->data; + unsigned int mss = skb_shinfo(skb)->gso_size; struct sk_buff *tmp, *next; - char cb[sizeof(skb_gso->cb)]; + char cb[sizeof(skb->cb)]; + unsigned int num_subframes, tcp_payload_len, subf_len; + u16 amsdu_add, snap_ip_tcp, pad, i = 0; + /* Not used in IPv6 */ + u16 ip_base_id = ntohs(ip_hdr(skb)->id); + + snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) + + tcp_hdrlen(skb); + + if (!sta->max_amsdu_len || + !ieee80211_is_data_qos(hdr->frame_control)) { + num_subframes = 1; + /* TODO: for the compiler... */ + pad = 0; + goto segment; + } + + /* +* Limit A-MSDU in A-MPDU to 4095 bytes when VHT is not +* supported. This is a spec requirement (IEEE 802.11-2015 +* section 8.7.3 NOTE 3). +*/ + + /* TODO: for now, disable A-MSDU inside AMPDU */ + if (info->flags & IEEE80211_TX_CTL_AMPDU) { + num_subframes = 1; + /* TODO: for the compiler... */ + pad = 0; + goto segment; + } + + /* Sub frame header + SNAP + IP header + TCP header + MSS */ + subf_len = sizeof(struct ethhdr) + snap_ip_tcp + mss; + pad = (4 - subf_len) & 0x3; + + /* +* If we have N subframes in the A-MSDU, then the A-MSDU's size is +* N * subf_len + (N - 1) * pad. +*/ + num_subframes = (sta->max_amsdu_len + pad) / (subf_len + pad); + if (num_subframes > 1) { + u8 *qc = ieee80211_get_qos_ctl((void *)skb->data); + + *qc |= IEEE80211_QOS_CTL_A_MSDU_PRESENT; + } + + tcp_payload_len = skb_tail_pointer(skb) - skb_transport_header(skb) - + tcp_hdrlen(skb) + skb->data_len; - memcpy(cb, skb_gso->cb, sizeof(cb)); - next = skb_gso_segment(skb_gso, 0); - if (IS_ERR(next)) + /* +* Make sure we have enough TBs for the A-MSDU: +* 2 for each subframe +* 1 more for each fragment +* 1 more for the potential data in the header +*/ + num_subframes = + min_t(unsigned
[PATCH 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core
When the op_mode sends an skb whose payload is bigger than MSS, PCIe will create an A-MSDU out of it. PCIe assumes that the skb that is coming from the op_mode can fit in one A-MSDU. It is the op_mode's responsibility to make sure that this guarantee holds. Additional headers need to be built for the subframes. The TSO core code takes care of the IP / TCP headers and the driver takes care of the 802.11 subframe headers. These headers are stored on a per-cpu page that is re-used for all the packets handled on that same CPU. Each skb holds a reference to that page and releases the page when it is reclaimed. When the page gets full, it is released and a new one is allocated. Since any SKB that doesn't go through the fast-xmit path of mac80211 will be segmented, we can assume here that the packet is not WEP / TKIP and has a proper SNAP header. Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- Changes since RFCv4, addressing Eliad's comments: * free trans_pcie->tso_hdr_page in the failure path of iwl_trans_pcie_alloc * s/alloc_pages(gfp, 0)/alloc_page(gfp) * make sure that get_page_hdr's allocation didn't fail * use data_left instead of mss for the subframe header's length field * free csum_skb in the failure paths * identation --- drivers/net/wireless/iwlwifi/iwl-devtrace-data.h | 16 ++ drivers/net/wireless/iwlwifi/iwl-trans.h | 6 +- drivers/net/wireless/iwlwifi/pcie/internal.h | 7 + drivers/net/wireless/iwlwifi/pcie/trans.c| 16 ++ drivers/net/wireless/iwlwifi/pcie/tx.c | 295 ++- 5 files changed, 334 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h index 71a78ce..59d9edf 100644 --- a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h +++ b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h @@ -51,6 +51,22 @@ TRACE_EVENT(iwlwifi_dev_tx_data, TP_printk("[%s] TX frame data", __get_str(dev)) ); +TRACE_EVENT(iwlwifi_dev_tx_tso_chunk, + TP_PROTO(const struct device *dev, +u8 *data_src, size_t data_len), + TP_ARGS(dev, data_src, data_len), + TP_STRUCT__entry( + DEV_ENTRY + + __dynamic_array(u8, data, data_len) + ), + TP_fast_assign( + DEV_ASSIGN; + memcpy(__get_dynamic_array(data), data_src, data_len); + ), + TP_printk("[%s] TX frame data", __get_str(dev)) +); + TRACE_EVENT(iwlwifi_dev_rx_data, TP_PROTO(const struct device *dev, const struct iwl_trans *trans, diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h b/drivers/net/wireless/iwlwifi/iwl-trans.h index 0ceff69..6919243 100644 --- a/drivers/net/wireless/iwlwifi/iwl-trans.h +++ b/drivers/net/wireless/iwlwifi/iwl-trans.h @@ -379,7 +379,11 @@ static inline void iwl_free_rxb(struct iwl_rx_cmd_buffer *r) } #define MAX_NO_RECLAIM_CMDS6 - +/* + * The first entry in driver_data array in ieee80211_tx_info + * that can be used by the transport. + */ +#define IWL_FIRST_DRIVER_DATA 2 #define IWL_MASK(lo, hi) ((1 << (hi)) | ((1 << (hi)) - (1 << (lo /* diff --git a/drivers/net/wireless/iwlwifi/pcie/internal.h b/drivers/net/wireless/iwlwifi/pcie/internal.h index be168d1..7da5643 100644 --- a/drivers/net/wireless/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/iwlwifi/pcie/internal.h @@ -295,6 +295,11 @@ iwl_pcie_get_scratchbuf_dma(struct iwl_txq *txq, int idx) sizeof(struct iwl_pcie_txq_scratch_buf) * idx; } +struct iwl_tso_hdr_page { + struct page *page; + u8 *pos; +}; + /** * struct iwl_trans_pcie - PCIe transport specific data * @rxq: all the RX queue data @@ -332,6 +337,8 @@ struct iwl_trans_pcie { struct net_device napi_dev; struct napi_struct napi; + struct __percpu iwl_tso_hdr_page *tso_hdr_page; + /* INT ICT Table */ __le32 *ict_tbl; dma_addr_t ict_tbl_dma; diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c b/drivers/net/wireless/iwlwifi/pcie/trans.c index a275318..93e488f 100644 --- a/drivers/net/wireless/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/iwlwifi/pcie/trans.c @@ -1601,6 +1601,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans *trans, void iwl_trans_pcie_free(struct iwl_trans *trans) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + int i; #ifdef CPTCFG_IWLWIFI_PLATFORM_DATA /* Make sure the device is on before calling pci functions again. @@ -1631,6 +1632,15 @@ void iwl_trans_pcie_free(struct iwl_trans *trans) iwl_pcie_free_fw_monitor(trans); + for_each_possible_cpu(i) { + struct iwl_tso_hdr_page *p = + per_cpu_ptr(trans_pcie->tso_hdr_page, i); + + if (p->page) + __free_pages(p->page, 0); + } + + free_
[RFC v3 2/2] iwlwifi: mvm: send large SKBs to the transport
Now that PCIe knows how to create A-MSDUs, use this capability and prepare SKBs that are large enough to build an A-MSDU. Advertise TSO support towards the network stack and segment the packet with gso_size set to be the maximal A-MSDU length (after having taken the headers to be added into account) to make sure that the skb that is passed down to the transport are not longer than the maximal A-MSDU allowed. Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- drivers/net/wireless/iwlwifi/mvm/tx.c | 143 -- 1 file changed, 135 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index be9d7e4..4f23149 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -65,6 +65,7 @@ #include #include #include +#include #include "iwl-trans.h" #include "iwl-eeprom-parse.h" @@ -181,7 +182,8 @@ void iwl_mvm_set_tx_cmd(struct iwl_mvm *mvm, struct sk_buff *skb, tx_cmd->tx_flags = cpu_to_le32(tx_flags); /* Total # bytes to be transmitted */ - tx_cmd->len = cpu_to_le16((u16)skb->len); + tx_cmd->len = cpu_to_le16((u16)skb->len + + (uintptr_t)info->driver_data[0]); tx_cmd->next_frame_len = 0; tx_cmd->life_time = cpu_to_le32(TX_CMD_LIFE_TIME_INFINITE); tx_cmd->sta_id = sta_id; @@ -356,7 +358,6 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, memset(>status, 0, sizeof(info->status)); - info->driver_data[0] = NULL; info->driver_data[1] = dev_cmd; return dev_cmd; @@ -379,6 +380,9 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) info->hw_queue != info->control.vif->cab_queue))) return -1; + /* This holds the amsdu headers length */ + info->driver_data[0] = (void *)(uintptr_t)0; + /* * IWL_MVM_OFFCHANNEL_QUEUE is used for ROC packets that can be used * in 2 different types of vifs, P2P & STATION. P2P uses the offchannel @@ -435,29 +439,148 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) return 0; } -static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, +static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb, struct ieee80211_sta *sta, struct sk_buff_head *mpdus_skb) { + struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); + struct ieee80211_hdr *hdr = (void *)skb->data; + unsigned int mss = skb_shinfo(skb)->gso_size; struct sk_buff *tmp, *next; - char cb[sizeof(skb_gso->cb)]; + char cb[sizeof(skb->cb)]; + unsigned int num_subframes, tcp_payload_len, subf_len; + u16 amsdu_add, snap_ip_tcp, pad, i = 0; + + snap_ip_tcp = 8 + skb_transport_header(skb) - skb_network_header(skb) + + tcp_hdrlen(skb); + + if (!sta->max_amsdu_len || + !ieee80211_is_data_qos(hdr->frame_control)) { + num_subframes = 1; + /* TODO: for the compiler... */ + pad = 0; + goto segment; + } + + /* +* Limit A-MSDU in A-MPDU to 4095 bytes when VHT is not +* supported. This is a spec requirement (IEEE 802.11-2015 +* section 8.7.3 NOTE 3). +*/ + + /* TODO: for now, disable A-MSDU inside AMPDU */ + if (info->flags & IEEE80211_TX_CTL_AMPDU) { + num_subframes = 1; + /* TODO: for the compiler... */ + pad = 0; + goto segment; + } + + /* Sub frame header + SNAP + IP header + TCP header + MSS */ + subf_len = sizeof(struct ethhdr) + snap_ip_tcp + mss; + pad = (4 - subf_len) & 0x3; + + /* +* If we have N subframes in the A-MSDU, then the A-MSDU's size is +* N * subf_len + (N - 1) * pad. +*/ + num_subframes = (sta->max_amsdu_len + pad) / (subf_len + pad); + if (num_subframes > 1) { + u8 *qc = ieee80211_get_qos_ctl((void *)skb->data); + + *qc |= IEEE80211_QOS_CTL_A_MSDU_PRESENT; + } + + tcp_payload_len = skb_tail_pointer(skb) - skb_transport_header(skb) - + tcp_hdrlen(skb) + skb->data_len; + + /* +* Make sure we have enough TBs for the A-MSDU: +* 2 for each subframe +* 1 more for each fragment +* 1 more for the potential data in the header +*/ + num_subframes = + min_t(unsigned int, num_subframes, + (mvm->trans->max_skb_frags - 1 - + skb_shinfo(skb)->nr_frags) / 2); + + /* This skb fits in one single A-MSDU */ + if (num_
[RFC v3 1/2] iwlwifi: pcie: allow to build an A-MSDU using TSO core
When the op_mode sends an skb whose payload is bigger than MSS, PCIe will create an A-MSDU out of it. PCIe assumes that the skb that is coming from the op_mode can fit in one A-MSDU. It is the op_mode's responsibility to make sure that this guarantee holds. Additional headers need to be built for the subframes. The TSO core code takes care of the IP / TCP headers and the driver takes care of the 802.11 subframe headers. These headers are stored on a per-cpu page that is re-used for all the packets handled on that same CPU. Each skb holds a reference to that page and releases the page when it is reclaimed. When the page gets full, it is released and a new one is allocated. Since any SKB that doesn't go through the fast-xmit path of mac80211 will be segmented, we can assume here that the packet is not WEP / TKIP and has a proper SNAP header. Signed-off-by: Emmanuel Grumbach <emmanuel.grumb...@intel.com> --- drivers/net/wireless/iwlwifi/iwl-devtrace-data.h | 16 ++ drivers/net/wireless/iwlwifi/iwl-trans.h | 6 +- drivers/net/wireless/iwlwifi/pcie/internal.h | 7 + drivers/net/wireless/iwlwifi/pcie/trans.c| 20 +- drivers/net/wireless/iwlwifi/pcie/tx.c | 286 ++- 5 files changed, 329 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h index 71a78ce..59d9edf 100644 --- a/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h +++ b/drivers/net/wireless/iwlwifi/iwl-devtrace-data.h @@ -51,6 +51,22 @@ TRACE_EVENT(iwlwifi_dev_tx_data, TP_printk("[%s] TX frame data", __get_str(dev)) ); +TRACE_EVENT(iwlwifi_dev_tx_tso_chunk, + TP_PROTO(const struct device *dev, +u8 *data_src, size_t data_len), + TP_ARGS(dev, data_src, data_len), + TP_STRUCT__entry( + DEV_ENTRY + + __dynamic_array(u8, data, data_len) + ), + TP_fast_assign( + DEV_ASSIGN; + memcpy(__get_dynamic_array(data), data_src, data_len); + ), + TP_printk("[%s] TX frame data", __get_str(dev)) +); + TRACE_EVENT(iwlwifi_dev_rx_data, TP_PROTO(const struct device *dev, const struct iwl_trans *trans, diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h b/drivers/net/wireless/iwlwifi/iwl-trans.h index 0ceff69..6919243 100644 --- a/drivers/net/wireless/iwlwifi/iwl-trans.h +++ b/drivers/net/wireless/iwlwifi/iwl-trans.h @@ -379,7 +379,11 @@ static inline void iwl_free_rxb(struct iwl_rx_cmd_buffer *r) } #define MAX_NO_RECLAIM_CMDS6 - +/* + * The first entry in driver_data array in ieee80211_tx_info + * that can be used by the transport. + */ +#define IWL_FIRST_DRIVER_DATA 2 #define IWL_MASK(lo, hi) ((1 << (hi)) | ((1 << (hi)) - (1 << (lo /* diff --git a/drivers/net/wireless/iwlwifi/pcie/internal.h b/drivers/net/wireless/iwlwifi/pcie/internal.h index be168d1..7da5643 100644 --- a/drivers/net/wireless/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/iwlwifi/pcie/internal.h @@ -295,6 +295,11 @@ iwl_pcie_get_scratchbuf_dma(struct iwl_txq *txq, int idx) sizeof(struct iwl_pcie_txq_scratch_buf) * idx; } +struct iwl_tso_hdr_page { + struct page *page; + u8 *pos; +}; + /** * struct iwl_trans_pcie - PCIe transport specific data * @rxq: all the RX queue data @@ -332,6 +337,8 @@ struct iwl_trans_pcie { struct net_device napi_dev; struct napi_struct napi; + struct __percpu iwl_tso_hdr_page *tso_hdr_page; + /* INT ICT Table */ __le32 *ict_tbl; dma_addr_t ict_tbl_dma; diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c b/drivers/net/wireless/iwlwifi/pcie/trans.c index a275318..5bd678b 100644 --- a/drivers/net/wireless/iwlwifi/pcie/trans.c +++ b/drivers/net/wireless/iwlwifi/pcie/trans.c @@ -1601,6 +1601,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans *trans, void iwl_trans_pcie_free(struct iwl_trans *trans) { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + int i; #ifdef CPTCFG_IWLWIFI_PLATFORM_DATA /* Make sure the device is on before calling pci functions again. @@ -1631,6 +1632,15 @@ void iwl_trans_pcie_free(struct iwl_trans *trans) iwl_pcie_free_fw_monitor(trans); + for_each_possible_cpu(i) { + struct iwl_tso_hdr_page *p = + per_cpu_ptr(trans_pcie->tso_hdr_page, i); + + if (p->page) + __free_pages(p->page, 0); + } + + free_percpu(trans_pcie->tso_hdr_page); iwl_trans_free(trans); } @@ -2822,7 +2832,7 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev, struct iwl_trans_pcie *trans_pcie; struct iwl_trans *trans; u16 pci_cmd; - int ret; + int i, ret; trans = iwl_trans_alloc(sizeof(struct iwl_trans_pcie),
[RFC v3 0/2] Add TSO for iwlwifi
Here is a v3 of the series that has been sent uite a while ago. V3 is completly different from v2 and (tries to) address Eric's comments. I have to admit that the code looks much easier now :) The purpose is still to be able to create an A-MSDU from a large send. iwlwifi is split into 2 different layers: the op_mode (iwlmvm.ko) and the transport (iwlwifi.ko). The transport uses the the TSO core to split the payload and builds an A-MSDU out of it. It needs additional memory to store the new headers (n - 1) SNAP / IP / TCP headers that weren't in the original skb and the 802.11 subframe headers with padding. I allocate a per-cpu page to hold these headers. When the page is full, I get a new one. This is the most efficient way I could think about because it avoids to allocate a (rather big: ~140 byte) data structure for each Tx descriptor (and we have a lot...). This is also efficient in terms of data locality. Allocating / freeing a page is supposed to be very fast. Since the transport doesn't handle sequence numbers and other per 802.11 packet related stuff, it needs to get only one 802.11 at a time. The problem here is that based on link condition and peer caps, the size of the maximal 802.11 packet may vary: a same netdev can send data to different peers (think about 2 stations associated to an Access Point) and then, skbs coming from the same netdev need to be split into chunks of different sizes so that the transport will be able to create A-MSDUs. This will happen if one station supports 11K A-MSDU, while the other one supports 4K A-MSDU only. This requirement disallows the use of gso_max_{segs,size} since they can't differentiate between different peers. Of course, I could make the transport aware of all the sequence numbers and friends, but that seemed too intrusive into the current architecture. Instead, I prefered to use skb_gso_segment and let it do the work. When I get an skb to be sent, I know the peer and its caps. I can then determine what is the maximal length of A-MSDU for this peer and skb_gso_segment this skb with gso_size set to that maximal A-MSDU length. Note that the maximal A-MSDU length can be up to 11K nowadays, so we are far from the limit of the 2 bytes of gso_size. After I did that, I only need to fixup slightly the id in the IP header. There are still a few minor TODOs here and there, but I think (and hope) we are getting close :) Emmanuel Grumbach (2): iwlwifi: pcie: allow to build an A-MSDU using TSO core iwlwifi: mvm: send large SKBs to the transport drivers/net/wireless/iwlwifi/iwl-devtrace-data.h | 16 ++ drivers/net/wireless/iwlwifi/iwl-trans.h | 6 +- drivers/net/wireless/iwlwifi/mvm/tx.c| 143 +++- drivers/net/wireless/iwlwifi/pcie/internal.h | 7 + drivers/net/wireless/iwlwifi/pcie/trans.c| 20 +- drivers/net/wireless/iwlwifi/pcie/tx.c | 286 ++- 6 files changed, 464 insertions(+), 14 deletions(-) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v3 1/3] iwlwifi: mvm: add real TSO implementation
The segmentation is done completely in software. The driver creates several MPDUs out of a single large send. Each MPDU is a newly allocated SKB. A page is allocated to create the headers that need to be duplicated (SNAP / IP / TCP). The WiFi header is in the header of the newly created SKBs. type=feature Change-Id: I238ffa79cacc5bbdacdfbf3e9673c8d4f02b462a Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com --- drivers/net/wireless/iwlwifi/mvm/tx.c | 513 +++--- 1 file changed, 481 insertions(+), 32 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index 90f0ea1..a63686c 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -65,6 +65,7 @@ #include linux/ieee80211.h #include linux/etherdevice.h #include net/tcp.h +#include net/ip.h #include iwl-trans.h #include iwl-eeprom-parse.h @@ -435,32 +436,471 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) return 0; } +/* + * Update the IP / TCP headers and recompute the IP header CSUM + + * pseudo header CSUM. + */ +static void iwl_update_ip_tcph(void *iph, struct tcphdr *tcph, bool ipv6, + unsigned int len, unsigned int tcp_seq_offset, + u16 num_segment) +{ + be32_add_cpu(tcph-seq, tcp_seq_offset); + + if (ipv6) { + struct ipv6hdr *iphv6 = iph; + + iphv6-payload_len = cpu_to_be16(len + tcph-doff * 4); + + /* Compute CSUM on the the pseudo-header */ + tcph-check = ~csum_ipv6_magic(iphv6-saddr, iphv6-daddr, + len + tcph-doff * 4, + IPPROTO_TCP, 0); + } else { + struct iphdr *iphv4 = iph; + + iphv4-tot_len = + cpu_to_be16(len + tcph-doff * 4 + iphv4-ihl * 4); + be16_add_cpu(iphv4-id, num_segment); + ip_send_check(iphv4); + + /* Compute CSUM on the the pseudo-header */ + tcph-check = ~csum_tcpudp_magic(iphv4-saddr, iphv4-daddr, +len + tcph-doff * 4, +IPPROTO_TCP, 0); + } +} + +/** + * struct iwl_lso_splitter - state of the split. + * @linear_payload_len: The length of the payload inside the header of the + * original GSO skb. + * @gso_frag_num: The fragment number from which to take the data in the + * original GSO skb. + * @gso_payload_len: The length of the payload in the original GSO skb. + * @gso_payload_pos: The incrementing position in the payload of the original + * GSO skb. + * @gso_offset_in_page: The offset in the page of gso_frag_num. + * @gso_current_frag_size: The size of gso_frag_num. + * @gso_offset_in_frag: The offset in the gso_frag_num. + * @frag_in_mpdu: The index of the frag inside the new (split) MPDU. + * @mss: The maximal segment size. + * @si: Points to the the shared info of the original GSO skb. + * @ieee80211_hdr *hdr: Points to the WiFi header. + * @gso_nr_frags: The number of frags in the original GSO skb. + * @wifi_hdr_iv_len: The length of the WiFi header including IV. + * @tcp_fin: True if TCP_FIN is set in the original GSO skb. + * @tcp_push: True if TCP_PSH is set in the original GSO skb. + */ +struct iwl_lso_splitter { + unsigned int linear_payload_len; + unsigned int gso_frag_num; + unsigned int gso_payload_len; + unsigned int gso_payload_pos; + unsigned int gso_offset_in_page; + unsigned int gso_current_frag_size; + unsigned int gso_offset_in_frag; + unsigned int frag_in_mpdu; + unsigned int mss; + struct skb_shared_info *si; + struct ieee80211_hdr *hdr; + u8 gso_nr_frags; + u8 wifi_hdr_iv_len; + bool tcp_fin; + bool tcp_push; +}; + +/* + * Adds a TCP segment from skb_gso to skb. All the state is taken from + * and fed back to p. This function takes care about the payload only. + * This MSDU might already have msdu_sz bytes of payload that come from + * the original GSO skb's header. + */ +static unsigned int +iwl_add_tcp_segment(struct iwl_mvm *mvm, struct sk_buff *skb_gso, + struct sk_buff *skb, struct iwl_lso_splitter *p, + unsigned int msdu_sz) +{ + while (msdu_sz p-mss) { + unsigned int frag_sz = + min_t(unsigned int, p-gso_current_frag_size, + p-mss - msdu_sz); + + if (p-frag_in_mpdu = mvm-trans-max_skb_frags) + return msdu_sz; + + skb_add_rx_frag(skb, p-frag_in_mpdu, + skb_frag_page(p-si-frags[p-gso_frag_num]), + p-gso_offset_in_page, frag_sz, 0); + + /* We just added one frag to the mpdu
[RFC v3 2/3] iwlwifi: mvm: allow to create A-MSDUs from a large send
Now that we can get a big chunk of data from the network stack, we can create an A-MSDU out of it. The purpose is to get a throughput improvement since sending one single A-MSDU is more efficient than sending several MSDUs at least under ideal link conditions. type=feature Change-Id: I5ea1b1132a57542187cd4c34c5299dbf44fe8b01 Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com --- drivers/net/wireless/iwlwifi/mvm/mac80211.c | 3 +- drivers/net/wireless/iwlwifi/mvm/sta.c | 4 +- drivers/net/wireless/iwlwifi/mvm/sta.h | 6 +- drivers/net/wireless/iwlwifi/mvm/tx.c | 159 ++-- 4 files changed, 160 insertions(+), 12 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c index 3dd4e97..dd15e04 100644 --- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c @@ -925,7 +925,8 @@ static int iwl_mvm_mac_ampdu_action(struct ieee80211_hw *hw, ret = iwl_mvm_sta_tx_agg_flush(mvm, vif, sta, tid); break; case IEEE80211_AMPDU_TX_OPERATIONAL: - ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid, buf_size); + ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid, + buf_size, amsdu); break; default: WARN_ON_ONCE(1); diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.c b/drivers/net/wireless/iwlwifi/mvm/sta.c index df216cd..606fc09 100644 --- a/drivers/net/wireless/iwlwifi/mvm/sta.c +++ b/drivers/net/wireless/iwlwifi/mvm/sta.c @@ -976,7 +976,8 @@ int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif, } int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct ieee80211_sta *sta, u16 tid, u8 buf_size) + struct ieee80211_sta *sta, u16 tid, u8 buf_size, + bool amsdu) { struct iwl_mvm_sta *mvmsta = iwl_mvm_sta_from_mac80211(sta); struct iwl_mvm_tid_data *tid_data = mvmsta-tid_data[tid]; @@ -995,6 +996,7 @@ int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif, queue = tid_data-txq_id; tid_data-state = IWL_AGG_ON; mvmsta-agg_tids |= BIT(tid); + tid_data-amsdu_in_ampdu_allowed = amsdu; tid_data-ssn = 0x; spin_unlock_bh(mvmsta-lock); diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.h b/drivers/net/wireless/iwlwifi/mvm/sta.h index eedb215..26d1e31 100644 --- a/drivers/net/wireless/iwlwifi/mvm/sta.h +++ b/drivers/net/wireless/iwlwifi/mvm/sta.h @@ -258,6 +258,8 @@ enum iwl_mvm_agg_state { * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA). * @reduced_tpc: Reduced tx power. Holds the data between the * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA). + * @amsdu_in_ampdu_allowed: true if A-MSDU in A-MPDU is allowed. Relevant only + * if state is %IWL_AGG_ON. * @state: state of the BA agreement establishment / tear down. * @txq_id: Tx queue used by the BA session * @ssn: the first packet to be sent in AGG HW queue in Tx AGG start flow, or @@ -272,6 +274,7 @@ struct iwl_mvm_tid_data { /* The rest is Tx AGG related */ u32 rate_n_flags; u8 reduced_tpc; + bool amsdu_in_ampdu_allowed; enum iwl_mvm_agg_state state; u16 txq_id; u16 ssn; @@ -387,7 +390,8 @@ int iwl_mvm_sta_rx_agg(struct iwl_mvm *mvm, struct ieee80211_sta *sta, int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif, struct ieee80211_sta *sta, u16 tid, u16 *ssn); int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct ieee80211_sta *sta, u16 tid, u8 buf_size); + struct ieee80211_sta *sta, u16 tid, u8 buf_size, + bool amsdu); int iwl_mvm_sta_tx_agg_stop(struct iwl_mvm *mvm, struct ieee80211_vif *vif, struct ieee80211_sta *sta, u16 tid); int iwl_mvm_sta_tx_agg_flush(struct iwl_mvm *mvm, struct ieee80211_vif *vif, diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index a63686c..5046833 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -488,8 +488,10 @@ static void iwl_update_ip_tcph(void *iph, struct tcphdr *tcph, bool ipv6, * @ieee80211_hdr *hdr: Points to the WiFi header. * @gso_nr_frags: The number of frags in the original GSO skb. * @wifi_hdr_iv_len: The length of the WiFi header including IV. + * @amsdu_pad: Number of bytes for the A-MSDU subframe * @tcp_fin: True if TCP_FIN is set in the original GSO skb. * @tcp_push: True if TCP_PSH is set in the original GSO skb. + * @amsdu: True if we are building an A-MSDU */ struct iwl_lso_splitter { unsigned int
[RFC v3 0/3] add TSO / A-MSDU TX for iwlwifi
An A-MSDU is a 802.11 frame that comprises several L3 segments: Each subframe has a ETH / SNAP / IP / TCP header (in case TCP is used of course). Each subframe has no 802.11 limitation on its length (only the MSS limitation), but the whole A-MSDU has a size limit both in number of subframes, and in bytes. In the best conditions, the A-MSDU is unlimited in number of subframes, and can be up to 11K byte long. An A-MSDU has only one WiFi MAC address, and hence, each A-MSDU can be sent to only one WiFi peer. There a 2 ways to get A-MSDU: * xmit_more * LSO * LSO + skb_gso_segment The problem with xmit_more is that in AP mode, the Qdisc can have packets for several L2 clients which makes is harder to use since as stated above, an A-MSDU can be sent to one single WiFi peer. The problem with LSO is that the LSO packet is very likely to be way bigger than the maximal A-MSDU length. This means that a single LSO packet will generate several WiFi packets. From an arch point of view, having different WiFi packets sitting in one single skb is a problem, so we need to create several skbs out of one single LSO packet. This makes the usage of currently existing helpers (net/core/tso.c) very hard. Using LSO and skb_gso_segment is a bit wasteful since we don't need an skb for each MSS, so that we would have to reassemble the segs into one single packet. This is sub-efficient. Out of the two aforementioned ways, I chose to work with LSO and implemented the segmentation in the driver itself. While this code can technically be made driver agnostic, the trend in the industry seems to show that more and more vendors split the buffers in the firmware, so that I don't expect to see any new devices coming up with the same behavior as ours. I did take a look on currently existing drivers and they already do the work in firmware. If someone in linux-wireless thinks that this code can serve his purposes, please speak up. All you'd need is to support TX_CSUM and SG. The additional headers needed (SNAP / IP / TCP) are copied and updated on a page which is allocated in xmit. This is supposed to very cheap. I add skb frags from that page to add the skb. Since I can have several 802.11 packets from a single LSO packet, I allocated skb's on the way if needed. I am quite a newbie in skb handling, so I guess that this code can be improved. I have tested it decently using iperf, but this doesn't mean that there are no issues using other applications. We are enabling pktgen on TCP (using patches that were sent a year ago or so) to test the different layouts of the skb (payload partition amongst the header and the different frags). Changes since v2: * + I fixed the whitespace problem spotted by Sergei. + I changed the D'tor check for truesize accounting. Since the D'tor seems to be set to tcp_wfree if it exists, I just the D'tor not being NULL. Emmanuel Grumbach (3): iwlwifi: mvm: add real TSO implementation iwlwifi: mvm: allow to create A-MSDUs from a large send iwlwifi: mvm: transfer the truesize to the last TSO segment drivers/net/wireless/iwlwifi/mvm/mac80211.c | 3 +- drivers/net/wireless/iwlwifi/mvm/sta.c | 4 +- drivers/net/wireless/iwlwifi/mvm/sta.h | 6 +- drivers/net/wireless/iwlwifi/mvm/tx.c | 669 ++-- 4 files changed, 647 insertions(+), 35 deletions(-) -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v3 3/3] iwlwifi: mvm: transfer the truesize to the last TSO segment
This allows to release the backpressure on the socket only when the last segment is released. Now the truesize looks like this: if the truesize of the original skb is 65420, all the segments will have a truesize of 704 (skb itself) and the last one will have 65420. Change-Id: I3c894cf2afc0aedfe7b2a5b992ba41653ff79c0e Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com --- drivers/net/wireless/iwlwifi/mvm/tx.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index 5046833..2aeb5fd 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -764,7 +764,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, bool ipv6 = skb_shinfo(skb_gso)-gso_type SKB_GSO_TCPV6; struct iwl_lso_splitter s = {}; struct page *hdr_page; - unsigned int mpdu_sz; + unsigned int mpdu_sz, sum_truesize = 0; u8 *hdr_page_pos, *qc, tid; int i, ret; @@ -898,6 +898,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, mpdu_sz, tcp_hdrlen(skb_gso)); __skb_queue_tail(mpdus_skb, skb_gso); + sum_truesize += skb_gso-truesize; /* mss bytes have been consumed from the data */ s.gso_payload_pos = s.mss; @@ -1034,6 +1035,20 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, } __skb_queue_tail(mpdus_skb, skb); + sum_truesize += skb-truesize; + } + + /* Release the backpressure on the socket only when +* the last segment is released. +*/ + if (skb_gso-destructor) { + struct sk_buff *tail = mpdus_skb-prev; + + swap(tail-truesize, skb_gso-truesize); + swap(tail-destructor, skb_gso-destructor); + swap(tail-sk, skb_gso-sk); + atomic_add(sum_truesize - skb_gso-truesize, + skb_gso-sk-sk_wmem_alloc); } ret = 0; -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v2 1/3] iwlwifi: mvm: add real TSO implementation
The segmentation is done completely in software. The driver creates several MPDUs out of a single large send. Each MPDU is a newly allocated SKB. A page is allocated to create the headers that need to be duplicated (SNAP / IP / TCP). The WiFi header is in the header of the newly created SKBs. type=feature Change-Id: I238ffa79cacc5bbdacdfbf3e9673c8d4f02b462a Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com --- drivers/net/wireless/iwlwifi/mvm/tx.c | 513 +++--- 1 file changed, 481 insertions(+), 32 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index 90f0ea1..a63686c 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -65,6 +65,7 @@ #include linux/ieee80211.h #include linux/etherdevice.h #include net/tcp.h +#include net/ip.h #include iwl-trans.h #include iwl-eeprom-parse.h @@ -435,32 +436,471 @@ int iwl_mvm_tx_skb_non_sta(struct iwl_mvm *mvm, struct sk_buff *skb) return 0; } +/* + * Update the IP / TCP headers and recompute the IP header CSUM + + * pseudo header CSUM. + */ +static void iwl_update_ip_tcph(void *iph, struct tcphdr *tcph, bool ipv6, + unsigned int len, unsigned int tcp_seq_offset, + u16 num_segment) +{ + be32_add_cpu(tcph-seq, tcp_seq_offset); + + if (ipv6) { + struct ipv6hdr *iphv6 = iph; + + iphv6-payload_len = cpu_to_be16(len + tcph-doff * 4); + + /* Compute CSUM on the the pseudo-header */ + tcph-check = ~csum_ipv6_magic(iphv6-saddr, iphv6-daddr, + len + tcph-doff * 4, + IPPROTO_TCP, 0); + } else { + struct iphdr *iphv4 = iph; + + iphv4-tot_len = + cpu_to_be16(len + tcph-doff * 4 + iphv4-ihl * 4); + be16_add_cpu(iphv4-id, num_segment); + ip_send_check(iphv4); + + /* Compute CSUM on the the pseudo-header */ + tcph-check = ~csum_tcpudp_magic(iphv4-saddr, iphv4-daddr, +len + tcph-doff * 4, +IPPROTO_TCP, 0); + } +} + +/** + * struct iwl_lso_splitter - state of the split. + * @linear_payload_len: The length of the payload inside the header of the + * original GSO skb. + * @gso_frag_num: The fragment number from which to take the data in the + * original GSO skb. + * @gso_payload_len: The length of the payload in the original GSO skb. + * @gso_payload_pos: The incrementing position in the payload of the original + * GSO skb. + * @gso_offset_in_page: The offset in the page of gso_frag_num. + * @gso_current_frag_size: The size of gso_frag_num. + * @gso_offset_in_frag: The offset in the gso_frag_num. + * @frag_in_mpdu: The index of the frag inside the new (split) MPDU. + * @mss: The maximal segment size. + * @si: Points to the the shared info of the original GSO skb. + * @ieee80211_hdr *hdr: Points to the WiFi header. + * @gso_nr_frags: The number of frags in the original GSO skb. + * @wifi_hdr_iv_len: The length of the WiFi header including IV. + * @tcp_fin: True if TCP_FIN is set in the original GSO skb. + * @tcp_push: True if TCP_PSH is set in the original GSO skb. + */ +struct iwl_lso_splitter { + unsigned int linear_payload_len; + unsigned int gso_frag_num; + unsigned int gso_payload_len; + unsigned int gso_payload_pos; + unsigned int gso_offset_in_page; + unsigned int gso_current_frag_size; + unsigned int gso_offset_in_frag; + unsigned int frag_in_mpdu; + unsigned int mss; + struct skb_shared_info *si; + struct ieee80211_hdr *hdr; + u8 gso_nr_frags; + u8 wifi_hdr_iv_len; + bool tcp_fin; + bool tcp_push; +}; + +/* + * Adds a TCP segment from skb_gso to skb. All the state is taken from + * and fed back to p. This function takes care about the payload only. + * This MSDU might already have msdu_sz bytes of payload that come from + * the original GSO skb's header. + */ +static unsigned int +iwl_add_tcp_segment(struct iwl_mvm *mvm, struct sk_buff *skb_gso, + struct sk_buff *skb, struct iwl_lso_splitter *p, + unsigned int msdu_sz) +{ + while (msdu_sz p-mss) { + unsigned int frag_sz = + min_t(unsigned int, p-gso_current_frag_size, + p-mss - msdu_sz); + + if (p-frag_in_mpdu = mvm-trans-max_skb_frags) + return msdu_sz; + + skb_add_rx_frag(skb, p-frag_in_mpdu, + skb_frag_page(p-si-frags[p-gso_frag_num]), + p-gso_offset_in_page, frag_sz, 0); + + /* We just added one frag to the mpdu
[RFC v2 0/3] add TSO / A-MSDU TX for iwlwifi
We enable TSO to get a lot of data at once to build A-MSDUs. Our hardware doesn't have (yet) TCP CSUM offload, so we do it manually. TSO won't be enabled on hardware that don't support CSUM offload in release code, computing TCP CSUM in the driver is just a way to start coding the flows. This is why the CSUM offload implementation in the driver in so bad in terms of efficiency. I preferred to have the flows as close as they will be when the hardware will be able to the CSUM than to try to seek efficiency. The hardware that will have CSUM offload will still require the driver to split the skb in software including the IP / TCP header copy and update etc... We could have enabled A-MSDU based on xmit-more, but the rationale of using LSO is that when using pfifo-fast, the Qdisc gets one packet and dequeues is straight away which limits the possibility to get a lot of packets at once. (Am I right here?). A note about A-MSDUs for non-wireless people: * An A-MSDU is a aggregated frame. It is one big 802.11 packet that contains several subframes. Each subframe is a TCP segment. One A-MSDU is represented by one single skb which means that we need to copy / duplicate the TCP / IP / SNAP headers in one single skb. This is why those headers are copied to a separate page: that page is added multiple times to the skb with different offsets. Each subframes needs at least 2 frags: 1 for the headers, 1 (or more) for the payload. I am quite a newbie in skb handling, so I guess that this code can be improved. I have tested it decently using iperf, but this doesn't mean that there are no issues using other applications. We are enabling pktgen on TCP (using patches that were sent a year ago or so) to test the different layouts of the skb (payload partition amongst the header and the different frags). I'll be very happy to get comments on that code, this is why I am sending it to netdev as well since the TSO experts are there :) Emmanuel Grumbach (3): iwlwifi: mvm: add real TSO implementation iwlwifi: mvm: allow to create A-MSDUs from a large send iwlwifi: mvm: transfer the truesize to the last TSO segment drivers/net/wireless/iwlwifi/mvm/mac80211.c | 3 +- drivers/net/wireless/iwlwifi/mvm/sta.c | 4 +- drivers/net/wireless/iwlwifi/mvm/sta.h | 6 +- drivers/net/wireless/iwlwifi/mvm/tx.c | 669 ++-- 4 files changed, 647 insertions(+), 35 deletions(-) -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v2 3/3] iwlwifi: mvm: transfer the truesize to the last TSO segment
This allows to release the backpressure on the socket only when the last segment is released. Now the truesize looks like this: if the truesize of the original skb is 65420, all the segments will have a truesize of 704 (skb itself) and the last one will have 65420. Change-Id: I3c894cf2afc0aedfe7b2a5b992ba41653ff79c0e Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com --- drivers/net/wireless/iwlwifi/mvm/tx.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index 5046833..046e50d 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -764,7 +764,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, bool ipv6 = skb_shinfo(skb_gso)-gso_type SKB_GSO_TCPV6; struct iwl_lso_splitter s = {}; struct page *hdr_page; - unsigned int mpdu_sz; + unsigned int mpdu_sz, sum_truesize = 0; u8 *hdr_page_pos, *qc, tid; int i, ret; @@ -898,6 +898,7 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, mpdu_sz, tcp_hdrlen(skb_gso)); __skb_queue_tail(mpdus_skb, skb_gso); + sum_truesize += skb_gso-truesize; /* mss bytes have been consumed from the data */ s.gso_payload_pos = s.mss; @@ -1034,6 +1035,20 @@ static int iwl_mvm_tx_tso(struct iwl_mvm *mvm, struct sk_buff *skb_gso, } __skb_queue_tail(mpdus_skb, skb); + sum_truesize += skb-truesize; + } + + /* Release the backpressure on the socket only when +* the last segment is released. +*/ + if (skb_gso-destructor == sock_wfree) { + struct sk_buff *tail = mpdus_skb-prev; + + swap(tail-truesize, skb_gso-truesize); + swap(tail-destructor, skb_gso-destructor); + swap(tail-sk, skb_gso-sk); +atomic_add(sum_truesize - skb_gso-truesize, + skb_gso-sk-sk_wmem_alloc); } ret = 0; -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v2 2/3] iwlwifi: mvm: allow to create A-MSDUs from a large send
Now that we can get a big chunk of data from the network stack, we can create an A-MSDU out of it. The purpose is to get a throughput improvement since sending one single A-MSDU is more efficient than sending several MSDUs at least under ideal link conditions. type=feature Change-Id: I5ea1b1132a57542187cd4c34c5299dbf44fe8b01 Signed-off-by: Emmanuel Grumbach emmanuel.grumb...@intel.com --- drivers/net/wireless/iwlwifi/mvm/mac80211.c | 3 +- drivers/net/wireless/iwlwifi/mvm/sta.c | 4 +- drivers/net/wireless/iwlwifi/mvm/sta.h | 6 +- drivers/net/wireless/iwlwifi/mvm/tx.c | 159 ++-- 4 files changed, 160 insertions(+), 12 deletions(-) diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c index 3dd4e97..dd15e04 100644 --- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c +++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c @@ -925,7 +925,8 @@ static int iwl_mvm_mac_ampdu_action(struct ieee80211_hw *hw, ret = iwl_mvm_sta_tx_agg_flush(mvm, vif, sta, tid); break; case IEEE80211_AMPDU_TX_OPERATIONAL: - ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid, buf_size); + ret = iwl_mvm_sta_tx_agg_oper(mvm, vif, sta, tid, + buf_size, amsdu); break; default: WARN_ON_ONCE(1); diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.c b/drivers/net/wireless/iwlwifi/mvm/sta.c index df216cd..606fc09 100644 --- a/drivers/net/wireless/iwlwifi/mvm/sta.c +++ b/drivers/net/wireless/iwlwifi/mvm/sta.c @@ -976,7 +976,8 @@ int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif, } int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct ieee80211_sta *sta, u16 tid, u8 buf_size) + struct ieee80211_sta *sta, u16 tid, u8 buf_size, + bool amsdu) { struct iwl_mvm_sta *mvmsta = iwl_mvm_sta_from_mac80211(sta); struct iwl_mvm_tid_data *tid_data = mvmsta-tid_data[tid]; @@ -995,6 +996,7 @@ int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif, queue = tid_data-txq_id; tid_data-state = IWL_AGG_ON; mvmsta-agg_tids |= BIT(tid); + tid_data-amsdu_in_ampdu_allowed = amsdu; tid_data-ssn = 0x; spin_unlock_bh(mvmsta-lock); diff --git a/drivers/net/wireless/iwlwifi/mvm/sta.h b/drivers/net/wireless/iwlwifi/mvm/sta.h index eedb215..26d1e31 100644 --- a/drivers/net/wireless/iwlwifi/mvm/sta.h +++ b/drivers/net/wireless/iwlwifi/mvm/sta.h @@ -258,6 +258,8 @@ enum iwl_mvm_agg_state { * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA). * @reduced_tpc: Reduced tx power. Holds the data between the * Tx response (TX_CMD), and the block ack notification (COMPRESSED_BA). + * @amsdu_in_ampdu_allowed: true if A-MSDU in A-MPDU is allowed. Relevant only + * if state is %IWL_AGG_ON. * @state: state of the BA agreement establishment / tear down. * @txq_id: Tx queue used by the BA session * @ssn: the first packet to be sent in AGG HW queue in Tx AGG start flow, or @@ -272,6 +274,7 @@ struct iwl_mvm_tid_data { /* The rest is Tx AGG related */ u32 rate_n_flags; u8 reduced_tpc; + bool amsdu_in_ampdu_allowed; enum iwl_mvm_agg_state state; u16 txq_id; u16 ssn; @@ -387,7 +390,8 @@ int iwl_mvm_sta_rx_agg(struct iwl_mvm *mvm, struct ieee80211_sta *sta, int iwl_mvm_sta_tx_agg_start(struct iwl_mvm *mvm, struct ieee80211_vif *vif, struct ieee80211_sta *sta, u16 tid, u16 *ssn); int iwl_mvm_sta_tx_agg_oper(struct iwl_mvm *mvm, struct ieee80211_vif *vif, - struct ieee80211_sta *sta, u16 tid, u8 buf_size); + struct ieee80211_sta *sta, u16 tid, u8 buf_size, + bool amsdu); int iwl_mvm_sta_tx_agg_stop(struct iwl_mvm *mvm, struct ieee80211_vif *vif, struct ieee80211_sta *sta, u16 tid); int iwl_mvm_sta_tx_agg_flush(struct iwl_mvm *mvm, struct ieee80211_vif *vif, diff --git a/drivers/net/wireless/iwlwifi/mvm/tx.c b/drivers/net/wireless/iwlwifi/mvm/tx.c index a63686c..5046833 100644 --- a/drivers/net/wireless/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/iwlwifi/mvm/tx.c @@ -488,8 +488,10 @@ static void iwl_update_ip_tcph(void *iph, struct tcphdr *tcph, bool ipv6, * @ieee80211_hdr *hdr: Points to the WiFi header. * @gso_nr_frags: The number of frags in the original GSO skb. * @wifi_hdr_iv_len: The length of the WiFi header including IV. + * @amsdu_pad: Number of bytes for the A-MSDU subframe * @tcp_fin: True if TCP_FIN is set in the original GSO skb. * @tcp_push: True if TCP_PSH is set in the original GSO skb. + * @amsdu: True if we are building an A-MSDU */ struct iwl_lso_splitter { unsigned int
Re: [PATCH 1/2] iwlwifi: convert hex_dump_to_buffer() to %*ph
On Tue, Aug 4, 2015 at 1:47 PM, Andy Shevchenko andriy.shevche...@linux.intel.com wrote: On Thu, 2015-07-16 at 15:42 +0300, Andy Shevchenko wrote: There is no need to use hex_dump_to_buffer() in the cases like this: hexdump_to_buffer(buf, len, 16, 1, outbuf, outlen, false); /* len = 16 */ sprintf(%s\n, outbuf); since it maybe easily converted to simple: sprintf(%*ph\n, len, buf); Note: it seems in one case the output is groupped by 2 bytes and looks like a typo. Thus, patch changes that to plain byte stream. Any comments on this, anyone? The idea is to minimize usage of hexdump_to_buffer(), i.e. not using it when it's an obvious overkill. Sorry for the delay, since the buffer is small (less than 16 bytes, it should be fine). -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] iwlwifi: convert hex_dump_to_buffer() to %*ph
On Tue, Aug 4, 2015 at 2:48 PM, Emmanuel Grumbach egrumb...@gmail.com wrote: On Tue, Aug 4, 2015 at 1:47 PM, Andy Shevchenko andriy.shevche...@linux.intel.com wrote: On Thu, 2015-07-16 at 15:42 +0300, Andy Shevchenko wrote: There is no need to use hex_dump_to_buffer() in the cases like this: hexdump_to_buffer(buf, len, 16, 1, outbuf, outlen, false); /* len = 16 */ sprintf(%s\n, outbuf); since it maybe easily converted to simple: sprintf(%*ph\n, len, buf); Note: it seems in one case the output is groupped by 2 bytes and looks like a typo. Thus, patch changes that to plain byte stream. Any comments on this, anyone? The idea is to minimize usage of hexdump_to_buffer(), i.e. not using it when it's an obvious overkill. Sorry for the delay, since the buffer is small (less than 16 bytes, it should be fine). I applied it on our internal tree. Thank you. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] net/wireless: enable wiphy device to suspend/resume asynchronously
On Thu, Jul 30, 2015 at 8:18 AM, Fu, Zhonghui zhonghui...@linux.intel.com wrote: Enable wiphy device to suspend/resume asynchronously. This can improve system suspend/resume speed. How will that impact the timing with respect to the suspend call coming from the bus? I think that a few drivers rely on the suspend call of the wiphy device happening before the suspend call to the bus device. Not sure though. Signed-off-by: Zhonghui Fu zhonghui...@linux.intel.com --- net/wireless/core.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/net/wireless/core.c b/net/wireless/core.c index 2a0bbd2..bc5e68f 100644 --- a/net/wireless/core.c +++ b/net/wireless/core.c @@ -416,6 +416,7 @@ use_default_name: device_initialize(rdev-wiphy.dev); rdev-wiphy.dev.class = ieee80211_class; rdev-wiphy.dev.platform_data = rdev; + device_enable_async_suspend(rdev-wiphy.dev); INIT_LIST_HEAD(rdev-destroy_list); spin_lock_init(rdev-destroy_list_lock); -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-wireless in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] iwlwifi: Deinline iwl_{read,write}{8,32}
On Tue, Jul 14, 2015 at 3:41 PM, Denys Vlasenko vda.li...@googlemail.com wrote: On Tue, Jul 14, 2015 at 2:38 PM, Sergei Shtylyov sergei.shtyl...@cogentembedded.com wrote: +#define IWL_READ_WRITE(static_inline) \ +static_inline void iwl_write8(struct iwl_trans *trans, u32 ofs, u8 val) \ +{ \ + trace_iwlwifi_dev_iowrite8(trans-dev, ofs, val); \ + iwl_trans_write8(trans, ofs, val); \ +} \ [...] +#if !defined(CONFIG_IWLWIFI_DEVICE_TRACING) +IWL_READ_WRITE(static inline) Not static_inline? Yes. Here we are putting two words, static inline, in front of every function definition. -- I'll try to come up with a patch that is easier for me to read, but I am really busy right now. Ping me in a week if you have heard from me earlier. -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] mac80211: Use kstrdup to simplify code
The subject is wrong. You are not patch mac80211, but bcrmsmac. On Wed, Jul 8, 2015 at 10:32 PM, Christophe JAILLET christophe.jail...@wanadoo.fr wrote: Replace a kmalloc+strcpy by an equivalent kstrdup in order to improve readability. Signed-off-by: Christophe JAILLET christophe.jail...@wanadoo.fr --- drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c b/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c index 4813506..8a6c077 100644 --- a/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c +++ b/drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c @@ -1476,9 +1476,7 @@ struct brcms_timer *brcms_init_timer(struct brcms_info *wl, wl-timers = t; #ifdef DEBUG - t-name = kmalloc(strlen(name) + 1, GFP_ATOMIC); - if (t-name) - strcpy(t-name, name); + t-name = kstrdup(name, GFP_ATOMIC); #endif return t; -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-wireless in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html