Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-26 Thread Peter Stuge
Jouni Malinen wrote:
 punishing the vast majority of cases where the AP is perfectly fine
 with higher MCS rates

First make it work (everywhere), then make it fast (where possible).


//Peter
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-26 Thread Adrian Chadd
On 26 February 2015 at 02:20, Jouni Malinen j...@w1.fi wrote:
 On Wed, Feb 25, 2015 at 10:14:45AM -0800, Linus Torvalds wrote:
 While I realize that people may disagree about the exact details of
 how to fix this in the long run, may I suggest that in the meantime we
 at least get the two workaround patches applied?

 Does anybody hate Jouni's two patches *so* much that they can
 articulate *why* it would be wrong to apply them as interim patches?
 And if so, do you have better patches for me to try? Because if not..

 Of all people, I do actually have some hatred on the one-liner to force
 minimum rate for all EAPOL TX attempts. That is punishing the vast
 majority of cases where the AP is perfectly fine with higher MCS rates
 being used (and MCS 0 being sufficient fallback option) for EAPOL. Being
 able to use higher TX rates as the initial attempt is a nice feature and
 even though this may be limited to number of upstream Linux drivers
 today, that part of the feature is an improvement, IMHO. This can even
 be more robust in some environments especially when going through long
 EAP exchange with certain types of interference.

I remember running the math - well, the airtime math, and realised
it was almost always cheaper to do the single, non-aggregated EAPOL
frame exchange at the lowest rate than to try higher rates and fall
back to the lowest rates.




-adrian
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-26 Thread Arend van Spriel

On 02/26/15 06:55, Linus Torvalds wrote:

Because it looks like the brcmsmac driver has*exactly*  the same
behavior with this AP (in an Apple Macbook air). I assume brcmsmac
uses the net/80211/tx.c logic too.


That makes sense as brcmsmac is a mac80211 driver that uses the 
minstrel-ht rate control algorithm. iwlwifi is also a mac80211 driver 
but it uses a different rate control algorithm so if you have that lying 
around it would be interesting to test that (if you did not already do 
that).


Regards,
Arend
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-26 Thread Jouni Malinen
On Wed, Feb 25, 2015 at 10:14:45AM -0800, Linus Torvalds wrote:
 While I realize that people may disagree about the exact details of
 how to fix this in the long run, may I suggest that in the meantime we
 at least get the two workaround patches applied?

 Does anybody hate Jouni's two patches *so* much that they can
 articulate *why* it would be wrong to apply them as interim patches?
 And if so, do you have better patches for me to try? Because if not..

Of all people, I do actually have some hatred on the one-liner to force
minimum rate for all EAPOL TX attempts. That is punishing the vast
majority of cases where the AP is perfectly fine with higher MCS rates
being used (and MCS 0 being sufficient fallback option) for EAPOL. Being
able to use higher TX rates as the initial attempt is a nice feature and
even though this may be limited to number of upstream Linux drivers
today, that part of the feature is an improvement, IMHO. This can even
be more robust in some environments especially when going through long
EAP exchange with certain types of interference.

That said, it is clear to me that we need to modify the current behavior
to be more conservative with some of the EAPOL frame retries. I'm not
familiar enough with the current minstrel implementation to prepare a
clean patch doing the approach I think that should be used here (replace
the last rate with a low basic rate and add couple of tries with it). I
have reason to believe that Felix might be able to look at this in
couple of weeks, though.

In other words, I think I'll send out a more formal version of the
patches so that they can be applied now if desired while keeping in mind
that we may want to replace the minrate-for-EAPOL with something else
later. As far as the EAPOL-Key msg 4/4 retries without encryption patch
is concerned, I think I have enough concern on the couple of corner
cases that I know it does not address (and may break), so it may take
some more time before I'm ready to suggest it (or a bit modified version
of it) to be applied, though.

-- 
Jouni MalinenPGP id EFC895FA
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Jouni Malinen
On Wed, Feb 25, 2015 at 06:00:08PM +1300, Felix Fietkau wrote:
 Minstrel_ht does *NOT* use mrr[3], nor should it. For normal data
 packets, a little packet loss under tough conditions is good, otherwise
 we risk lots of wasted airtime and bufferbloat.

I agree for normal data packets, but EAPOL frames are not really normal
data packet even though they happen to be transmitted as Data frames.
EAPOL frames are rare enough to not wast much airtime (and recovery from
an issue would anyway use way more airtime) and bufferbloat is
irrelevant for EAPOL frames. For EAPOL frames, little packet loss is not
good. Especially for EAPOL-Key msg 4/4, the only recovery option in many
cases is to reassociate with the AP and start from scratch.

  That mrr[3]:= basic_rate is the part I was really asking for as far as
  EAPOL frames are concerned.
 I don't think we need that. If we just exclude EAPOL from both probing
 and aggregation, it should be safe. While it's connecting, that leaves
 in low rates in the retry chain anyway.

Not low enough IMHO. EAPOL is a special case and needs to be addressed
as such. It is special for at least two reasons: being very early in the
association (well, the very _first_ Data frames using rate control) and
being critical for maintaining the connection (AP will disconnect if it
does not reply response).

What happens now is way too optimistic:
- one try at MCS 2 followed by four tries at MCS 0 for EAPOL-Key msg 2/4
- one try at MCS 9 (or so) followed by four tries at MCS 0 for EAPOL-Key
  msg 4/4 (this being the most critical frame in the connection sequence
  due to not having a good recovery mechanism)

Dropping probing from these would allow one more attempt at the first
rate and I guess it would also drop the first rate to somewhat lower.

I'm fine with using these MCS rates as the first option, but I do think
that we have to add one more rate to the end (or change the 3rd rate if
that is easier for implementation) to be non-MCS and I think one of the
basic rates (say, 6 Mbps on 5 GHz and maybe 2 or 5.5 Mbps on 2.4 GHz)
with number of tries (say, 4).

There have been way too many cases reported where strange issues with
4-way exchange (those EAPOL-Key frames) result in connection failing.
While not all of these can be explained with the TX rate, I'm pretty
sure large portion of these issues are indeed caused by too optimistic
TX rate selection. Such policies may be acceptable for other Data
frames, but not for EAPOL.

 If it still fails often enough to be noticeable under normal conditions,
 there must be something seriously wrong outside of rate control, and we
 should not paper over it with a crude band-aid workaround.

There may be something else wrong (say, some kind of interference), but
there is no way we can assume normal users to be able to fix such
issues. If we make EAPOL frames go through more robustly, the connection
can be established in more cases and this can result in relatively
functional network connection and rate control can handle the less
critical data frames through whatever means to get optimal throughput
from the network. As such, I do think we do need to paper over this
for EAPOL frames.

-- 
Jouni MalinenPGP id EFC895FA
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Jouni Malinen
On Tue, Feb 24, 2015 at 11:38:02PM +0100, Thomas Hühn wrote:
 Minstrel_HT does only set mrr[0..2] and does not touch the fourth mrr[3], 
 assumed chips with 4 mrr stages.
 In function minstrel_ht_update_rates() the rate table struct 
 ieee80211_sta_rates is set for the first 3 out of 4 rates and the fouth rate 
 (mrr[3]) is „-1“.

Agreed.

 In case of ath9k, the driver in xmit.c allocates in function 
 ath_tx_fill_desc() a  struct ath_tx_info info by  memset(info, 0, 
 sizeof(info)) .. so all info-rates[xy].Rate == 0.
 Than function ath_buf_set_rate() sets all info-rates[xy].Rate to those 
 values specified in the rate table (struct ieee8021_sta_rate) IF 
 (!rates[i].count || (rates[i].idx  0)) …
 … so info-rates[3].Rate is untouched and still  „0“.

Sure, it can be 0 and so is the number of tries at that rate..

 I just added a printk() to xmit.c in function ath_tx_fill_desc() just before 
 ath9k_hw_set_txdesc() is called.
 From the log I get, e.g.:
 
 [  169.543554] mrr setup: mrr[0]:133 mrr[1]:132 mrr[2]:134 mrr[3]:0 
 
 I have not check yet if info-rates[3].Rate == 0 is the lowest possible rate… 
 but I would guess so.

It is not and even if it were, it does not matter since this 4th item is
used for _0_ tries. I've verified the exact behavior with a sniffer for
a case where the target device does not ACK frames. ath9k ends up
sending at exactly the three different rates indicated in the first
three values and nothing else. With RC probing (which happens to occur
for the initial EAPOL frames, this results in only one attempt at
MCS(0) and two + two attempts at MCS0. No non-MCS rates are tried. As
pointed out previously, this is likely fine for normal data frames, but
not for EAPOL.

-- 
Jouni MalinenPGP id EFC895FA
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Linus Torvalds
On Wed, Feb 25, 2015 at 6:47 AM, Jouni Malinen j...@w1.fi wrote:

 There may be something else wrong (say, some kind of interference), but
 there is no way we can assume normal users to be able to fix such
 issues. If we make EAPOL frames go through more robustly, the connection
 can be established in more cases and this can result in relatively
 functional network connection and rate control can handle the less
 critical data frames through whatever means to get optimal throughput
 from the network. As such, I do think we do need to paper over this
 for EAPOL frames.

While I realize that people may disagree about the exact details of
how to fix this in the long run, may I suggest that in the meantime we
at least get the two workaround patches applied?

I'm talking about the two from Jouni - the don't encrypt EAPOL
frames one, and the one-liner that makes all EAPOL frames go at the
lowest data rate.

Even if lowest data rate is ridiculously low, and even if that might
disturb other things going on on the same channel at the same time,
those authentication packets shouldn't be so common as to be a
problem.  No?

Jouni has a few packet dumps for me, and he's stumped as to what
exactly is going on, but those two patches (well, the one-liner low
data rate EAPOL in particular, it seems) do seem to make my
connections go through reliably.

And it seems that other drivers already are working around the EAPOL
issue in similar ways, judging by the comments about iwlwifi.

Last time I had connection issues with this laptop, nothing ended up
happening in the end, and I had people pipe up saying they had had
similar problems. I'd hate for the same nothing to happen this time
just because people aren't 100% sure what the final right thing is to
do. So I'd really like people to apply the simple workarounds for now
because clearly something is badly wrong, and *if* there is some
better resolution later, that's fine.

I'll happily test patches. It seems to be pretty repeatable for me,
even if that pretty repeatable seems to be very much about the
laptop being in one very particular place (it's right next to another
AP, there's random other electronics around, since it's on my messy
desk etc). So I wouldn't be at all surprised by horribly interference.
And the AP is supposed to be ceiling- or wall-mounted, but because I'm
just testing things out it's just sitting on a table in the next room,
so for all I know it's in the *exact* worst position for the antennas
etc etc.

So I'm sure I can improve reception of my laptop, but that's not the
point. The point is that bad wireless networks aren't so unusual, and
right now things clearly don't work as well as they could.

Does anybody hate Jouni's two patches *so* much that they can
articulate *why* it would be wrong to apply them as interim patches?
And if so, do you have better patches for me to try? Because if not..

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Peter Stuge
Linus Torvalds wrote:
 Last time I had connection issues with this laptop, nothing ended up
 happening in the end, and I had people pipe up saying they had had
 similar problems. I'd hate for the same nothing to happen this time
 just because people aren't 100% sure what the final right thing is to
 do. So I'd really like people to apply the simple workarounds for now
 because clearly something is badly wrong, and *if* there is some
 better resolution later, that's fine.
..
 So I'm sure I can improve reception of my laptop, but that's not the
 point. The point is that bad wireless networks aren't so unusual, and
 right now things clearly don't work as well as they could.

Those two patches (the one-liner in particular) should have gone
in already 5-10 years ago. This has been an embarrassing problem
for many years.

I'm glad that people jumped this time around.


//Peter
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Adrian Chadd
On 25 February 2015 at 10:14, Linus Torvalds
torva...@linux-foundation.org wrote:


 While I realize that people may disagree about the exact details of
 how to fix this in the long run, may I suggest that in the meantime we
 at least get the two workaround patches applied?

 I'm talking about the two from Jouni - the don't encrypt EAPOL
 frames one, and the one-liner that makes all EAPOL frames go at the
 lowest data rate.

 Even if lowest data rate is ridiculously low, and even if that might
 disturb other things going on on the same channel at the same time,
 those authentication packets shouldn't be so common as to be a
 problem.  No?

 Jouni has a few packet dumps for me, and he's stumped as to what
 exactly is going on, but those two patches (well, the one-liner low
 data rate EAPOL in particular, it seems) do seem to make my
 connections go through reliably.

 And it seems that other drivers already are working around the EAPOL
 issue in similar ways, judging by the comments about iwlwifi.

[snip]

 So I'm sure I can improve reception of my laptop, but that's not the
 point. The point is that bad wireless networks aren't so unusual, and
 right now things clearly don't work as well as they could.

 Does anybody hate Jouni's two patches *so* much that they can
 articulate *why* it would be wrong to apply them as interim patches?
 And if so, do you have better patches for me to try? Because if not..

I agree with you. I think you should just have EAPOL frames go out at
the lowest rate for now and then worry about experimenting with more
interesting ways to make EAPOL / DHCP frames cheaper. It fixes a lot
of problems in noisy areas. That hack was hiding around in various
commercial drivers I've seen, and it's been in FreeBSD for a while.

Same with DHCP traffic too - it's the second set of data frames that
the rate control code sees, and it's the primary reason I dropped the
initial sample rate down in FreeBSD so the DHCP exchange would have a
better chance of succeeding after association.



-adrian
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Andrew McGregor
On Thu, Feb 26, 2015 at 7:22 AM, Adrian Chadd adr...@freebsd.org wrote:
 On 25 February 2015 at 10:14, Linus Torvalds
 torva...@linux-foundation.org wrote:


 While I realize that people may disagree about the exact details of
 how to fix this in the long run, may I suggest that in the meantime we
 at least get the two workaround patches applied?

 I'm talking about the two from Jouni - the don't encrypt EAPOL
 frames one, and the one-liner that makes all EAPOL frames go at the
 lowest data rate.

 Even if lowest data rate is ridiculously low, and even if that might
 disturb other things going on on the same channel at the same time,
 those authentication packets shouldn't be so common as to be a
 problem.  No?

 Jouni has a few packet dumps for me, and he's stumped as to what
 exactly is going on, but those two patches (well, the one-liner low
 data rate EAPOL in particular, it seems) do seem to make my
 connections go through reliably.

 And it seems that other drivers already are working around the EAPOL
 issue in similar ways, judging by the comments about iwlwifi.

 [snip]

 So I'm sure I can improve reception of my laptop, but that's not the
 point. The point is that bad wireless networks aren't so unusual, and
 right now things clearly don't work as well as they could.

 Does anybody hate Jouni's two patches *so* much that they can
 articulate *why* it would be wrong to apply them as interim patches?
 And if so, do you have better patches for me to try? Because if not..

 I agree with you. I think you should just have EAPOL frames go out at
 the lowest rate for now and then worry about experimenting with more
 interesting ways to make EAPOL / DHCP frames cheaper. It fixes a lot
 of problems in noisy areas. That hack was hiding around in various
 commercial drivers I've seen, and it's been in FreeBSD for a while.

 Same with DHCP traffic too - it's the second set of data frames that
 the rate control code sees, and it's the primary reason I dropped the
 initial sample rate down in FreeBSD so the DHCP exchange would have a
 better chance of succeeding after association.



 -adrian

+1

Really, who cares about efficiency here; these are rare control packets.
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-25 Thread Linus Torvalds
On Wed, Feb 25, 2015 at 10:14 AM, Linus Torvalds
torva...@linux-foundation.org wrote:

 I'm talking about the two from Jouni - the don't encrypt EAPOL
 frames one, and the one-liner that makes all EAPOL frames go at the
 lowest data rate.

So I just found out and confirmed that this is not Atheros-specific in
any way - it looks like it's simply the UniFi AP that does not like
high-data-rate authentification frames at all.

Because it looks like the brcmsmac driver has *exactly* the same
behavior with this AP (in an Apple Macbook air). I assume brcmsmac
uses the net/80211/tx.c logic too.

And Jouni's one-liner fixes that one too, although as usual, maybe
there is some testing noise, and I screwed something up. This time I
only did the one-liner, so that's the critical one.

It's interesting to note how nothing else has been unhappy with that
network (admittedly it's been mainly android devices and a HP printer
that I've tested), so it looks like everybody else does low-rate
authentication packets anyway.

So this actually looks like a Ubiquiti UniFi AP bug to me, but it also
looks like presumably everybody else does low-rate initial packets,
and our kernel 802.11 layer should just follow suit. The whole
robustness principle and be conservative in what you send, and
liberal in what you accept etc.

Linus
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-24 Thread Dave Taht
On Tue, Feb 24, 2015 at 2:26 AM, Jouni Malinen j...@w1.fi wrote:
 On Tue, Feb 24, 2015 at 01:29:27PM +1100, Andrew McGregor wrote:
 Over the weekend I found a bug in minstrel-ht that might well be
 implicated here.

 The last retransmit rate is meant to be a 'get the packet there
 reliably' rate; minstrel-ht doesn't do that right, and can pick a
 fairly flaky rate instead.

 Can't generate a proper patch right now, so this diff might not apply
 cleanly, but the fix is simply to change 75 to 99 in the two places
 below:

 While this may indeed be helpful, I don't think it is sufficient for
 this EAPOL frame related issue. What I would like to see is minstrel_ht
 using a basic rate (something non-HT) at the end of the retry series for
 EAPOL frames.

 The current behavior looks very suspicious to me. The early EAPOL frames
 after association are being used to probe for higher rates. This results
 in the total number of retry attempts actually getting smaller than any
 other frame, i.e., minstrel_ht seems to be using significantly _less_
 robust choices for the EAPOL frames than following normal data frames!
 This should really be the other way around..

 As an example, I'm seeing this on 5 GHz band (with the 75 to 99 change
 in place, but behavior was more or less identical without it):
 - the first EAPOL frame (msg 2/4) getting one attempt at MCS 3, 2
   attempts at MCS 0, 2 attempts at MCS 0 (yes, identical to the previous
   one) with total maximum of 5 attempts
 - the second EAPOL frame (msg 4/4) getting one attempt at MCS 9, 2
   attempts at MCS 0, 2 attempts at MCS 0 with total maximum of 5
   attempts
 - another data frame after this: 5 attempts at MCS 9, 5 attempts at MCS
   3, 5 attempts at MCS 3 with total maximum of 15 attempts(!!)

I would in general prefer that the excessive retries in the present
driver layers in wifi be
dramatically reduced, the packet dropped and the problem punted to
higher layers.

 This cannot be the best approach here..

Falling back faster to the lowest possible rate with minimum retries,
and then giving up sooner would be better. 15 attempts? jeeze

 For the
 IEEE80211_TX_CTRL_PORT_CTRL_PROTO cases, there are identified issues
 where failing to deliver the frame results is significant issues either
 in getting connected in the first place or getting disconnected if
 rekeying fails.

 I'm not sure how this would be implemented cleanly in minstrel_ht or
 whether that is even the best place (i.e., rate.c could do this
 instead), but I'd like that third attempt for control port cases to be
 dropped to use a (lowish) basic rate and non-MCS at that since there may
 be some interop issues with HT MCS early during association.
 Alternatively with drivers like ath9k that support 4 rate values, it
 would also be fine to add this basic rate attempt (or well, I'd have
 multiple, say 4, such attempts) as an additional 4th entry which does
 not currently seem to get used with minstrel at all.

 The (lowish) basic rate here could be defined as 6 Mbps OFDM for 5 GHz
 band and either that or maybe even 2 Mbps or 5.5 Mbps on 2.4 GHz (if
 included by the AP in basic rate set).

 --
 Jouni MalinenPGP id EFC895FA
 ___
 ath9k-devel mailing list
 ath9k-de...@lists.ath9k.org
 https://lists.ath9k.org/mailman/listinfo/ath9k-devel



-- 
Dave Täht
Let's make wifi fast, less jittery and reliable again!

https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-24 Thread Thomas Hühn
Hi Jouni,

Currently Minstrel_HT just skips EAPOL packets for its rate sampling on non-mrr 
chips by testing: (info-control.flags  IEEE80211_TX_CTRL_PORT_CTRL_PROTO)
On mrr hardware it uses them for probing. 
But the general MRR-chain should look like this for ath5k and ath9k chips that 
support 4 mrr chains:

mrr[0]:= max_tp_rate[0]
mrr[1]:= max_tp_rate[1]
mrr[2]:= max_prob_rate
mrr[3]:= basic_rate

So for Minstrel Sampling Packets as well as for data packets, the 4th mrr stage 
should use the slowest rate in case all other 3 mrr stages failed with their 
retry attempts.

I do see two possible options for control frames like EAPOL to be send out in a 
more robust fashion:
 - exclude those frames from AMPDU aggragates 
 - change their mrr setup to be more conservative

What do you think about this ?

Greetings Thomas


 Am 24.02.2015 um 11:26 schrieb Jouni Malinen j...@w1.fi:
 
 On Tue, Feb 24, 2015 at 01:29:27PM +1100, Andrew McGregor wrote:
 Over the weekend I found a bug in minstrel-ht that might well be
 implicated here.
 
 The last retransmit rate is meant to be a 'get the packet there
 reliably' rate; minstrel-ht doesn't do that right, and can pick a
 fairly flaky rate instead.
 
 Can't generate a proper patch right now, so this diff might not apply
 cleanly, but the fix is simply to change 75 to 99 in the two places
 below:
 
 While this may indeed be helpful, I don't think it is sufficient for
 this EAPOL frame related issue. What I would like to see is minstrel_ht
 using a basic rate (something non-HT) at the end of the retry series for
 EAPOL frames.
 
 The current behavior looks very suspicious to me. The early EAPOL frames
 after association are being used to probe for higher rates. This results
 in the total number of retry attempts actually getting smaller than any
 other frame, i.e., minstrel_ht seems to be using significantly _less_
 robust choices for the EAPOL frames than following normal data frames!
 This should really be the other way around..
 
 As an example, I'm seeing this on 5 GHz band (with the 75 to 99 change
 in place, but behavior was more or less identical without it):
 - the first EAPOL frame (msg 2/4) getting one attempt at MCS 3, 2
  attempts at MCS 0, 2 attempts at MCS 0 (yes, identical to the previous
  one) with total maximum of 5 attempts
 - the second EAPOL frame (msg 4/4) getting one attempt at MCS 9, 2
  attempts at MCS 0, 2 attempts at MCS 0 with total maximum of 5
  attempts
 - another data frame after this: 5 attempts at MCS 9, 5 attempts at MCS
  3, 5 attempts at MCS 3 with total maximum of 15 attempts(!!)
 
 This cannot be the best approach here.. For the
 IEEE80211_TX_CTRL_PORT_CTRL_PROTO cases, there are identified issues
 where failing to deliver the frame results is significant issues either
 in getting connected in the first place or getting disconnected if
 rekeying fails. 
 
 I'm not sure how this would be implemented cleanly in minstrel_ht or
 whether that is even the best place (i.e., rate.c could do this
 instead), but I'd like that third attempt for control port cases to be
 dropped to use a (lowish) basic rate and non-MCS at that since there may
 be some interop issues with HT MCS early during association.
 Alternatively with drivers like ath9k that support 4 rate values, it
 would also be fine to add this basic rate attempt (or well, I'd have
 multiple, say 4, such attempts) as an additional 4th entry which does
 not currently seem to get used with minstrel at all.
 
 The (lowish) basic rate here could be defined as 6 Mbps OFDM for 5 GHz
 band and either that or maybe even 2 Mbps or 5.5 Mbps on 2.4 GHz (if
 included by the AP in basic rate set).
 
 -- 
 Jouni MalinenPGP id EFC895FA
 ___
 ath9k-devel mailing list
 ath9k-de...@lists.ath9k.org
 https://lists.ath9k.org/mailman/listinfo/ath9k-devel

--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-24 Thread Jouni Malinen
On Tue, Feb 24, 2015 at 06:54:47PM +0100, Thomas Hühn wrote:
 Currently Minstrel_HT just skips EAPOL packets for its rate sampling on 
 non-mrr chips by testing: (info-control.flags  
 IEEE80211_TX_CTRL_PORT_CTRL_PROTO)

Yeah, I noticed that when going through the implementation, but it was
indeed only for cases other than ath9k-like drivers.

 On mrr hardware it uses them for probing. 
 But the general MRR-chain should look like this for ath5k and ath9k chips 
 that support 4 mrr chains:
 
 mrr[0]:= max_tp_rate[0]
 mrr[1]:= max_tp_rate[1]
 mrr[2]:= max_prob_rate
 mrr[3]:= basic_rate

Where is that mrr[3] part implemented? I did not find it when reviewing
the design (hw-max_rates = 3 is used, but not = 4) and this does not
match my experiments either when printing out all four values from
ath9k. In every single case I observed, the last entry was unused (idx =
-1) and only MCS values were used (i.e., not even a single case of basic
rate visible; basic rates being 6, 12, 24 Mbps OFDM in this specific
case with the AP that I used in the tests).

 So for Minstrel Sampling Packets as well as for data packets, the 4th mrr 
 stage should use the slowest rate in case all other 3 mrr stages failed with 
 their retry attempts.
 
 I do see two possible options for control frames like EAPOL to be send out in 
 a more robust fashion:
  - exclude those frames from AMPDU aggragates 

ath9k does that for IEEE80211_TX_CTL_RATE_CTRL_PROBE which seemed to
get set for the initial EAPOL frames. I guess this could be done more
generically for all EAPOL frames.

  - change their mrr setup to be more conservative

That mrr[3]:= basic_rate is the part I was really asking for as far as
EAPOL frames are concerned.

-- 
Jouni MalinenPGP id EFC895FA
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-24 Thread Adrian Chadd
Hi,

I thought about doing this for rate probing with FreeBSD's sample rate
algorithm, but after actually having to use the damned thing in noisy
environments I realised that it just wasn't worth the effort to
optimise rate control selection whilst doing EAPOL frames.

If we did more useful software retransmission at lower rates then
sure, one could just be much more conservative about the rates and
retry a few times at a lower rate. However, since that isn't the case
and we only get three or so attempts at EAPOL exchange - once a second
- before an AP decides we're no good, I figured I'd just skip anything
other than the lowest basic rates for the frame exchange and make it
associate quicker.

I had the same problem with DHCP - initial DHCP leases were failing
because the rate control code started high/medium and didn't degrade
quickly enough before the next round of DHCP attempts. DHCP backs off
even more aggressively, so failing that initial DHCP was quite
expensive.



-adrian
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-24 Thread Thomas Hühn
Hi Jouni,

 Where is that mrr[3] part implemented? I did not find it when reviewing
 the design (hw-max_rates = 3 is used, but not = 4) and this does not
 match my experiments either when printing out all four values from
 ath9k. In every single case I observed, the last entry was unused (idx =
 -1) and only MCS values were used (i.e., not even a single case of basic
 rate visible; basic rates being 6, 12, 24 Mbps OFDM in this specific
 case with the AP that I used in the tests).

Minstrel_HT does only set mrr[0..2] and does not touch the fourth mrr[3], 
assumed chips with 4 mrr stages.
In function minstrel_ht_update_rates() the rate table struct 
ieee80211_sta_rates is set for the first 3 out of 4 rates and the fouth rate 
(mrr[3]) is „-1“.
In case of ath9k, the driver in xmit.c allocates in function ath_tx_fill_desc() 
a  struct ath_tx_info info by  memset(info, 0, sizeof(info)) .. so all 
info-rates[xy].Rate == 0.
Than function ath_buf_set_rate() sets all info-rates[xy].Rate to those values 
specified in the rate table (struct ieee8021_sta_rate) IF (!rates[i].count || 
(rates[i].idx  0)) …
… so info-rates[3].Rate is untouched and still  „0“.

I just added a printk() to xmit.c in function ath_tx_fill_desc() just before 
ath9k_hw_set_txdesc() is called.
From the log I get, e.g.:

[  169.543554] mrr setup: mrr[0]:133 mrr[1]:132 mrr[2]:134 mrr[3]:0 

I have not check yet if info-rates[3].Rate == 0 is the lowest possible rate… 
but I would guess so.


Greetings Thomas

--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-24 Thread Felix Fietkau
On 2015-02-25 07:14, Jouni Malinen wrote:
 On Tue, Feb 24, 2015 at 06:54:47PM +0100, Thomas Hühn wrote:
 Currently Minstrel_HT just skips EAPOL packets for its rate sampling on 
 non-mrr chips by testing: (info-control.flags  
 IEEE80211_TX_CTRL_PORT_CTRL_PROTO)
 
 Yeah, I noticed that when going through the implementation, but it was
 indeed only for cases other than ath9k-like drivers.
 
 On mrr hardware it uses them for probing. 
 But the general MRR-chain should look like this for ath5k and ath9k chips 
 that support 4 mrr chains:
 
 mrr[0]:= max_tp_rate[0]
 mrr[1]:= max_tp_rate[1]
 mrr[2]:= max_prob_rate
 mrr[3]:= basic_rate
 
 Where is that mrr[3] part implemented? I did not find it when reviewing
 the design (hw-max_rates = 3 is used, but not = 4) and this does not
 match my experiments either when printing out all four values from
 ath9k. In every single case I observed, the last entry was unused (idx =
 -1) and only MCS values were used (i.e., not even a single case of basic
 rate visible; basic rates being 6, 12, 24 Mbps OFDM in this specific
 case with the AP that I used in the tests).
Minstrel_ht does *NOT* use mrr[3], nor should it. For normal data
packets, a little packet loss under tough conditions is good, otherwise
we risk lots of wasted airtime and bufferbloat.

 So for Minstrel Sampling Packets as well as for data packets, the 4th mrr 
 stage should use the slowest rate in case all other 3 mrr stages failed with 
 their retry attempts.
 
 I do see two possible options for control frames like EAPOL to be send out 
 in a more robust fashion:
  - exclude those frames from AMPDU aggragates 
 
 ath9k does that for IEEE80211_TX_CTL_RATE_CTRL_PROBE which seemed to
 get set for the initial EAPOL frames. I guess this could be done more
 generically for all EAPOL frames.
I agree.

  - change their mrr setup to be more conservative
 
 That mrr[3]:= basic_rate is the part I was really asking for as far as
 EAPOL frames are concerned.
I don't think we need that. If we just exclude EAPOL from both probing
and aggregation, it should be safe. While it's connecting, that leaves
in low rates in the retry chain anyway.

If it still fails often enough to be noticeable under normal conditions,
there must be something seriously wrong outside of rate control, and we
should not paper over it with a crude band-aid workaround.

- Felix
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-22 Thread Dave Taht
On Sun, Feb 22, 2015 at 10:30 AM, Linus Torvalds
torva...@linux-foundation.org wrote:
 On Sun, Feb 22, 2015 at 10:24 AM, Adrian Chadd adr...@freebsd.org wrote:

 Just a wild shot - try disabling fast authentication and see if that
 makes a difference?

 wpa_supplicant.conf:

 fast_reauth=0

 I recall having issues with fast_reauth once, but I never stuck around
 that location long enough to debug it.

 Nope. Did that, killed wpa_supplicant (which restarts it), tried
 connecting, still failed.

Hint: Several unifi (and most ubnt) products are well supported by
openwrt directly, which by reflashing your device to it via their web
interface would

A) probably fix the problem, and
B) give you more insight into fixing it, if it persists, by giving you
full access on both sides of the connection.

https://downloads.openwrt.org/snapshots/trunk/ar71xx/generic/

I have been replacing ubnt´s default firmware on first boot for 6+
years now. It is good hardware, after you do that.

-- 
Dave Täht

http://www.bufferbloat.net/projects/cerowrt/wiki
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-22 Thread Kyle Bassett
On Sun, Feb 22, 2015 at 4:54 PM, Kyle Bassett kylebass...@gmail.com wrote:
 On Sun, Feb 22, 2015 at 4:45 PM, Linus Torvalds
 torva...@linux-foundation.org wrote:

 On Sun, Feb 22, 2015 at 10:58 AM, Dave Taht dave.t...@gmail.com wrote:
 
  Hint: Several unifi (and most ubnt) products are well supported by
  openwrt directly,

 I want Linux to just work. None of this oh, you can change
 something else and it probably works.  I want to fix the problem in
 *linux*.


 There's clearly something wrong with the AR9462 driver and/or how it
 uses the wireless infrastructure, and it should be fixed. Not worked
 around with use some other AP software.




A side note on this, what is the current method for unit testing WAP's and
client devices?  I'm sure every vendor has their own flavor, although I feel
some clearly defined FOSS method should exist as well.

I bring this up is because I have personally experienced bugs inside the
UniFi offerings, and on miscellaneous Linux/non-linux client devices as
well.
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-22 Thread Linus Torvalds
On Sun, Feb 22, 2015 at 10:58 AM, Dave Taht dave.t...@gmail.com wrote:

 Hint: Several unifi (and most ubnt) products are well supported by
 openwrt directly,

I want Linux to just work. None of this oh, you can change
something else and it probably works.  I want to fix the problem in
*linux*.

There's clearly something wrong with the AR9462 driver and/or how it
uses the wireless infrastructure, and it should be fixed. Not worked
around with use some other AP software.

Especially since this has happened before.

Besides, the reason I'm trying to use UniFi is because I want to have
seamless roaming (zero-handoff). And I do *not* want to play the
endless openwrt configuration games in the hopes I can get something
like that working. I've tried openwrt, and I don't like tinkering with
my AP's. I just want things to work out of the box.

  Linus
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-22 Thread Sujith Manoharan
Linus Torvalds wrote:
 14:07:23.542927: wlp1s0: Event DEAUTH (12) received
 14:07:23.542946: wlp1s0: Deauthentication notification
 14:07:23.542964: wlp1s0:  * reason 2
 14:07:23.542982: wlp1s0:  * address 20:9f:db:e7:80:80
 14:07:23.542997: Deauthentication frame IE(s) - hexdump(len=0): [NULL]
 14:07:23.543055: wlp1s0: CTRL-EVENT-DISCONNECTED bssid=20:9f:db:e7:80:80 
 reason=2

Sporadic association problems could be a problem with the chosen rates.
This would show the rates for the EAPOL frames:

iw dev wlp1s0 interface add mon0 type monitor
ifconfig mon0 up
tshark -i mon0 -Y eapol -T fields -e radiotap.datarate -e wlan -e eapol -e 
wlan.sa -e wlan.da
Run wpa_s.

Does setting a lower rate improve connectivity ?

'iw dev wlp1s0 set bitrates ht-mcs-2.4 1' would choose a lower
rate for the key-exchange frames.

Sujith
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ath9k-devel] AR9462 problems connecting again..

2015-02-22 Thread Sujith Manoharan
Sujith Manoharan wrote:
 'iw dev wlp1s0 set bitrates ht-mcs-2.4 1' would choose a lower
 rate for the key-exchange frames.

Or 'iw dev wlp1s0 set bitrates ht-mcs-2.4 0' to choose the lowest
HT rate.

Sujith
--
To unsubscribe from this list: send the line unsubscribe linux-wireless in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html