Re: [Server-devel] Large groups of XO-1 do not work with access points

2014-02-07 Thread James Cameron
G'day Anna,

Yes, a manual "sudo iwlist eth0 scan" in Terminal, or an "iwlist eth0
scan" in console, until the AP appears, is all it takes to fix.  You
derived an independent workaround for the same problem!
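
The workaround above can be sketched as a retry loop.  This is only an
illustration, not a tested fix: the function name scan_until_seen, the
SCAN_CMD override, the retry limit, and the one second pause are all
assumptions, and it assumes iwlist prints each network name as
ESSID:"name".

```shell
#!/bin/bash
# Repeat an active scan until the wanted access point shows up.
# SCAN_CMD is a hypothetical override, useful for trying the loop
# without a radio; on an XO-1 the default from the thread applies.
SCAN_CMD=${SCAN_CMD:-"sudo iwlist eth0 scan"}

# scan_until_seen ESSID [MAX_TRIES]
scan_until_seen() {
    local essid=$1 max_tries=${2:-20} try
    for (( try = 1; try <= max_tries; try++ )); do
        # Assumes iwlist output contains a line like: ESSID:"name"
        if $SCAN_CMD 2>/dev/null | grep -qF "ESSID:\"$essid\""; then
            echo "seen after $try scan(s)"
            return 0
        fi
        sleep 1
    done
    echo "not seen after $max_tries scans" >&2
    return 1
}
```

For example, scan_until_seen "MySchoolAP" 30 would scan up to thirty
times; the network name here is made up.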

If you have some XO-1 that work and some that don't, then you might
have one broken antenna cable, which might cause this; the XO-1 uses
one antenna for transmit and the other for receive, some of the time.

(If the transmit antenna is broken, an active scan may not be heard by
an access point.  If the receive antenna is broken, the reply from an
access point might not be heard by the laptop.  Both can result in no
scan results containing the access point.  But regardless, my tests so
far have shown the problem happens with two known good antennas.)

I'd be interested in some test results though, using the script in my
previous mail, if you have time.  It would be good if we could have
wider data than just Terry and me.

The antenna test is also useful data.  With it, and practice, you can
identify laptops with single broken antennas.
http://wiki.laptop.org/go/Antenna_testing

-- 
James Cameron
http://quozl.linux.org.au/
___
Server-devel mailing list
Server-devel@lists.laptop.org
http://lists.laptop.org/listinfo/server-devel


Re: [Server-devel] Large groups of XO-1 do not work with access points

2014-02-07 Thread Anna
This is all very interesting, particularly when James Cameron stated,
"...all it takes is for two active scans to miss the access point."  All
the years I've been working with these things, I really had no idea.  And
did I inadvertently do the correct workaround?

I've got a couple of XO-1's that repeatedly don't automatically see a
couple of my AP's in Sugar's Network Neighborhood on boot (some of my XO-1
units "just work," btw).  In a console, I'll do `iwlist eth0 scan |grep ` multiple times until it shows up (or grep on ESSID for the list
of what all it sees).  Then switch back to Network Neighborhood, find the
AP's "circle" (which now shows up) and associate.

After that, networking is fine until reboot, but then I just repeat the
above procedure.

What I found peculiar was that the AP doesn't initially show up on those
XO-1's even when the XO-1 is on the table right next to the AP.  But, hey,
I figured out how to scan for it and then moved on.  I didn't know others
had this issue in other environments.

My home environment is relatively noisy.  I'm looking at an XO-1 now and it
can see 16 APs: four on channel 1, one on channel 4, one on channel 5,
three on channel 6, three on channel 8, one on channel 10, three on channel
11.  Only three of those are in my house - Tyler's AP on channel 1 (which
is WPA encrypted and I don't typically use), my "regular AP" on channel 11,
and the XSCE's AP on channel 6.
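
A per-channel census like the one above can be produced from the scan
output directly.  This is a minimal sketch, assuming the driver prints
a "Channel:N" line for each cell (some drivers only print the
Frequency line); the function name aps_per_channel is made up.

```shell
#!/bin/bash
# Count the access points seen on each channel.
# Reads "iwlist ... scan" output on stdin and prints "channel count"
# pairs, sorted by channel number.
aps_per_channel() {
    awk -F: '/^[[:space:]]*Channel:/ { seen[$2]++ }
             END { for (ch in seen) print ch, seen[ch] }' | sort -n
}

# On the laptop (not run here):
#   sudo iwlist eth0 scan | aps_per_channel
```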

Musing upon it now, I should probably switch the channels between my
"regular AP" on 11 and the XSCE's - the XSCE's channel 6 might be getting
crowded out by my neighbors on 4,5,6, and 8.

Anna


On Fri, Feb 7, 2014 at 7:21 PM, James Cameron wrote:

> On Sat, Feb 08, 2014 at 12:16:06PM +1100, James Cameron wrote:
> > 1.  sometimes, an active scan by the XO-1 does not have the access
> > point listed in the scan results, despite the XO-1 transmitting an
> > acknowledgement to the access point,
>
> This implies a problem in the firmware or the kernel.
>
> --
> James Cameron
> http://quozl.linux.org.au/
>


Re: [Server-devel] Large groups of XO-1 do not work with access points

2014-02-07 Thread James Cameron
There seems to be a lot of speculation, so I'll add more technical
details on what Terry and I have been investigating.

1.  sometimes, an active scan by the XO-1 does not have the access
point listed in the scan results, despite the XO-1 transmitting an
acknowledgement to the access point,

2.  an active scan by the XO-1 is done only twice during boot before
Sugar starts, and after that is repeated only every 30 seconds,

3.  if these two active scans do not contain the access point, Sugar
waits 10 seconds before it decides that we don't have any access point
available, and commits to using mesh.

Therefore all it takes is for two active scans to miss the access
point.  This can be easily reproduced with "sudo iwlist eth0 scan" and
looking at the "Last beacon" time for the access point.
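
As a rough worked example of why two misses matter: if each scan
misses the access point with some percentage chance, and the two boot
scans are assumed to be independent (an assumption, not something the
tests establish), the chance that both miss is the square of the
per-scan rate.  The function name both_miss_pct is made up.

```shell
#!/bin/bash
# both_miss_pct PER_SCAN_MISS_PERCENT
# Prints the chance, in percent, that two scans in a row both miss,
# ASSUMING the two scans fail independently of each other.
both_miss_pct() {
    awk -v p="$1" 'BEGIN { printf "%.2f\n", p * p / 100 }'
}

both_miss_pct 32     # prints 10.24
both_miss_pct 5.8    # prints 0.34
```

Under that assumption, a 32% per-scan miss rate would leave roughly
one boot in ten without the access point, while 5.8% would leave about
one in three hundred.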

--

The probability of failure in step 1 has considerable variance across
the test populations.  Here are some determinants:

a.  the probability varies by access point, even the same model access
point with the same firmware.  We see an extreme variation across
access points.  I see 5%, 23% and 32% with unused XO-1.  Terry sees
worse with his used XO-1 stock.

b.  the probability is higher if mesh is enabled in the firmware; my
32% fail rate drops down to 5.8% by turning off mesh using lbs_mesh,
and making no other changes.

c.  the probability is higher if many XO-1 are present and connected.

d.  the probability is higher if antennas or coax are broken (because
the two antennas are used at different times).

e.  the probability is much higher if there are other access points
present on the same channel at some distance.

f.  the probability is unchanged with or without encryption, with or
without power limits on the access point, and with or without 802.11n
enabled.

I'm interested to know if anybody has any ideas as to what else to
vary in the experiments.

The test method is to place "sudo iwlist eth0 scan" in a loop, with a
five second repeat cycle, and count the number of scans where a
previous scan result was used.  Here's an example test:

--

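#!/bin/bash
# Scan on every fifth second and log the "Last beacon" age of one
# access point.
MA=5C:63:BF:D8:F6:C0    # MAC address of the access point under test

while true; do
    # Wait until the clock reaches a multiple of five seconds.
    T0=$(date +%s)
    if (( T0 % 5 != 0 )); then
        sleep 0.1
        continue
    fi

    # Print the beacon age in ms for the matching cell, or "missed" if
    # the access point is absent from the scan results.
    R0=$(sudo iwlist eth0 scan 2>/dev/null | awk "
        BEGIN { x = 0; m = 1 }
        /$MA/ { x = 1; m = 0 }
        /Last beacon/ { gsub(\"ms\", \"\"); if (x) print \$4 }
        /IE: Unknown/ { x = 0 }
        END { if (m) print \"missed\" }")

    # Missed scans are shown but not written to scan.log.
    if [[ "$R0" == "missed" ]]; then
        echo missed
        sleep 3
        continue
    fi

    echo $T0 $R0
    echo $T0 $R0 >> scan.log
    sleep 3
done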

--

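To generate the percentage failure, counting a scan as failed when its
"Last beacon" age exceeds 1000 ms (a previous scan result was reused),
as a share of all logged scans:

awk '{ if ($2 > 1000) f++; else p++ }
     END { if (p + f) print p + 0, f + 0, f * 100 / (p + f) }' scan.log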

-- 
James Cameron
http://quozl.linux.org.au/


[Server-devel] Large groups of XO-1 do not work with access points

2014-02-06 Thread James Cameron
On
https://docs.google.com/document/d/1o6QtzLb6e58YKWqMf_junux2XyBRLFm31un8YLcYslg

Anna, Ben, Jon, Martin, Adam, Tim, TK, and Tom wrote:
> Very confounding Wifi: while connecting initially, large numbers of
> XO-1s fail to (re?)associate with Village Telco / Mesh Potato and
> other AP’s.  Even while larger numbers of XO-1s connect very
> successfully to TP-Link 3020 AP; and can see Android phone AP’s.
> 802.11s mesh networking cannot be the only root cause, but might
> turning off mesh and/or setting the channel to 9 mitigate the worst
> problems?

You would need to test at least ten TP-Link 3020 APs before you could
reliably claim that XO-1s connect very successfully, because of the
manufacturing variances in the APs.

The behaviour described sounds like a normal response to a dense
networking environment.  After my investigation yesterday, and
reproducing with a group of six XO-1, I can give you some advice.

When designing a classroom network, the XO-1 has a special feature
to consider.

The XO-1 will consume more of the available bandwidth than an XO-1.5
or later, because each XO-1 continually transmits mesh beacons even if
mesh is not in use.  Each XO-1 also responds to every scan (probe
request) by every other laptop.

As the number of laptops in a classroom grows, the available bandwidth
will be depleted much sooner with an XO-1 than with an XO-1.5 or later.

How to deal with it in the field?  Make a scoring system for each
channel, 1, 6, and 11.

Give each classroom AP on a channel a score of 12.

Scan, and give each AP outside the classroom on the same channel a
score of 12.

Give each XO-1 a score of 4.

Give each other device, such as XO-1.5, XO-1.75, or XO-4, desktop
computers with wireless, other laptops with wireless, tablet
computers, Android phones, and iOS phones, a score of 1.

Work to minimise the score.

Here are some examples:

1.  A group of 10 XO-1.5 and one AP will have a score of 22,

2.  A group of 30 XO-1.5 and one AP will have a score of 42,

3.  A group of 10 XO-1 and one AP will have a score of 52,

4.  A group of 30 XO-1 and one AP will have a score of 132,

5.  A group of 30 XO-1, two tablets, two mobile phones, two APs, two
APs next door, will have a score of 172.
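
The scoring rules above can be sketched as a small shell function, so
that a channel plan can be compared quickly in the field.  The
function name channel_score and its argument order are made up; the
weights (12, 12, 4, 1) are the ones given above.

```shell
#!/bin/bash
# channel_score CLASSROOM_APS OUTSIDE_APS XO1_COUNT OTHER_DEVICES
# Prints the score for one channel: 12 per AP (inside or outside the
# classroom), 4 per XO-1, 1 per other wireless device.
channel_score() {
    local aps=$1 outside_aps=$2 xo1=$3 other=$4
    echo $(( 12 * (aps + outside_aps) + 4 * xo1 + other ))
}

channel_score 1 0 0 10    # example 1: prints 22
channel_score 1 0 0 30    # example 2: prints 42
channel_score 1 0 10 0    # example 3: prints 52
channel_score 1 0 30 0    # example 4: prints 132
channel_score 2 2 30 4    # example 5: prints 172
```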

--

Technical details:

"Give each AP on a channel a score of 12."  An AP transmits
a beacon every tenth of a second (102.4ms), and responds to every
probe request sent.

"Give each XO-1 a score of 4."  An XO-1 transmits a beacon every four
tenths of a second (409.6ms), responds to every probe request, and
transmits probe requests every 30 seconds for scanning.

"Give each other device a score of 1."  Each other device will
transmit probe requests for scanning.
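
To put the beacon intervals above into numbers, here is a rough
sketch that estimates beacons per second on a channel.  It counts
beacons only and ignores probe requests and responses, so it
understates the real load; the function name beacons_per_second is
made up.

```shell
#!/bin/bash
# beacons_per_second AP_COUNT XO1_COUNT
# Estimated beacon frames per second, using the intervals from the
# technical details: 102.4 ms per AP and 409.6 ms per XO-1.
# Probe traffic is NOT included.
beacons_per_second() {
    local aps=$1 xo1=$2
    awk -v a="$aps" -v x="$xo1" \
        'BEGIN { printf "%.1f\n", a * 1000 / 102.4 + x * 1000 / 409.6 }'
}

beacons_per_second 1 0     # one AP alone: prints 9.8
beacons_per_second 1 30    # one AP and 30 XO-1: prints 83.0
```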

Verification:

Configure a nearby Linux system to act as a wireless monitor, for example:

ifconfig mlan0 down
iwconfig mlan0 mode monitor
ifconfig mlan0 up
tcpdump -i mlan0 -s 0 -w mlan0.tcpdump
(^C after a minute)
wireshark mlan0.tcpdump

Measure the transmission rates for beacons and probe requests.
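
If wireshark is not handy, the rates can also be estimated from the
capture file on the command line.  This is a sketch only: the function
names count_frames and rate_per_second are made up, it assumes the
capture was taken in monitor mode so that the pcap 802.11 filters
"type mgt subtype beacon" and "type mgt subtype probe-req" apply, and
it assumes the capture ran for about 60 seconds as above.

```shell
#!/bin/bash
# count_frames CAPTURE_FILE SUBTYPE
# Counts management frames of one subtype (for example beacon or
# probe-req) in a saved capture, one tcpdump output line per frame.
count_frames() {
    tcpdump -n -r "$1" "type mgt subtype $2" 2>/dev/null | wc -l
}

# rate_per_second COUNT SECONDS
rate_per_second() {
    awk -v c="$1" -v s="$2" 'BEGIN { printf "%.2f\n", c / s }'
}

# Usage (not run here), for a 60 second capture:
#   rate_per_second "$(count_frames mlan0.tcpdump beacon)" 60
#   rate_per_second "$(count_frames mlan0.tcpdump probe-req)" 60
```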

-- 
James Cameron
http://quozl.linux.org.au/