Joerg Reisenweber wrote:
> WLAN-hw automatically entering power-save-mode comes to mind. Leaving this
> fast power-save mode on detection of heavy traffic.

WLAN secretly conspiring with GSM ? ;-)

It could indeed be something like this, although my current suspect is
interrupt signaling between WLAN module and kernel. We already know that
the driver is picking up interrupts where there should be none, and that
it apparently sometimes misses interrupts as well.

I have some more data: in order to determine whether this affects the
transmit path, the receive path, or both, I wrote a basic "one-way ping"
(for simplicity, it uses UDP instead of ICMP):

http://svn.openmoko.org/developers/werner/owping/

It works as follows: the sending application puts a timestamp into each
packet it sends. The receiving application obtains the packet timestamp
attached by the networking stack (i.e., the time the packet was seen at
the interface) and its own current application-layer time.

The one-way times are the differences between these values and the
sender's timestamp, minus any difference between the sender's and the
receiver's clock.

First some baseline data on the path PC-(ether)-AP-(WLAN)-laptop.
Clocks were crudely synchronized by running ntpdate on both sides
before the test.

PC->laptop:

received 99 packets
itf: min/avg/max/sdev = -5.773/14.762/37.676/12.033 ms
app: min/avg/max/sdev = -5.748/14.787/37.701/12.033 ms

Laptop->PC:

received 99 packets
itf: min/avg/max/sdev = 5.274/10.152/22.472/3.638 ms
app: min/avg/max/sdev = 5.302/10.176/22.466/3.640 ms

Now let's look at the Neo. The path is PC-(ether)-AP-(WLAN)-Neo.

Neo->PC:

received 100 packets
itf: min/avg/max/sdev = -29.613/-24.699/14.443/5.119 ms
app: min/avg/max/sdev = -29.596/-24.677/14.465/5.119 ms

The standard deviation is excellent. The negative end-to-end time is
caused by the clocks not being perfectly synchronized. This doesn't
matter since we're only interested in delay variations here.

PC->Neo:

received 100 packets
itf: min/avg/max/sdev = 29.222/93.890/338.450/50.635 ms
app: min/avg/max/sdev = 30.345/94.968/339.494/50.633 ms

And here we meet the problem again.
A standard deviation of 50 ms is clearly excessive. While we don't know
the exact one-way time, we know from the ping experiments that it has an
upper bound of 80 ms, which means that the outliers are above 350 ms.

Since both application and interface time show the same pattern, this
indicates that the problem happens somewhere in the lower reaches of the
stack, i.e., in the WLAN module, the SDIO driver, the communication
between the two, the SDIO stack, the Atheros WLAN stack, or its interface
to the regular Linux networking stack.

- Werner
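
For readers who want to play with the arithmetic, here is a minimal
sketch of the one-way time calculation and the min/avg/max/sdev summary
described above. It is not the owping code: the function names, the
sample timestamps and the 35 ms clock offset are all made up for
illustration, and the use of the population standard deviation is an
assumption. It mainly shows why a constant clock offset shifts
min/avg/max but leaves the standard deviation alone.

```python
import statistics


def one_way_ms(sent, seen, clock_offset=0.0):
    """One-way time in milliseconds.

    sent:         sender's timestamp carried in the packet (seconds)
    seen:         receiver-side timestamp, interface or application (seconds)
    clock_offset: receiver clock minus sender clock (seconds); hypothetical
                  parameter, normally unknown and only crudely corrected
    """
    return (seen - sent - clock_offset) * 1000.0


def summarize(samples_ms):
    """Return (min, avg, max, sdev) of the samples.

    Assumption: population standard deviation; the original tool may
    compute sdev differently.
    """
    return (
        min(samples_ms),
        statistics.fmean(samples_ms),
        max(samples_ms),
        statistics.pstdev(samples_ms),
    )


# Made-up timestamps in seconds, four packets sent one second apart.
sent = [0.000, 1.000, 2.000, 3.000]
seen = [0.010, 1.012, 2.050, 3.011]

# Same measurements, once with perfectly synchronized clocks and once
# pretending the receiver clock runs 35 ms ahead of the sender's.
synced = [one_way_ms(s, r) for s, r in zip(sent, seen)]
skewed = [one_way_ms(s, r, clock_offset=0.035) for s, r in zip(sent, seen)]

stats_synced = summarize(synced)
stats_skewed = summarize(skewed)

print("synced: min/avg/max/sdev = %.3f/%.3f/%.3f/%.3f ms" % stats_synced)
print("skewed: min/avg/max/sdev = %.3f/%.3f/%.3f/%.3f ms" % stats_skewed)
```

The two output lines differ by the offset in min, avg and max, while
sdev is the same in both, which is why imperfect ntpdate synchronization
does not affect the delay variation figures.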
