but with current mac80211 versions (current meaning the last 2-3 years)
they are just unstable and run out of memory after a while.
the only thing which helped was cutting the memory limit of fq_codel
inside mac80211
i also have another fancy test unit which is a linksys wrt400 with 32 mb
ram and 2 ath9k based wifi chipsets. no hope here for running stable
for even 5 minutes, even with a single connection under load (my crash
test is running an hdtv iptv stream converted to unicast using a
stateless eoip tunnel)

I try to encourage folk to run the rtt_fair tests in flent when
twiddling with wifi. Those really show how bad things are when you
don't have ATF + FQ + Per station aggregation and lots of
clients. Single threaded tests are misleading.
i know, but even single threaded tests aren't working well on such
devices. so there is no need to talk about the benefits of atf, fq_codel etc.
but there is a need to talk about making their use configurable, which also
allows disabling them if required.
I 110% agree that a system that can stay up for years is much better
than one that is fast for 5 minutes!

However I'd like a chance, in collaborating with you and your upcoming
patches, to try and narrow down crash bugs to various subsystems and be
able to get some benchmarks done that I simply couldn't do anymore at
the financial conclusion of the make-wifi-fast and cake projects.

I think I have a lot of gear that is dd-wrt compatible - apu2,
wndr3700s, 3800s....
if it's the v4, those have 128 mb (i have them too), and the apu2 has 2 gb. so it only gets really interesting if you choose a bad one with 32 mb ram, which are still commonly used by "freifunk"
The reduce truesize patch had helped a lot at the time (2012). There
were all kinds of flaky bugs that disappeared.
i tested and it helped to make ethernet unavailable. it worked for wifi interfaces, but eth0 and eth1 on my ipq8064 based test board did not work anymore. no dhcp lease, no ping. but i was able to capture inbound packets. (qos was not even enabled while testing, so no cake, fq_codel etc. just the standard sfq scheduler)
so i reverted and all worked again

The new drop monitor patchset looks WONDERFUL for seeing more about
packet drop behavior in the stack, but it's a 5.3(?) feature only.
i love backporting :-)

I note that I run 18.06.1 on my 32MB pico and nanostations on the
lupin campus, but I run no gui, few additional applications at all
(except babel, snmpd, netperf, and the other core needed daemons). My
uptimes are principally governed by power failures. I can't remember
the last "crash, crash" I had, and I do track memory leaks (none).
That said, I'm painfully aware that I should probably give dd-wrt and
openwrt 19.x some testing just to make sure there are no regressions,
but have been reluctant to get involved again without more partners in
crime, because the scars from deploying 18.x widely are only beginning
to heal... and only last week did the needed babel 1.9 upgrade arrive
so I can finally redeploy ipv6 universally. I fear my current
reliability metrics are so good because I took down ipv6 last year....
my workaround for memory problems is also to normally disable http. i have some of these nanostations in the field just running hostapd, snmp and syslog; anything else is disabled due to the oom problems. it never was a real crash, just oom.

but i never played with babel. ospf etc. all work out of the box, based on quagga on low end devices and frr on bigger ones.


Pico:

root@pool2:~# free
              total         used         free       shared      buffers
Mem:         28480        23796         4684           92         1868
-/+ buffers:              21928         6552
Swap:            0            0            0

root@pool2:~# uptime
  11:38:09 up 43 days, 21:37,  load average: 0.04, 0.03, 0.04

Same workload over here, on a wndr3800, almost exactly the same config

root@couch:~# free
              total       used       free     shared    buffers     cached
Mem:         60320      22872      37448         68       1960       6120
-/+ buffers/cache:      14792      45528
Swap:            0          0          0

NS2

root@TRO1:~# free

              total        used        free      shared buff/cache   available
Mem:          29124       19228        3552           0       6344        7752
Swap:             0           0           0

wndr3700v4

root@DD-WRT:~# free
              total        used        free      shared buff/cache   available
Mem:         125884       23048       92940           0       9896       99824
Swap:             0           0           0
root@DD-WRT:~#
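
To compare the four boxes at a glance, the Mem: rows above can be reduced
to a free-memory percentage. This is only a small sketch: the numbers are
copied from the outputs above, and the dictionary labels are names chosen
here for illustration.

```python
# Total and free memory in kB, copied from the `free` outputs above.
boxes = {
    "pico (pool2)": (28480, 4684),
    "wndr3800 (couch)": (60320, 37448),
    "ns2 (TRO1)": (29124, 3552),
    "wndr3700v4 (DD-WRT)": (125884, 92940),
}


def free_percent(total_kb, free_kb):
    """Return the 'free' column as a percentage of total memory."""
    return 100.0 * free_kb / total_kb


for name, (total_kb, free_kb) in boxes.items():
    print(f"{name:22s} {free_percent(total_kb, free_kb):5.1f}% free")
```

Note that the "free" column excludes buffers and cache, so it understates
what the kernel could reclaim under pressure.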



Disabling the fq part won't actually gain you much in terms of memory
usage, though, as most of it is packet memory which is already
configurable.
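
For reference, a sketch of how that packet memory limit might be lowered
at runtime. It assumes the kernel exposes mac80211's debugfs "aqm" file
and that the file accepts a line of the form "fq_memory_limit <bytes>";
the phy0 path and the 4 MiB value are illustrative assumptions, not
tested recommendations.

```python
import os


def set_fq_memory_limit(aqm_path, limit_bytes):
    """Write an fq_memory_limit command to a mac80211 debugfs aqm file.

    Returns the command string that was written.
    """
    if limit_bytes <= 0:
        raise ValueError("limit must be positive")
    command = f"fq_memory_limit {limit_bytes}\n"
    with open(aqm_path, "w") as f:
        f.write(command)
    return command


if __name__ == "__main__":
    # Hypothetical path: the phy name differs per device, and debugfs
    # must be mounted and enabled in the kernel for this file to exist.
    path = "/sys/kernel/debug/ieee80211/phy0/aqm"
    if os.path.exists(path):
        set_fq_memory_limit(path, 4 << 20)  # 4 MiB, an arbitrary example
```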

The one exception to this is the static overhead of 'struct fq_flow', of
which mac80211 currently allocates 4k. That's 300k of memory which is
currently not configurable. But that could be fixed :)
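
A back-of-the-envelope check of those two figures, assuming "4k" means
4096 flows and "300k" means 300 KiB (both readings are assumptions). The
per-flow size is derived from the figures here, not read from the kernel
source, and the 28480 kB total is the Pico's value from the free output
earlier in the thread.

```python
# Assumptions: "4k" flows = 4096 entries, "300k" = 300 KiB.
FLOWS = 4096
STATIC_OVERHEAD_BYTES = 300 * 1024

# Implied size of one flow entry, derived purely from the two figures.
bytes_per_flow = STATIC_OVERHEAD_BYTES / FLOWS

# Total memory reported by `free` on the 32 MB Pico, in kB.
PICO_TOTAL_KB = 28480
share_of_ram = (STATIC_OVERHEAD_BYTES / 1024) / PICO_TOTAL_KB

print(f"implied bytes per flow: {bytes_per_flow:.0f}")
print(f"share of Pico RAM: {share_of_ram:.2%}")
```

On these assumptions the static table works out to roughly one percent of
RAM on a 32 MB device.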

-Toke
--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

_______________________________________________
Cake mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cake
