Jivin Jamie Lokier lays it down ...
> David McCullough wrote:
> > Feel free to send in some patches :-)
>
> When they let me past the dark age of 2.4.26-uc0, maybe I will :-)
>
> I have a few ideas to combine the better fragmentation performance of
> page_alloc2.c with the speed of page_alloc.c (a hybrid of buddy and
> bitmap search), plus some fragmentation-reducing strategies using
> zones (nothing to do with uclinux) that were proposed for 2.6 kernels
> and did well in measurements.
>
> You know, when that copious free time rolls around :-)

I think everyone is waiting for that one :-)

> > Are you low on memory?  page_alloc2 gets pretty nasty about trying to
> > clear the caches etc as often as possible to keep as much contiguous
> > memory available at all times.
>
> Rapidly allocating and freeing memory: it's streaming video from disk
> at rates of 1-2MB/s, on a device with 32MB total for Linux.  Free
> memory oscillates, decreasing and then jumping up every 5 seconds (on
> the vendor-patched kernel).  "Straight" uclinux keeps the free memory
> up more consistently, but at the cost of very high kswapd CPU while
> streaming.
>
> > That said, I have seen systems where kswapd CPU usage is not a problem,
> > and obviously there are those where it is.  I don't know the cause.  2
> > possibilities:
> >
> > 1) I haven't actively used a 2.4 kernel on a non-MMU system for some
> >    time and the page_alloc2 code may just be wrong due to a kernel
> >    update and bit rot.
> >
> > 2) The usage on these systems is triggering the behaviour.
> >
> > If you boot the system in a configuration that doesn't use much RAM and
> > don't start any nasty big apps, is the system idle (ie kswapd is
> > behaving)?  If so, what triggers its rampage?
>
> I think it's the high rate of page allocation which triggers it.
>
> There shouldn't be a need to run kswapd constantly for file cache
> pages: it should be possible to reclaim cache pages rapidly during
> allocation, recycling them.  I think that's where page_alloc2.c goes
> wrong.  The heuristic interaction between page_alloc.c and kswapd is
> rather subtle and tricky, but the basic difference is that
> page_alloc.c doesn't maximise free memory all the time; instead, it
> keeps track of rapidly reclaimable memory.
>
> Apart from the CPU difference, that means page_alloc2.c tends to fail
> allocations if it really does run out of memory while kswapd is
> catching up asynchronously.  (And failed allocations result in execs
> crashing, ahem).
> It was crashes due to memory shortage that prompted
> me to investigate; the CPU differences were a surprise.
>
> A side effect of the high CPU of kswapd with page_alloc2.c in these
> situations is that allocation is noticeably slower.  I noticed, to my
> great surprise, that rsync was able to fetch files over the network
> and write them to disk twice as fast with page_alloc.c (4MB/s
> instead of 2MB/s).  For ages, I'd assumed it was the driver or hardware.
>
> To summarise, I found these differences:
>
>   page_alloc.c:
>
>     Pro: Lower CPU usage of kswapd, especially when streaming files.
>     Pro: Doesn't fail allocations when lots of data is in the filecache;
>          reclaims cache pages when needed.
>     Pro: Keeps file data cached, if the pages are not required
>          for something else.
>     Pro: Faster allocation, surprisingly faster sometimes.
>     Con: After long uptimes, with fork/execs causing large
>          contiguous allocations, eventually memory will be too
>          fragmented for fork/execs and the allocator is unable
>          to recover.  So after long uptimes, the system will
>          fail to allow telnet logins, for example, but will still
>          be functioning in other ways.
>
>   page_alloc2.c:
>
>     Con: Higher CPU usage of kswapd, especially when streaming files.
>     Con: Sometimes fails allocations when lots of data which could
>          be reclaimed is in the filecache.
>     Con: Evicts cached file data regularly.  Even tiny files which are
>          read very often from disk will do I/O periodically, instead
>          of always reading from cache.
>     Con: Slower allocation, surprisingly so sometimes.
>     Pro: After long uptimes, with fork/execs causing large contiguous
>          allocations, and simultaneous streaming file data, it
>          manages to keep different types of allocation separate
>          enough that fragmentation is not inevitable.  Indefinitely
>          long uptimes are realistically possible.
>
> In the end, we stuck with page_alloc2.c because of that last point.
> Our systems either crash and burn (with watchdog recovery), or telnet
> still works :)  But we like every performance characteristic of
> page_alloc.c more.
>
> The CPU usage of kswapd was a problem, and the crashing when too much
> file data was cached (due to fast streaming) was a big problem, so we
> tuned kswapd to a sweet spot for this application, and did everything
> possible with XIP-in-RAM to free up memory.  Currently we have 11MB
> free (out of 32MB) which seems to be enough.  It seems extravagant,
> but we found that with much less, the system crashes from time to time.

Great summary.  Basically it backs up every reason why page_alloc2 was
created.  We had routers running ipsec/pptp/whatever and 4MB of RAM.
Without page_alloc2 they pretty much failed to boot, let alone stay up
for months on end, thus page_alloc2.

I am with you though, it should be able to do what it does without the
CPU overhead.

Cheers,
Davidm

-- 
David McCullough, [EMAIL PROTECTED], Ph:+61 734352815
Secure Computing - SnapGear  http://www.uCdot.org http://www.cyberguard.com
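
For readers following the "buddy" versus "bitmap search" distinction mentioned in the thread, here is a minimal sketch of the two ideas. It is not code from page_alloc.c or page_alloc2.c: the function names buddy_order and bitmap_find_run, and the one-byte-per-page map, are hypothetical and chosen only for illustration. It assumes the usual textbook behaviour, where a buddy allocator rounds each request up to a power-of-two block and a bitmap allocator scans for a run of exactly the requested length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest order such that (1 << order) >= npages.
 * A buddy-style allocator hands out power-of-two blocks, so a request
 * is rounded up to the block size of this order. */
unsigned int buddy_order(size_t npages)
{
    unsigned int order = 0;

    assert(npages > 0);
    while (((size_t)1 << order) < npages)
        order++;
    return order;
}

/* Hypothetical helper: first-fit search of a page map for npages
 * consecutive free pages.  used[i] != 0 means page i is allocated.
 * Returns the index of the first page of the run, or -1 if no run
 * of that length exists. */
long bitmap_find_run(const unsigned char *used, size_t total, size_t npages)
{
    size_t run = 0;   /* length of the current run of free pages */

    assert(npages > 0);
    for (size_t i = 0; i < total; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == npages)
            return (long)(i + 1 - npages);
    }
    return -1;
}
```

Under these assumptions, a 5-page request costs an 8-page block in the buddy scheme (order 3), while the bitmap search needs only 5 adjacent free pages but pays for a linear scan. That is one plausible reading of the speed versus fragmentation trade-off described above, not a measurement of either kernel file.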
