So the behavior is that writing a large file (one N MB long) quickly eats up N MB of memory. Why?
Well, suppose dd is writing a big file, and getting data from /dev/zero. Or suppose ld is writing a big file and getting data from some other in-core segment. What happens? Well, it basically copies the data into the memory object backed by the file--as fast as it can. (I'll call this object the "hosage object".) In both those cases (dd or ld) it can do it pretty darn fast, because it isn't waiting on anything. (If it's writing into /dev/null, then it never gets copied into a hosage object, and so none of what follows applies, which explains Roland's observation that it matters whether dd is writing to a file or /dev/null.)

So now we have two processes: one which is filling the hosage object as fast as it possibly can, and another (the filesystem's pager threads) which is writing out the data. As the writer proceeds to fill the hosage object, the pool of free pages in the system plummets (*quickly*) and the pageout thread in the kernel kicks in. The kernel notices the sequential access pattern (contrary to my previous statements, it already does this; more on it in a separate message, because there are still problems), and pretty much all the pages in the hosage object are marked "inactive", and therefore ripe for immediate pageout.

So the situation is a fairly small pool of active pages, a small pool of "ordinary" inactive pages (idle pages from other things, ripe for pageout), and a gajillion inactive pages belonging to the hosage object. And the kernel pageout thread looks at this situation. As long as the system decides memory is needed, the pageout thread tries to free pages, and what happens depends on exactly how much memory is free:

1) There are more than 15 pages free. In that case, the pageout daemon hands off the first inactive pages it finds; odds are these are almost all pages from the hosage object, and so we toss a bunch at the filesystem pager. Now the kernel is studiously careful to avoid swamping the filesystem pager, so it might not page out as many as it could, but given that there is a huge demand for pages (because dd is hosing them As Fast As It Can), we cannot reach a steady state here. The number of free pages *will* drop below 15, because dd can hose pages much faster than the disk can write them out.

2) If there are between 10 and 15 pages free, then the system stops letting most processes allocate pages. dd stops now, and the filesystem pages a little--but not much--because the filesystem does need to allocate pages for things, including pageout. The default pager, however, is still happy. The pageout thread continues taking pages and trying to send them to the filesystem pager, however, and this takes a little memory. In this state, the pageout thread is still sending pages to the filesystem, which is basically not able to run. So while dd is no longer hosing new pages, the kernel uses up more memory queuing the pages to the filesystem pagers, and probably does so faster than the default pager can write pages to disk. If you have no default pager, then you lose totally at this point.

3) When memory drops below 10 pages free, the kernel gives up on the filesystem pager entirely. Now pages are paged only to the default pager. Almost all the inactive pages are in the hosage object, but the kernel is smart: it just directly pages these pages into the swap partition. At this point there is nothing much allocating memory; only the default pager and the kernel are allowed to, and they don't take that much.
4) If memory drops below 5 pages, then the pageout thread stops entirely to let the default pager catch up. We don't get here much.

So the system quickly plummets to the less-than-10-free-pages state, at which point the default pager pages out five. Now there's memory to allocate! A little memory is quickly taken by the filesystem and the hosage pageout process, and a little progress is made. Memory drops. Pages go to the swap partition. The filesystem is now paging at a crawl. Eventually, when dd or ld or whatever finishes writing, the default pager frees up enough memory for the filesystem to run freely, and it does so, and begins writing the file at a reasonable speed, and it clears up. (A rough sketch of the threshold logic above is appended at the end of this message.)

Now what to do about this problem? For that, see part 2.

Thomas
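Appendix: to make the four cases concrete, here is a minimal illustrative sketch in C. It is not the actual kernel pageout code; the type names, the function name, and the exact inclusive/exclusive boundaries are assumptions drawn only from the description above ("more than 15", "between 10 and 15", "below 10", "below 5").

#include <stdio.h>

/* Illustrative sketch only -- not the real pageout code.  It models the
   free-page thresholds described above; names and exact boundaries are
   assumptions based on this message. */

enum pageout_state {
    PAGEOUT_TO_FILESYSTEM,   /* 1) more than 15 free: hand inactive pages to the filesystem pager */
    PAGEOUT_THROTTLED,       /* 2) 10..15 free: most processes may no longer allocate */
    PAGEOUT_DEFAULT_ONLY,    /* 3) below 10 free: give up on the filesystem pager; swap only */
    PAGEOUT_PAUSED           /* 4) below 5 free: stop and let the default pager catch up */
};

static enum pageout_state classify (unsigned int free_pages)
{
    if (free_pages > 15)
        return PAGEOUT_TO_FILESYSTEM;
    else if (free_pages >= 10)
        return PAGEOUT_THROTTLED;
    else if (free_pages >= 5)
        return PAGEOUT_DEFAULT_ONLY;
    else
        return PAGEOUT_PAUSED;
}

int main (void)
{
    static const char *names[] = {
        "page out to filesystem pager", "throttle allocators",
        "default pager (swap) only", "pause pageout"
    };

    /* Walk the free-page count downward, the way it falls while dd
       fills the hosage object faster than the disk can drain it.  */
    for (unsigned int free_pages = 20; free_pages > 0; free_pages--)
        printf ("%2u free pages -> %s\n", free_pages, names[classify (free_pages)]);
    return 0;
}

The real kernel is of course event-driven rather than a loop like this; the point is only that which pager gets fed, and who may allocate, is decided by a handful of small fixed thresholds on the free-page count.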

