On May 13, 2005, at 9:17 PM, Charles Lockhart wrote:

Brian Chee wrote:

Actually I have a question... why would you want to run a machine without swap? There are good reasons if you're running an embedded Linux machine, but for normal machines I've seen folks set up Unix boxes that boot from the network but ONLY do swap to the local hard disk. Those old xterms did this a lot, just so that you don't swap over the network.

We're not using our machines as general desktop platforms. They're part of a system. In the current case I'm looking at, the computer is receiving data via a fiber link on one PCI bus, processing that data, and then writing it to disk across a second PCI bus while we're still reading in more data over the first. We've (hopefully) managed to handle the contention issues internal to the program, but we found that the system loses balance and starts dropping data (irreplaceable data) somewhat randomly once physical memory fills up and the mm starts using the swap partition.

Basically, the primary application will be working fine, then we'll start up some other stuff (slickedit, firefox, tkcvs, etc.), and at some point the swap partition starts getting used and the primary application's performance gets randomly flaky. This would be fine if we had some big shiny flag that would shoot up and alert the user that the system needs to be re-balanced. But we don't. One way we could possibly fix this is to just disable the swap partition. I'd been hoping that new applications that would exceed physical memory on process load would simply fail, flagging to the user that they're misbehaving, but instead the machine just slows down a lot. That's slightly more problematic for how we use the system.

I've also talked to other people designing instrumentation for astronomy, and their interest in getting rid of the drives was based on what I'm told is a high rate of disk failure at altitude. If the primary source of failure is the disk, then why have it? But, please somebody correct me if I'm wrong: no disk, no swap space?

Man, you gotta love all the white-out on that black project tech.  :-)

Not all hard drives have disks, it turns out. For (ahem) mil-spec (did I say that?) applications at altitude, some of the (cough) contractors with which I am familiar use various battery-backed RAM or flash based disks. Most folks don't know just how hard an airplane can vibrate under certain conditions; that vibration makes the heads hit the disks, which is... bad.

Companies like BiTMICRO Networks, M-Systems and Texas Memory Systems all make IDE (or SCSI) based drives that are completely solid state. Might be worth thinking about, not so much for swap, but you're writing down that data *someplace* inside the airframe, no?

As for altitude without the (cough) "mobile platform" aspect (so no high cyclic rate, high-G vibrations), consumer-grade drives can't stomach high altitude quite literally because the air is too thin: the heads fly on a thinner cushion of air, a bit too close to the platters for comfort. It's a bummer.

As for keeping your application running while the VM system tries to be fair to firefox or slickedit:

1) which kernel are you running?
2) have you considered mlock()/mlockall()? (post 2.6.9 you don't have to be (e)uid==0 to successfully call mlock() and friends)
3) you might also look at sched_setscheduler() and friends (assuming linux 2.6 kernels)
4) for the truly time-critical application, you could look at RTLinux and friends.

#2 and #3 can probably be combined to ensure that it's firefox, tkcvs and slickedit that get sick in a low-free-pages situation. You'll want to be careful not to livelock yourself, create priority inversion, etc.
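Here's a minimal sketch of that combination, assuming a 2.6 kernel and a process with the privilege (root, or CAP_SYS_NICE) to request a real-time class; the file name and the priority value are just illustrative, not anything from your setup:

/* lock_and_boost.c -- illustrative sketch: pin the critical process
 * in RAM and move it into a real-time scheduling class, so that in a
 * low-free-pages situation it's the SCHED_OTHER crowd (firefox,
 * tkcvs, ...) that pays, not us.
 * Assumes Linux 2.6; SCHED_FIFO needs root (or CAP_SYS_NICE).
 */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sched.h>
#include <sys/mman.h>

int main(void)
{
    /* #2: lock every page we have now, and every page we fault in
     * later, so the pager can never push us out to swap. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall: %s\n", strerror(errno));
        return 1;
    }

    /* #3: SCHED_FIFO runs ahead of every ordinary timesharing
     * (SCHED_OTHER) process.  Priority 1 is the lowest RT priority;
     * no need to grab 99 just to outrank firefox. */
    struct sched_param sp;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = 1;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));
        return 1;
    }

    /* ... data acquisition / disk-writing loop goes here ... */
    return 0;
}

That's where the livelock caveat bites: a SCHED_FIFO task that spins without blocking will starve everything at lower priority on that CPU, so make sure the main loop actually sleeps or blocks on I/O.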

You could also call setrlimit(RLIMIT_AS, ...) in the parent of any process that is likely to start firefox, tkcvs, etc. That will smack any program that attempts to allocate over the limit you set. If you want to be somewhat kinder, you could use setrlimit(RLIMIT_RSS, ...) to limit the resident set of these "unclean" programs. That will make the pager work harder, but if you've a) locked the critical pages/applications in core and b) told everyone else they can only have X MB (each) of resident pages, you might find a solution.
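In the same spirit, here's a sketch of the RLIMIT_AS idea as a tiny wrapper that caps the address space of whatever it execs; the wrapper name and the megabyte argument are made up for illustration:

/* limited-run.c -- illustrative wrapper: cap address space, then exec.
 *   usage: limited-run <mb> <program> [args...]
 * RLIMIT_AS is inherited across exec, so anything started under the
 * wrapper gets ENOMEM past the cap instead of pushing the box into
 * swap.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/resource.h>

int main(int argc, char *argv[])
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <mb> <program> [args...]\n", argv[0]);
        return 1;
    }

    struct rlimit rl;
    rl.rlim_cur = rl.rlim_max = (rlim_t)atol(argv[1]) * 1024 * 1024;

    /* Hard-cap total virtual address space for us and our children. */
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    execvp(argv[2], &argv[2]);
    perror("execvp");           /* only reached if the exec failed */
    return 1;
}

So something like "limited-run 512 firefox" would hold firefox to 512 MB of address space. Swapping in RLIMIT_RSS is the same one-line change, modulo how strictly the kernel of the day actually enforces the resident-set limit.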

They're just ideas.

jim
