Mykal Funk wrote:
> Rod Waldren wrote:
>> I've had a bad battery manifest problems in many ways, not just losing
>> time while powered down. Most recently I was having random problems and
>> odd instability with a system. It was rock solid after replacing the
>> battery. It wasn't as bad as old Macs which completely lost their
>> minds if the battery was bad, but along similar lines. If it's bad
>> replace it or temporarily swap in a good one to see if it helps.

This indicates either a hardware problem, incompetent design, or
coincidence.

> I replaced the battery and the behavior didn't change.

That's what I would expect.

> The time loss occurs only under high load. When uptime reports a high
> load average, the system loses time like crazy. When it is just
> sitting doing nothing it keeps perfect time. I don't know what to
> make of it. Perhaps someone else can.

This sounds like dropped interrupts. I've co-designed, written, and
helped to write or support a few embedded RTOSes, and on one of them we
eventually had to include a "missed clock interrupt" counter. If the
RTI handler discovered it had been re-entered while still processing,
it incremented a counter and immediately returned. This occurred on a
small machine with little processing power under the hood. We had app
designers who insisted that some of their work needed to be done at
interrupt level, and over the objections of the kernel crew (including
me) they got their way. Anyway, if you have people who think that all
CPUs come with an infinite supply of cycles per second, sometimes you
will find that you run out. The kernel crew's eventual way out was
simply to note the fact, and on the next interrupt where we hadn't been
re-entered (due to lots of apps hooking the interrupt chain) we then
processed the "missed ticks".

It sounds like something in the kernel is holding interrupts off too
long, or is processing too long, so that clock interrupts go missing.
You might try reducing the clock interrupt rate, or selecting a more
"real time" style scheduling strategy. I'm not enough of an expert on
the Linux kernel to know how to go about that, but some others here may
be able to assist.

ISTR there is a way to tune the maximum interrupt rate the kernel will
program the RTI for. On the PC the RTC itself sits on IRQ 8, though as
far as I know the kernel's periodic tick normally comes from the PIT on
IRQ 0 or from the local APIC timer. ISTR there is also a way to tell
the kernel to use a high-res timer, which is about 1000 interrupts per
second.
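
To make the "missed clock interrupt" counter described above a bit more
concrete, here is a toy model in Python. It is only a sketch of the
idea, not the code from that RTOS and not how the Linux kernel does it.
The names TickClock, interrupt, compensate and run are all made up for
illustration, and "re-entry" is faked by having the slow handler call
the interrupt entry point itself.

```python
class TickClock:
    """Toy periodic-clock handler with a re-entry guard (hypothetical names)."""

    def __init__(self, compensate=True):
        self.ticks = 0            # software time, counted in ticks
        self.missed = 0           # ticks that arrived while the handler was busy
        self.in_handler = False   # True while a top-level handler is running
        self.compensate = compensate

    def interrupt(self, work=None):
        if self.in_handler:
            # Re-entered: note the fact (if compensating) and return at once.
            if self.compensate:
                self.missed += 1
            return
        self.in_handler = True
        try:
            # Clean entry: credit this tick plus any noted earlier.
            self.ticks += 1 + self.missed
            self.missed = 0
            if work is not None:
                work(self)        # stands in for apps hooked onto the interrupt
        finally:
            self.in_handler = False


def run(total_hw_ticks, overrun, compensate=True):
    """Deliver total_hw_ticks hardware ticks. Every top-level handler runs
    long enough that up to `overrun` further ticks arrive while it is busy."""
    clock = TickClock(compensate)
    delivered = 0
    while delivered < total_hw_ticks:
        nested = min(overrun, total_hw_ticks - delivered - 1)

        def slow_work(c, n=nested):
            for _ in range(n):
                c.interrupt()     # tick arriving during processing

        clock.interrupt(slow_work)
        delivered += 1 + nested
    return clock


if __name__ == "__main__":
    lossy = run(1000, 3, compensate=False)
    fixed = run(1000, 3, compensate=True)
    print("no counter:  ", lossy.ticks, "of 1000 ticks counted")
    print("with counter:", fixed.ticks + fixed.missed, "of 1000 ticks accounted for")
```

Without the counter, the software clock only advances once per
top-level handler run, so it falls behind in proportion to how overloaded
the handler is. That is the same shape as "loses time only under high
load". With the counter, the ticks are credited late but not lost.
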
If the kernel can be told to tick at about 1000 interrupts per second,
then there should also be a way to cut it back. You might investigate
CONFIG_HZ (which sets the periodic tick rate) and
CONFIG_HIGH_RES_TIMERS. Certainly, you'd want to turn the latter off, I
think.

I searched for "linux clock interrupt rate" (no quotes in the search,
of course) and turned up some stuff which looks somewhat confirmatory
of my hypothetical cause.

This looks like it might have related information:
http://www.cs.huji.ac.il/labs/parallel/stud/Etsion-MSc.pdf

This may be the reason for the problem:
http://support.novell.com/docs/Tids/Solutions/10100597.html

1000 interrupts per second may be more than your hardware can support
when the system becomes loaded and spends more time with interrupts
disabled, or at least in interrupt processing. Now, they are looking at
it from the other perspective, that is, increasing the HW clock rate on
the host machine relative to the guest.

"The 2.6 Linux kernel in SLES9 changes the amount of interrupts it uses
for clock ticks as compared to the 2.4 kernel in SLES8 from 100\second
to 1000\second. A dual-processor Linux 2.6 kernel can fire up to
3000\sec. This is usually not an issue when running on a bare metal
server."

In your case, it may well be. So, figuring out how to make your kernel
revert to the pre-2.6 rate of 100 interrupts per second may fix your
problem. Anyone here who knows how to do this is requested to chime in.

Mike
--
p="p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
Oppose globalization and One World Governments like the UN.
This message made from 100% recycled bits.
You have found the bank of Larn.
I speak only for myself, and I am unanimous in that!
