George Sakkis wrote:
> Steven D'Aprano wrote:
>
>>On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:
>>
>>>I would need to make some high-reliability software
>>>running on Linux in an embedded system. Performance
>>>(or lack of it) is not an issue, reliability is.
>>
>>[snip]
>>
>>>The software should be running continously for
>>>practically forever (at least a year without a reboot).
>>>Is the Python interpreter (on Linux) stable and
>>>leak-free enough to achieve this?
>>
>>If performance is really not such an issue, would it really matter if you
>>periodically restarted Python? Starting Python takes a tiny amount of time:
>
> You must have missed or misinterpreted the "The software should be
> running continously for practically forever" part. The problem of
> restarting python is not the 200 msec lost but putting at stake
> reliability (e.g. for health monitoring devices, avionics, nuclear
> reactor controllers, etc.) and robustness (e.g. a computation that
> takes weeks of cpu time to complete is interrupted without the
> possibility to restart from the point it stopped).

Er, no, I didn't miss that at all. I did miss that it needed continual
network connections. I don't know if there is a way around that issue,
although mobile phones move in and out of network areas, swapping
connections when and as needed.

But as for reliability, well, tell that to Buzz Aldrin and Neil
Armstrong. The Apollo 11 moon lander rebooted multiple times on the way
down to the surface. It was designed to recover gracefully when
rebooting unexpectedly:

http://www.hq.nasa.gov/office/pao/History/alsj/a11/a11.1201-pa.html

I don't have an authoritative source for how many times the computer
rebooted during the landing, but it was measured in the dozens.
Calculations were performed in an iterative fashion, with an initial
estimate that was improved over time. If a calculation was interrupted,
the computer lost no more than one iteration.

I'm not saying that this strategy is practical or useful for the
original poster, but it *might* be.

In a noisy environment, it pays to design a system that can recover
transparently from a lost connection.

If your heart monitor can reboot in 200 ms, you might miss one or two
beats, but so long as you pick up the next one, that's just noise. If
your calculation takes more than a day of CPU time to complete, you
should design it in such a way that you can save state and pick it up
again when you are ready. You never know when the cleaner will
accidentally unplug the computer...

--
Steven.
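
P.S. For what it's worth, here is a minimal sketch of the "save state and pick it up again" idea, in the same iterative spirit: an estimate is refined step by step, and each step is written to disk before the next one starts, so an unexpected restart costs at most one iteration. The function names, the JSON state file, and the choice of Newton's method for a square root are all made up for illustration; a real system would checkpoint whatever its own calculation needs.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write state atomically: a crash leaves the old file or the new one,
    never a half-written one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before the rename
    os.replace(tmp, path)  # atomic rename over the previous checkpoint


def load_state(path, default):
    """Return the last checkpoint, or default if none exists or it is unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, ValueError):
        return default


def refine_sqrt(target, path, max_steps):
    """Run up to max_steps Newton iterations towards sqrt(target),
    checkpointing after every iteration. Assumes target > 0."""
    state = load_state(path, {"step": 0, "estimate": float(target)})
    for _ in range(max_steps):
        state["estimate"] = 0.5 * (state["estimate"] + target / state["estimate"])
        state["step"] += 1
        save_state(path, state)
    return state
```

Calling refine_sqrt(2.0, "state.json", 3), killing the process, and calling it again simply carries on from the last saved step rather than starting over.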