On Mon, 10 Oct 2005 12:18:42 +1000, Steven D'Aprano <[EMAIL PROTECTED]> wrote:
>George Sakkis wrote:
>
>> Steven D'Aprano wrote:
>>
>>>On Sun, 09 Oct 2005 23:00:04 +0300, Ville Voipio wrote:
>>>
>>>>I would need to make some high-reliability software
>>>>running on Linux in an embedded system. Performance
>>>>(or lack of it) is not an issue, reliability is.
>>>
>>>[snip]
>>>
>>>>The software should be running continuously for
>>>>practically forever (at least a year without a reboot).
>>>>Is the Python interpreter (on Linux) stable and
>>>>leak-free enough to achieve this?
>>>
>>>If performance is really not such an issue, would it really matter if you
>>>periodically restarted Python? Starting Python takes a tiny amount of time:
>>
>> You must have missed or misinterpreted the "The software should be
>> running continuously for practically forever" part. The problem of
>> restarting python is not the 200 msec lost but putting at stake
>> reliability (e.g. for health monitoring devices, avionics, nuclear
>> reactor controllers, etc.) and robustness (e.g. a computation that
>> takes weeks of cpu time to complete is interrupted without the
>> possibility to restart from the point it stopped).
>
>Er, no, I didn't miss that at all. I did miss that it
>needed continual network connections. I don't know if
>there is a way around that issue, although mobile
>phones move in and out of network areas, swapping
>connections when and as needed.
>
>But as for reliability, well, tell that to Buzz Aldrin
>and Neil Armstrong. The Apollo 11 moon lander rebooted
>multiple times on the way down to the surface. It was
>designed to recover gracefully when rebooting unexpectedly:
>
>http://www.hq.nasa.gov/office/pao/History/alsj/a11/a11.1201-pa.html

This reminds me of crash-only software:

  http://www.stanford.edu/~candea/papers/crashonly/crashonly.html

The approach seems to have some merit. I have yet to attempt to develop
any large-scale software explicitly using this technique (although I have
worked on several systems that used it very loosely; e.g., a server that
divided its tasks between two processes, with one restarting the other
whenever it noticed it was gone), but as you point out, there's certainly
precedent.

Jp
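
The two-process arrangement mentioned above (one process restarting the
other whenever it notices it is gone) could be sketched roughly as follows.
This is only an illustration, not code from any of those systems: the
supervise() function, its parameters, and the deliberately crashing worker
command are all hypothetical names chosen for the example.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=None, backoff=0.1):
    """Run cmd as a child process and restart it whenever it exits.

    Hypothetical helper for illustration. Returns the list of exit codes
    seen. max_restarts=None means "keep restarting forever".
    """
    exit_codes = []
    restarts = 0
    while True:
        child = subprocess.Popen(cmd)
        # wait() blocks until the worker is gone, for whatever reason.
        exit_codes.append(child.wait())
        if max_restarts is not None and restarts >= max_restarts:
            return exit_codes
        restarts += 1
        # Short pause so a worker that dies at once does not spin the CPU.
        time.sleep(backoff)


if __name__ == "__main__":
    # Assumed worker: a child interpreter that exits with status 1 at once.
    worker = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(worker, max_restarts=2))
```

With max_restarts=2 the worker is started three times in total: once
initially, then twice more after it exits. A real supervisor would
presumably also want logging and some way to tell a deliberate shutdown
from a crash; both are left out here to keep the sketch short.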