2009/10/5 Jonathan S. Shapiro <[email protected]>:
> Coyotos moves very strongly in the opposite direction. We favor
> isolation over everything else. This decision was based on empirical
> evidence of real [mis]behavior in real systems in real production
> scenarios. But with safe languages gaining acceptance, I think we now
> would need to re-examine that.

I am very glad you have brought this up. I have a few thoughts on the subject, which I thought I would share. None may be new to the majority here (certainly not to you, shap, considering your current position), but I thought they were worth collecting together in one place. I know my English is dry, so I won't be offended if anyone finds this tl;dr.

0. Language level object models tend to be finer grained than those exposed by the operating system.

This item is to be taken with a grain of salt, because unix is best used when composing simpler processes at a fine granularity. However, this does not seem to be the common pattern today, and I think there are two different reasons.

Antrik has mentioned before that the concept of the monolithic application is designed to serve the interests of the proprietary software developer - so that the user associates features (which could have been obvious for the user to implement, had the program been structured better) with the application; I think that the resulting impression on modern programmers is sadly not going away in the short term.

There is another reason, which is that the environments provided by some languages simply feel much nicer than the one unix provides. The filesystem, for example, feels like a duplicate object model, with less transparent support from (most?) languages.

Given the distance some languages allow you to get from the operating system, moving from the status quo to fine-grained objects the user can interact with seems possible only when arbitrary, self-describing data structures and functions become the axioms out of which the system is composed.

The Hurd has brought us a long way here: it looks like it would not take much to publish functions and objects for use from other programs with pyhurd; going from there to transparent sharing of functions and data does not seem like much of a stretch, but this would be more obvious in a system engineered around language / vm level sharing, rather than process / application level.

The impression I am trying to give here is that the sort of interactivity that Antrik talks about in his post[1] on DeepaMehta may be much more applicable when there is less distance between objects and functions as the system sees them and objects as the application developer sees them.

1. Safe languages introduce new opportunities for optimisation.

Geoffrey Irving recently pointed out that a JIT may become important even for (safe) static languages. Besides taking advantage of symmetry in the data, this has implications at the system level. When a piece of code can be shown to succeed, and the semantics of the language(s), preserved by the compiler[0], show that confinement boundaries are not leaked, pieces of code can be inlined across address space boundaries.

The performance properties of this approach depend intimately on the implementation of address spaces, and it's not yet obvious that it can work in general, but the potential for speedup could be significant: cutting the number of IPC calls by some factor can yield a speedup of the same order. A recent minor refactoring of an application I maintain cut the time taken for a certain reporting feature from about two minutes to about a second, a difference which can only be attributed to a reduction in IPC calls (MS SQL queries, formerly all lookups by id, and all certainly served from cache). The potential performance increase for drivers and the X server could be similarly significant.

2. Safe languages provide security benefits that go beyond confinement.

Even when a stack buffer overflow exploit in a network server can't do any damage to the filesystem directly, it can forge communications, which could be just as bad. Safe languages don't eliminate all possible bugs, but they sure make a difference, and depending on the intended target audience that could be a serious positive.

-1. There is a large amount of legacy code that is not just going to go away.

An approach that does away with unsafe languages will probably fare badly. Using another, virtualised OS to run legacy code is a solution that has not been feasible until recently. But supporting unsafe code from the ground up, just like any other code, requires more work.

Or does it? This is where I would like to come back to thinking about why the Hurd and Coyotos are interesting in light of the safe language revolution.

What Coyotos provides is a lightweight, composable facility for confinement of unsafe code. It is probably lightweight enough that you could have unsafe and safe code calling each other, as safely as if they were in different processes (state changes on the safe side happen only via the provided interface), and yet the functionality is so integrated that call/cc would still be possible.

The Hurd is interesting also, because it kind of lives between two worlds: one of rpc and one of unix. It's an interesting foundation to build on because it is already a modular implementation of most of a unix runtime.

William Leslie

Footnotes:

[0]: It should be obvious at this point that this does not require compilation to be just in time; indeed, it could just as easily be a link time or compile time feature depending on the language, but JIT compilation is good because it is the general case.

[1]: http://tri-ceps.blogspot.com/2006/11/mehtahurd.html
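
An illustration of the IPC anecdote in point 1. This is a minimal sketch and not the actual application: it uses Python's sqlite3 with an in-memory database, so there is no real IPC involved, and a hypothetical CountingConn wrapper merely counts calls as a stand-in for round trips to a database server. The table, the column names and both report functions are made up for the example. It compares one lookup per id against a single batched query that returns the same total.

```python
import sqlite3


def make_db(n=500):
    """Build a small in-memory table (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)", [(i, i * 2) for i in range(n)]
    )
    return conn


class CountingConn:
    """Counts queries; each call stands in for one IPC round trip."""

    def __init__(self, conn):
        self.conn = conn
        self.calls = 0

    def query(self, sql, params=()):
        self.calls += 1
        return self.conn.execute(sql, params).fetchall()


def report_by_id(db, ids):
    """One query per id: len(ids) round trips."""
    total = 0
    for i in ids:
        rows = db.query("SELECT amount FROM items WHERE id = ?", (i,))
        total += rows[0][0]
    return total


def report_batched(db, ids):
    """A single query for all ids: one round trip."""
    placeholders = ",".join("?" for _ in ids)
    rows = db.query(
        f"SELECT SUM(amount) FROM items WHERE id IN ({placeholders})",
        tuple(ids),
    )
    return rows[0][0]


if __name__ == "__main__":
    ids = list(range(500))
    slow = CountingConn(make_db())
    fast = CountingConn(make_db())
    print(report_by_id(slow, ids), "in", slow.calls, "calls")
    print(report_batched(fast, ids), "in", fast.calls, "calls")
```

With a real server on the other side of each call, the per-call latency is what dominates, so the call count is the number to watch. Whether a compiler or JIT could perform this kind of merging automatically across address spaces is exactly the open question raised in point 1.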
