On Mon, Apr 24, 2006 at 09:21:44PM -0400, Jonathan S. Shapiro wrote:
> On Tue, 2006-04-25 at 00:46 +0200, Bas Wijnen wrote:
> > So forget about all others for the moment, please.
>
> I am not interested in considering move-only capabilities. We looked at
> them several times over many years and decided that (a) they create more
> problems than they solve, (b) they don't actually solve any of the
> problems that people *think* they solve, and (c) they have negative
> performance consequences.

Are there some archives of those discussions from previous years? I'd like to
see the arguments for these statements.

> Coyotos will not implement "move only" capabilities. Period.

I'm sorry to hear that. Not that I am convinced that this is the perfect
solution, but it is a solution to the problem we found, and timeouts seem like
a very inferior replacement. Of course there's the death-notification idea
(similar to the task-info capabilities in L4.X2), but although I haven't
thought much about that:
- I don't have much confidence in it.
- I'm pretty sure you don't want that in the kernel either.

> > > > The case when many capabilities are overwritten with a single IPC is
> > > > most likely a bug in the server.
> > >
> > > Actually, it is the near-universal practice for a single-threaded
> > > server. Arguments are commonly accepted in a way that overwrites the
> > > arguments from the last invocation.
> >
> > This is not a problem. The invocation only happens when valid send-once
> > capabilities get overwritten. Each and every valid send-once capability
> > is directly related to a client waiting for a response. If you overwrite
> > it, it will never get that response, because you are guaranteed to be the
> > only party who is capable of responding.
>
> Yes, so now you have a situation where client A is notified of the
> server's mishandling of client B. This is a security error. Coyotos will
> not expose this fact.

I must be missing something here. A and B have a communication channel (the
reply capability), so the fact that A is sending a message to B cannot be the
problem. A has given B a move-only reply-capability. Both A and B can see
this, and know what it means. If B doesn't like it, then it can reply "I don't
accept these, because I want to be able to crash or be compromised without you
noticing". Then A can try the same operation again with a normal reply
capability.
Or it can think "You must have gone completely crazy, so I'll stop talking to
you". Actually the latter seems more reasonable to me, because I cannot think
of a valid reason why A is not supposed to know that B is overwriting the reply
capability without using it.

It reminds me of the DRM discussion we had: The client doesn't _have to be_
notified when its capability is dropped, but it can agree with the kernel that
it will be. Kernel support is needed for this agreement, though. So the
question is: why do you want to prevent such agreements from being made?

> > As Marcus also pointed out already, this does not define the send-once
> > capabilities we were talking about.
>
> You are confusing "send once" with "move only". Please pick one.

Ok, you have a good point there. "send-once" was a very bad choice of words.
As will have become clear by now, we meant "move-only and send-once".

I have some more thoughts that I'd like to share:

If the checks for capabilities being move-only cause too big a performance
hit, it would perhaps be possible to make them a different object type, with a
parallel set of kernel operations (except that they move, not copy). I'm not
sure how feasible that would be, but it can be considered if the move-only
capabilities really solve a problem (I think they do), but are just too costly
in performance to implement. Separating them should improve branch prediction
for both, which should improve performance. On the other hand some other
checks must be doubled (from "are there capabilities" to "are there
capabilities or move-only capabilities"), so in total it may not actually be
an improvement. You can probably estimate if it is.

I want to describe the problem I am talking about again, because it surprises
me how unimportant you seem to think it is:

There is a client C, which wants to make a call to S. The programmer thinks of
it like this:
	call => result
result is the interesting value, or the reason why the call failed.
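
To make the programmer's view above concrete, here is a minimal Python
sketch. It is only an illustration: the names Value, Failure and call are
made up for this example, and modelling a failed call as a Python exception
is an assumption, not something taken from Coyotos or L4.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Value:
    """The interesting value produced by a successful call."""
    value: object


@dataclass
class Failure:
    """The reason why the call failed."""
    reason: str


# The programmer's mental model: a call yields exactly one of these.
Result = Union[Value, Failure]


def call(operation: Callable[[], object]) -> Result:
    """Run the operation and fold success or failure into one result.

    Assumption: a failing operation raises an exception whose message
    is the failure reason.
    """
    try:
        return Value(operation())
    except Exception as exc:
        return Failure(str(exc))
```

In this view the caller always gets something back and only has to check
which of the two kinds of result it is.
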
This call is implemented as two steps:
1. C invokes a capability to S, providing a reply capability.
2. S invokes the reply capability with the result or reason of failure.

The problem now is that 2 may not happen. And in that case C will not be able
to recover if it cannot distinguish this from S taking a long time to respond.

So what are the reasons that 2 doesn't happen:
A. S is malicious and wants C to wait forever.
B. S overwrites the capability (because of a bug).
D. S dies before replying (because of a bug, or user intervention).

Obviously, A is not distinguishable from S taking a long time to reply (except
by code inspection, which is not considered here). However, at some point the
user will become impatient. Assuming that he can track the problem down to S,
he will kill S. In that case, this situation becomes the same as D.

Getting back to the beginning, C is waiting for S, but for the above reasons
is never going to receive a reply. This is conceptually too hard for the
programmer, so we add an extra possible result: C is notified (in some way)
that there will not be a reply. Now things are easy for the programmer again,
because C may simply wait for a reply, check if it is an error condition, and
continue as usual. The notification is simply one of the possible error
conditions.

However, how is this magic step of the notification taking place? I see
several options:
E. Using the move-only-send-once-notify-on-destroy capabilities that we
   discussed in this thread. This will solve the problem. (You say above it
   doesn't; I'd like to hear why if you still think so.) It will however cost
   some performance. This may be worth it and it may not.
F. Using death-notifications which have to be registered with the kernel.
G. Using time-outs. C defines how long it accepts S to take, and if things
   take longer, it assumes something has gone wrong and "notifies" itself
   about it. The problem with this approach is that it yields false positives
   under load.
H. User intervention.
   Assume everything is fine, and let the user figure it out if it isn't.
   That is, the user may send some sort of signal to C telling it that the
   operation failed, or he just kills C completely. The obvious drawback of
   this approach is that it requires very detailed knowledge on the part of
   the user about the inner workings of the system.
J. Do nothing and take the hit. If this happens, well, tough luck, let's do
   other things. C will be waiting forever. This may be acceptable on a
   non-persistent system, where you can at least clean up the junk every now
   and then by rebooting, but it isn't on a persistent system, where this
   means a permanent memory leak.

To me, H and J are unacceptable. G really is unacceptable as well, but a bit
less so. E and F (and perhaps other similar solutions) are fine, and should be
compared with respect to performance in particular. So far, it seems you want
to do J (combined with H?) and recently perhaps also G. This worries me a bit.

Thanks,
Bas

-- 
I encourage people to send encrypted e-mail (see http://www.gnupg.org).
If you have problems reading my e-mail, use a better reader.
Please send the central message of e-mails as plain text
   in the message body, not as HTML and definitely not as MS Word.
Please do not use the MS Word format for attachments either.
For more information, see http://129.125.47.90/e-mail.html
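
P.S. As an illustration of option E and of failure cases A, B and D above,
here is a small user-space model in Python. It is a sketch under stated
assumptions, not kernel code and not the API of Coyotos or L4: ReplyCap,
Server, outcome and ReplyDropped are hypothetical names, the "kernel" is
played by a queue.Queue per client, and moving a capability is modelled by
invalidating the old handle.

```python
import queue


class ReplyDropped(Exception):
    """Raised in the client when its reply capability was destroyed unused."""


class ReplyCap:
    """Hypothetical move-only, send-once, notify-on-destroy reply capability."""

    def __init__(self, mailbox):
        self._mailbox = mailbox  # the waiting client's receive queue
        self._live = True

    def _consume(self):
        if not self._live:
            raise ValueError("capability already moved, used, or destroyed")
        self._live = False

    def move(self):
        """Move-only: the old handle becomes invalid, a new one is returned."""
        self._consume()
        return ReplyCap(self._mailbox)

    def send(self, result):
        """Send-once: deliver the result and consume the capability."""
        self._consume()
        self._mailbox.put(("reply", result))

    def destroy(self):
        """Notify-on-destroy: tell the client that no reply will ever come."""
        if self._live:
            self._live = False
            self._mailbox.put(("dropped", None))


class Server:
    """Single-threaded server S with one slot for the current reply capability."""

    def __init__(self):
        self.slot = None

    def accept(self, cap):
        # Case B: a new invocation overwrites the previous capability,
        # and the modelled kernel destroys the overwritten one.
        if self.slot is not None:
            self.slot.destroy()
        self.slot = cap

    def reply(self, result):
        if self.slot is None:
            raise RuntimeError("no pending request")
        cap, self.slot = self.slot, None
        cap.send(result)

    def die(self):
        # Case D: on death, every capability S still holds is destroyed.
        if self.slot is not None:
            self.slot.destroy()
            self.slot = None


def outcome(mailbox):
    """What client C observes: a result, a failure notice, or nothing yet."""
    try:
        kind, value = mailbox.get_nowait()
    except queue.Empty:
        return "pending"  # S may be slow or malicious (case A)
    if kind == "dropped":
        raise ReplyDropped("no reply will ever arrive")
    return value
```

With this model, a client whose capability is overwritten or whose server
dies gets ReplyDropped as just another error condition, while a server that
simply keeps the capability (case A) still looks the same as a slow one.
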
_______________________________________________
L4-hurd mailing list
L4-hurd@gnu.org
http://lists.gnu.org/mailman/listinfo/l4-hurd