Re: Ugly, slow shutdown
On 2000-Aug-07 14:49:34 -0700, David Greenman [EMAIL PROTECTED] wrote: No, that's not true, and there are many examples in the kernel where a bogus wakeup would lead to bad things happening. I recall some code in the advisory locking code, and VM system, that assume that there is only one wakeup event and that the thing causing it assures that certain other things have occured before issuing it. That's just the way it has worked since the dawn of time. In the beginning, there was sleep(). The V6, V7, 2.9BSD and 2.11BSD [which is all I've checked] sources all include the following comment on sleep(): * Callers of this routine must be prepared for * premature return, and check that the reason for * sleeping has gone away. 2.9BSD tsleep() is implemented using sleep() and it's not immediately obvious that tsleep() won't also return prematurely. In 2.11BSD, sleep() is implemented using a tsleep() `"borrowed" from 4.4BSD'. Presumably a conscious decision was made to change the semantics between sleep() and tsleep(). Peter To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
Well, I've failed in my main objective (to deuglify the shutdown messages), but an interesting debate has resulted instead, so I can't feel too bad. I did a little research to support my position on sleep/wakeup, and here's the best I have. This is pretty long, and unlikely to shake your world view, so those of you with drooping eyelids can just head over to slashdot, or something. :-) Some pseudo code from "The Design of the Unix Operating System", by Maurice Bach, page 33 shows how sleep() is used: while (condition is true) sleep (event: the condition becomes false); set condition true; and the next page shows how wakeup() is used: set condition false; wakeup (event: the condition is false); In the description, it says `Thus, the "while-sleep" loop insures that at most one process can gain access to a resource.' Not the most convincing evidence, but on the other hand, he does not mention the idea of *not* protecting against sudden wakeup. From "Writing a Unix Device Driver", by Egan and Teixeira, on page 92 we find It is not uncommon for several processes to sleep on the same channel. They may be competing for the same resource, or they may be waiting for different reasons that have been associated with the same channel value. In this situation a single wakeup call on the common channel will cause all the sleeping processes to become executable; ... A driver routine must not assume that it can proceed after a return from a sleep call. It should check to see whether the event it was waiting for has actually occurred; if it has not it should sleep again, and repeat this cycle until the awaited event has actually occurred. The book is oriented rather towards I/O, so perhaps not all possible uses of routines are covered. But again, no mention of *not* using a while loop. Quite the opposite. Also "Magic Garden Explained" points out that you really want to sleep on an "event", but all you have is the address of some data. So, you often have multiple semantically different events represented by the same integer wakeup channel. A good reason to program defensively, I think. But the best evidence is from kern_synch.c from 4.2 BSD, line 98, in the header comment of the sleep() routine: * Callers of this routine must be prepared for * premature return, and check that the reason for * sleeping has gone away. That comment on sleep() is present from 4.0 BSD up to and including 4.3 tahoe, but disappears in 4.3 reno, when the 4.4 style tsleep() was introduced. After a bit of searching through the PUPS archive, I see it is even present in Edition 6, character for character, in a file called slp.c. Well, I knew I wasn't a senile old fart yet, and Kirk's BSD CD compendium and the PUPS archive show that I remember some things correctly still. For a considerable portion of Unix history, sleep() could return for no good reason at all, and was documented to do so (if only in the source code). Now, how does this relate to the current day? Nobody in the BSD world uses plain sleep() any more. Once tsleep() appeared, the rules seem to have changed. Perhaps some people had gotten away with ignoring the dire warnings in the sleep() code, and decided that unexpected wakeups weren't such a useful part of the API. I hope Kirk or other BSD veterans can be coaxed into offering an opinion. I'd offer at least one beer for this purpose. :-) Regardless of the history of it all, FreeBSD is full of places where unexpected wakeups can stuff you right up. Should we regard tsleep() like the older sleep() call, as suspect, and program defensively? Should we be pragmatic, admit "We've gotten away with it so far", and document the "no sudden wakeups" behaviour? I quite like the general principle outlined in one of the earlier replies, that a while loop can be shown to be correct through a local code reading, but a simple conditional must be verified by reading all the rest of the code. That's close to the same argument I use against global variables. Their use is too hard to verify as correct. In short, I'd like to see all cases where tsleep() is not carefully used in a loop repaired. Practically speaking, though, I can't see that happening, especially if we have any major players against the idea (DG for example). Given that, I'd like as a minimum a bit more of the history of sleep() in the tsleep() manual page, and a discussion of when a while-loop protected tsleep() is mandatory, and when it is optional. Some sort of pronouncement against issuing wakeup() calls against arbitrary addresses would help too. I would do that right now, except I'm escaping computing for a few months. Almost heresy nowadays, I suppose. And I won't be the first in line for a brain implanted net connection either. ;-) Stephen. PS By the time you read this, I've probably
Re: Ugly, slow shutdown
In message [EMAIL PROTECTED], Stephen McKay writes: Regardless of the history of it all, FreeBSD is full of places where unexpected wakeups can stuff you right up. Should we regard tsleep() like the older sleep() call, as suspect, and program defensively? Should we be pragmatic, admit "We've gotten away with it so far", and document the "no sudden wakeups" behaviour? I quite like the general principle outlined in one of the earlier replies, that a while loop can be shown to be correct through a local code reading, but a simple conditional must be verified by reading all the rest of the code. That's close to the same argument I use against global variables. Their use is too hard to verify as correct. I support this notion with a footnote to the effect that if the test is expensive the while() can be left out if a comment points out exactly why the while() isn't needed. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD coreteam member | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In article [EMAIL PROTECTED], David Greenman [EMAIL PROTECTED] wrote: I will add that this is the pattern that Kirk teaches in his kernel internals class. If that's true, Do you want me to fax you a copy of page 15 of his class notes from the course he gave at last year's FreeBSDCon, or will you just take my word for it? I'm not calling you a liar. I think it is possible that you may have misunderstood what Kirk was saying, however. Having not been in the class myself, it is perhaps a bit presumptuous for me to think that. There are cases where one must check for the condition after the wakeup - the cases where multiple sleepers/consumers are waiting on the same condition/resource, for example. ...and there are cases that simply don't work that way and aren't suseptible to that inherent race condition. Is it possible that Kirk was speaking about the race condition cases? then he should practice what he preaches. Some of the code that I'm refering to (e.g. lockf) was apparantly written by him. Whether Kirk practices what he preaches is irrelevant to this discussion. Instead of focusing on a 1-sentence "I will add ..." from my posting, why not respond to the main thrust of it -- the paragraph I quoted from the Birrell paper? Because I haven't decided whether I agree with it or not. I think an argument could be made that adding the checks in a case where a bogus wakeup can never happen might actually obscure the code by leading the programmer into thinking that there could be multiple sleepers/consumers. Perhaps I read more into code than I should, but I naturally assume that if a check is made for something then the condition being checked for can happen. If it cannot, then that just leads me astray. Certainly comments are a good thing to keep people on the right path, but then this applies whether you check the post-tsleep state or not. I'll say again, however, that some of the cases that rely on the historical symantics would become very expensive if they had to go through a series of complex checks (perhaps list traversals, etc), in order to verify that the wakeup wasn't bogus. I personally don't think this is an improvement. Some of them might be expensive, but most of them would not. That could be - I honestly don't know and it seems to me that a thorough code review would be needed before a conclusion can be drawn. Obviously the waker-upper knows that the condition is true. Otherwise the existing code which doesn't check wouldn't work. In the expensive cases the waker-upper could simply set a flag for the sleeper to check. For me, that doesn't make the code easier to read or understand - it has the opposite effect. ...but then I'm used to the historical symantics and naturally consider the possible cases when looking at the code. Note, I am not expressing an opinion about whether the sleeps should be terminated prematurely during shutdown. But I am expressing a strong opinion about whether sleepers should do a reality check before proceeding. I could be persuaded to think that way, but I still remain unconvinced. Again, I'm used to the historical symantics, so changing them requires a pretty good justification and a bit of brain rewiring, which I naturally resist. -DG David Greenman Co-founder, The FreeBSD Project - http://www.freebsd.org Manufacturer of high-performance Internet servers - http://www.terasolutions.com Pave the road of life with opportunities. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
* David Greenman [EMAIL PROTECTED] [000807 23:15] wrote: In article [EMAIL PROTECTED], David Greenman [EMAIL PROTECTED] wrote: Obviously the waker-upper knows that the condition is true. Otherwise the existing code which doesn't check wouldn't work. In the expensive cases the waker-upper could simply set a flag for the sleeper to check. For me, that doesn't make the code easier to read or understand - it has the opposite effect. ...but then I'm used to the historical symantics and naturally consider the possible cases when looking at the code. Note, I am not expressing an opinion about whether the sleeps should be terminated prematurely during shutdown. But I am expressing a strong opinion about whether sleepers should do a reality check before proceeding. I could be persuaded to think that way, but I still remain unconvinced. Again, I'm used to the historical symantics, so changing them requires a pretty good justification and a bit of brain rewiring, which I naturally resist. It's not just that, if you always have to cover your behind when doing tsleep you may wind up masking wakeup bugs. Places like "vfs_bio.c" line 586 of 3182: bp-b_xflags |= BX_BKGRDWAIT; tsleep(bp-b_xflags, PRIBIO, "biord", 0); if (bp-b_xflags BX_BKGRDINPROG) panic("bwrite: still writing"); } If replaced by a while() _may_ obscure a buffercache bug. Personally I'd like to be able to catch such bugs than let them go because the API (wakeups can happen at any time) prohibits this. -Alfred To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
It's not just that, if you always have to cover your behind when doing tsleep you may wind up masking wakeup bugs. Places like "vfs_bio.c" line 586 of 3182: bp-b_xflags |= BX_BKGRDWAIT; tsleep(bp-b_xflags, PRIBIO, "biord", 0); if (bp-b_xflags BX_BKGRDINPROG) panic("bwrite: still writing"); } If replaced by a while() _may_ obscure a buffercache bug. Personally I'd like to be able to catch such bugs than let them go because the API (wakeups can happen at any time) prohibits this. No in a fully threaded world. Nate To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
On Sun, 06 Aug 2000 01:49:49 +1000, Stephen McKay wrote: I think shutdown time has gotten uglier and slower than it needs to be. Probably because you already understand what's going on. The existing text for the "stopping process" messages is designed to help folks stay calm while their machines apparently lock up at shutdown time. I think they're very healthy and should stay exactly as they are. The second patch is of no importance to me. Ciao, Sheldon. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
* Stephen McKay [EMAIL PROTECTED] [000805 08:49] wrote: Patch 2 is smaller and possibly controversial. Normally bufdaemon and syncer are sleeping when they are told to suspend. This delays shutdown by a few boring seconds. With this patch, it is zippier. I expect people to complain about this shortcut, but every sleeping process should expect to be woken for no reason at all. Basic kernel premise. You better bet it's controversial, this isn't "Basic kernel premise" it's quite possible and acceptable for some piece of code to expect this sequence: me someone else put myself at the head of a queue waiting for some resource finish with resource and perform wakeup assume wakeup was notification of resource availability and use it without checking for a 'stray' wakeup. *boom* *crash* *ow* :) I've been running these patches on a 4.x machine for a while now. No problems except I am now surprised by the slow and ugly shutdown of unpatched machines. :-) If you want to speed it up "properly" then either set up some protocol or do something along the lines of what speedup_syncer() does: if (updateproc-p_wchan == lbolt) setrunnable(updateproc); It 'knows' that the syncer is safe to wakeup when sleeping on lbolt (only) and only wakes it up then, otherwise it may be in some other random part of vfsbio and blow up. but... Actually I think that speedup_syncer() is somewhat bogus and should also check some sort of variable that says "syncer is idle" rather than lbolt, because we never know when an lbolt sleep may happen in the codepath. (ugh) The syncer would need to set this variable before sleeping to declare itself "safe" for per-emptive wakeup. It's a bit of hysteria but it's definetly possible. hope this helps, -Alfred To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
* Stephen McKay [EMAIL PROTECTED] [000805 08:49] wrote: Patch 2 is smaller and possibly controversial. Normally bufdaemon and syncer are sleeping when they are told to suspend. This delays shutdown by a few boring seconds. With this patch, it is zippier. I expect people to complain about this shortcut, but every sleeping process should expect to be woken for no reason at all. Basic kernel premise. You better bet it's controversial, this isn't "Basic kernel premise" Actually, that depends. It is definitely poor programming practice to not check the condition for which you slept on wakeup. *boom* *crash* *ow* :) Doctor: So don't do that. In this case, the relevant processes just need to learn to check whether they've been woken in order to die. -- ... every activity meets with opposition, everyone who acts has his rivals and unfortunately opponents also. But not because people want to be opponents, rather because the tasks and relationships force people to take different points of view. [Dr. Fritz Todt] To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
* Mike Smith [EMAIL PROTECTED] [000807 01:25] wrote: * Stephen McKay [EMAIL PROTECTED] [000805 08:49] wrote: Patch 2 is smaller and possibly controversial. Normally bufdaemon and syncer are sleeping when they are told to suspend. This delays shutdown by a few boring seconds. With this patch, it is zippier. I expect people to complain about this shortcut, but every sleeping process should expect to be woken for no reason at all. Basic kernel premise. You better bet it's controversial, this isn't "Basic kernel premise" Actually, that depends. It is definitely poor programming practice to not check the condition for which you slept on wakeup. Stephen's patches didn't give them that option, the syncer could be in some other part of vfs that doesn't expect to be woken up, perhaps in uniterruptable sleep... perhaps waiting for a DMA transfer? How does one check if the data filled into a buffer is actually from the driver and not just stale? *boom* *crash* *ow* :) Doctor: So don't do that. In this case, the relevant processes just need to learn to check whether they've been woken in order to die. No, they need to signify that it's safe to wake them up early. -Alfred To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
* Mike Smith [EMAIL PROTECTED] [000807 01:25] wrote: * Stephen McKay [EMAIL PROTECTED] [000805 08:49] wrote: ... every sleeping process should expect to be woken for no reason at all. Basic kernel premise. You better bet it's controversial, this isn't "Basic kernel premise" Actually, that depends. It is definitely poor programming practice to not check the condition for which you slept on wakeup. Stephen's patches didn't give them that option, the syncer could be in some other part of vfs that doesn't expect to be woken up, perhaps in uniterruptable sleep... perhaps waiting for a DMA transfer? How does one check if the data filled into a buffer is actually from the driver and not just stale? The time honoured standard is: raise cpu priority while (we do not have exclusive use of some item) { set some sort of "I want this item" flag (optional) sleep on a variable related to the item } use the item/data we waited for lower cpu priority A typical example from vfs_subr.c: s = splbio(); while (vp-v_numoutput) { vp-v_flag |= VBWAIT; error = tsleep((caddr_t)vp-v_numoutput, slpflag | (PRIBIO + 1), "vinvlbuf", slptimeo); if (error) { splx(s); return (error); } } ... the code plays a little with vp here ... splx(s); A simpler example from swap_pager.c: s = splbio(); while ((bp-b_flags B_DONE) == 0) { tsleep(bp, PVM, "swwrt", 0); } ... code uses bp here ... splx(s); Both of these examples are safe from side effects due to waking up early. This is how all such code should be. To do otherwise is to introduce possible race conditions. At your prompting, though, I've looked at more code and have found an example that violates this principle. I assume it is a bug waiting to bite us. In the 4.1.0 source (sorry, that's all I have on operational computers at this moment) line 581 of vfs_bio.c sleeps without looping. It would seem that Alfred's assertion of lurking danger is correct. This stuff should be fixed. *boom* *crash* *ow* :) Doctor: So don't do that. In this case, the relevant processes just need to learn to check whether they've been woken in order to die. No, they need to signify that it's safe to wake them up early. When I return to the land of FreeBSD I'll offer a speedup that does not wake processes in arbitrary places (to avoid tickling lurking bugs). To do this I would make processes that want to use the suspension mechanism call a routine in kern_kthread.c for their just-loafing-about sleep. Then that module will have enough information to do the job quickly. And back to the simpler bit (the bike shed bit). Does everyone else actually *like* the verbose messages currently used? And the gratuitous extra newline in the "syncing..." message? Stephen. PS My main machine has blown its power supply. Contact with me will be patchy. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In message [EMAIL PROTECTED] Stephen McKay writes: : And back to the simpler bit (the bike shed bit). Does everyone else actually : *like* the verbose messages currently used? And the gratuitous extra newline : in the "syncing..." message? I like the newer messages in your patch, but I don't care enough to bikeshed this :-) Warner To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
: * Stephen McKay [EMAIL PROTECTED] [000805 08:49] wrote: : : Patch 2 is smaller and possibly controversial. Normally bufdaemon and : syncer are sleeping when they are told to suspend. This delays shutdown : by a few boring seconds. With this patch, it is zippier. I expect people : to complain about this shortcut, but every sleeping process should expect : to be woken for no reason at all. Basic kernel premise. : : You better bet it's controversial, this isn't "Basic kernel premise" : :Actually, that depends. It is definitely poor programming practice to :not check the condition for which you slept on wakeup. : : *boom* *crash* *ow* :) : :Doctor: So don't do that. I gotta agree. This is very bad programming practice. There are many, many places in the kernel where tsleep() is called with a 0 delay and assumed not to return until something meaningful happens. For example, for handling NFS retries, some of the locking code (I think), and I could probably find many others. In general 'quick hacks' only cause immense pain later on... sometimes years later on. It is NOT WORTH IT. In the case of buf_daemon, waking up the process is a definite no-no. You can wakeup lbolt, but you can't just go wakeup the process willy-nilly -- it could be sitting anywhere. What I would suggest is to add another argument to the shutdown event-handler registration, an optional wakeup address. If NULL, no wakeup is performed. Otherwise a wakeup on the specified address is performed. You can then pass lbolt to it when the syncer buf_daemon processes register. That is a simple, safe solution. -Matt To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In message [EMAIL PROTECTED], Matt Dillon writes: : * Stephen McKay [EMAIL PROTECTED] [000805 08:49] wrote: : : Patch 2 is smaller and possibly controversial. Normally bufdaemon and : syncer are sleeping when they are told to suspend. This delays shutdown : by a few boring seconds. With this patch, it is zippier. I expect people : to complain about this shortcut, but every sleeping process should expect : to be woken for no reason at all. Basic kernel premise. : : You better bet it's controversial, this isn't "Basic kernel premise" : :Actually, that depends. It is definitely poor programming practice to :not check the condition for which you slept on wakeup. : : *boom* *crash* *ow* :) : :Doctor: So don't do that. I gotta agree. This is very bad programming practice. There are many, many places in the kernel where tsleep() is called with a 0 delay and assumed not to return until something meaningful happens. For example, for handling NFS retries, some of the locking code (I think), and I could probably find many others. Then this code should be changed to do the right thing, which is to *always* check the condition being slept on before proceeding. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD coreteam member | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In message [EMAIL PROTECTED], Alfred Perlstein writes: Then this code should be changed to do the right thing, which is to *always* check the condition being slept on before proceeding. Can you give a reason why we'll have to now start coding defensively because our arguments to tsleep() are just "advisory" now? It is not something we "suddenly have to do" it's been The Right Way even since I first sharpened my teeth on unix kernels many years ago. Coding defensively btw, is in the same category :-) I can also imagine some fun infinite loops like so: monitor issues wakeup producer wakes and terminates or goes away consumer spins checking on availability This is wrong code. It should be: monitor issues wakeup producer wakes and terminates or goes away consumer spins checking on producer still present and on availability You'll find plenty examples of such code in the tty code. Also, one must now do this? timeo = currenttime + 2; while (timeo != currenttime) tsleep(timeo); ? If the exact duration of your timeout is important, you should always calibrate it against getmicrotime() or getmicrouptime() (depending on it being UTC locked or not). This has also always been the case. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 [EMAIL PROTECTED] | TCP/IP since RFC 956 FreeBSD coreteam member | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
Can you give a reason why we'll have to now start coding defensively because our arguments to tsleep() are just "advisory" now? It is not something we "suddenly have to do" it's been The Right Way even since I first sharpened my teeth on unix kernels many years ago. Uh, Poul, I think you're full of it. The previous behavior of tsleep where you can make assumptions about who wakes you and under what conditions is a long and well established idiom. We (the kernel developers of BSD) have always coded to the established kernel programming interface and most of us consider it bad form to check for conditions which can't happen because the kernel API doesn't allow it. For example, we don't check for a NULL return from malloc in the case of !NO_WAIT, because we knew that the code would never do that. There are many other examples of similar assumptions in the kernel. We write the code to be efficient and only check for bogus conditions that might happen. The only exception to this is programatic ASSERTs that do internal consistency checks, but the purpose of those is quite different. -DG David Greenman Co-founder, The FreeBSD Project - http://www.freebsd.org Manufacturer of high-performance Internet servers - http://www.terasolutions.com Pave the road of life with opportunities. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
Just a quick perusal of the kernel code shows a number of possible unexpected side effects from unexpected wakeups. I see several places where a 'WANTED' flag is set in a loop waiting for something and assumed to be cleared after the tsleep() returns. Some of these side effects are quite persistent. For example, if PG_WANTED is set in a page the VM system will activate the page rather then deactivate it. An unexpected wakeup in the VM system could lead to a page with PG_WANTED *LEFT* *SET* on the page! It might not be fatal, but it sure isn't the type of operation we want! There are a couple of places in the NFS code where a sleep wakeup results in a retry. There is a place where LC_EXPIREDWANTED is checked and causes a whole sequence of conditionals to be skipped. This is after 5 minutes of searching. I'm not going to spend N hours trying to find every case. Just finding two in 5 minutes is enough proof for me that it's just too dangerous to go and change expected tsleep/wakeup semantics without going through and auditing and documenting every piece of code that uses tsleep/wakeup. -Matt To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
David Greenman wrote: Can you give a reason why we'll have to now start coding defensively because our arguments to tsleep() are just "advisory" now? It is not something we "suddenly have to do" it's been The Right Way even since I first sharpened my teeth on unix kernels many years ago. Uh, Poul, I think you're full of it. The previous behavior of tsleep where you can make assumptions about who wakes you and under what conditions is a long and well established idiom. We (the kernel developers of BSD) have always coded to the established kernel programming interface and most of us consider it bad form to check for conditions which can't happen because the kernel API doesn't allow it. For example, we don't check for a NULL return from malloc in the case of !NO_WAIT, because we knew that the code would never do that. There are many other examples of similar assumptions in the kernel. We write the code to be efficient and only check for bogus conditions that might happen. The only exception to this is programatic ASSERTs that do internal consistency checks, but the purpose of those is quite different. In the particular case of sleeping though, a woken process does need to check the condition that it slept on because one of the other processes sleeping on that resource may have had a chance to run first and changed some state. So as a general rule, you shouldn't assume that everything is fine when you return from being asleep because it might not be. I agree with your sentiment about defensive coding in the kernel though. Defensive coding should only be used on the boundary of your interface, i.e. if you're getting non-deterministic data from somewhere, most commonly user behaviour. Everything inside the kernel implementation should be deterministic and therefore there shouldn't be any need for defensive programming. Any cause of errors that defensive programming catches in the kernel can only be caused by bugs and therefore falls into the category of diagnostics. There's actually a lot of defensive programming creeping into the kernel which should really be wrapped in an #ifdef INVARIANTS. Paul. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In the particular case of sleeping though, a woken process does need to check the condition that it slept on because one of the other processes sleeping on that resource may have had a chance to run first and changed some state. So as a general rule, you shouldn't assume that everything is fine when you return from being asleep because it might not be. No, that's not true, and there are many examples in the kernel where a bogus wakeup would lead to bad things happening. I recall some code in the advisory locking code, and VM system, that assume that there is only one wakeup event and that the thing causing it assures that certain other things have occured before issuing it. That's just the way it has worked since the dawn of time. -DG David Greenman Co-founder, The FreeBSD Project - http://www.freebsd.org Manufacturer of high-performance Internet servers - http://www.terasolutions.com Pave the road of life with opportunities. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
David Greenman wrote: In the particular case of sleeping though, a woken process does need to check the condition that it slept on because one of the other processes sleeping on that resource may have had a chance to run first and changed some state. So as a general rule, you shouldn't assume that everything is fine when you return from being asleep because it might not be. No, that's not true, and there are many examples in the kernel where a bogus wakeup would lead to bad things happening. I recall some code in the advisory locking code, and VM system, that assume that there is only one wakeup event and that the thing causing it assures that certain other things have occured before issuing it. That's just the way it has worked since the dawn of time. I did say "as a general rule". If you know that "by design" nothing else is going to mess with what you're sleeping on before you wake up then you can make tighter optimisations but that's not the general case. There is such a thing as over optimisation though and for the sake of a simple if statement it is probably better to write code that is robust to changes made elsewhere in the system rather than squeeze every inch of performance out of the code, unless there's a real need to optimize in that particular area. Paul. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
I did say "as a general rule". If you know that "by design" nothing else is going to mess with what you're sleeping on before you wake up then you can make tighter optimisations but that's not the general case. There is such a thing as over optimisation though and for the sake of a simple if statement it is probably better to write code that is robust to changes made elsewhere in the system rather than squeeze every inch of performance out of the code, unless there's a real need to optimize in that particular area. In some cases it isn't practical or very expensive to verify that the condition that caused the sleep in the first place has been satisfied - that's often why certain parts of the kernel rely on the established tsleep symantics. -DG David Greenman Co-founder, The FreeBSD Project - http://www.freebsd.org Manufacturer of high-performance Internet servers - http://www.terasolutions.com Pave the road of life with opportunities. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In article [EMAIL PROTECTED], Alfred Perlstein [EMAIL PROTECTED] wrote: * Poul-Henning Kamp [EMAIL PROTECTED] [000807 10:03] wrote: Then this code should be changed to do the right thing, which is to *always* check the condition being slept on before proceeding. Can you give a reason why we'll have to now start coding defensively because our arguments to tsleep() are just "advisory" now? I'm not really sure why for a single reader/writer situation we have to have hysterics for a stray wakeup, it bloats code and is not needed in all places. It is just basic good programming practice. In his classic paper, "An Introduction to Programming with Threads" [1] Andrew Birrell argues for the explicit test in his discussion of condition variables, which are very similar to the kernel's tsleep/wakeup constructs. After giving a couple of purely technical reasons, he goes on to say: But the main reason for advocating use of this pattern is to make your program more obviously, and more robustly, correct. With this style it is immediately clear that the "expression" is true before the following statements are executed. Without it, this fact could be verified only by looking at all the places that might signal the condition variable. In other words, this programming convention allows you to verify correctness by local inspection, which is always preferable to global inspection. I will add that this is the pattern that Kirk teaches in his kernel internals class. [1] http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-035.html John -- John Polstra [EMAIL PROTECTED] John D. Polstra Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
I will add that this is the pattern that Kirk teaches in his kernel internals class. If that's true, then he should practice what he preaches. Some of the code that I'm refering to (e.g. lockf) was apparantly written by him. I'll say again, however, that some of the cases that rely on the historical symantics would become very expensive if they had to go through a series of complex checks (perhaps list traversals, etc), in order to verify that the wakeup wasn't bogus. I personally don't think this is an improvement. -DG David Greenman Co-founder, The FreeBSD Project - http://www.freebsd.org Manufacturer of high-performance Internet servers - http://www.terasolutions.com Pave the road of life with opportunities. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: Ugly, slow shutdown
In article [EMAIL PROTECTED], David Greenman [EMAIL PROTECTED] wrote: I will add that this is the pattern that Kirk teaches in his kernel internals class. If that's true, Do you want me to fax you a copy of page 15 of his class notes from the course he gave at last year's FreeBSDCon, or will you just take my word for it? then he should practice what he preaches. Some of the code that I'm refering to (e.g. lockf) was apparantly written by him. Whether Kirk practices what he preaches is irrelevant to this discussion. Instead of focusing on a 1-sentence "I will add ..." from my posting, why not respond to the main thrust of it -- the paragraph I quoted from the Birrell paper? I'll say again, however, that some of the cases that rely on the historical symantics would become very expensive if they had to go through a series of complex checks (perhaps list traversals, etc), in order to verify that the wakeup wasn't bogus. I personally don't think this is an improvement. Some of them might be expensive, but most of them would not. Obviously the waker-upper knows that the condition is true. Otherwise the existing code which doesn't check wouldn't work. In the expensive cases the waker-upper could simply set a flag for the sleeper to check. Note, I am not expressing an opinion about whether the sleeps should be terminated prematurely during shutdown. But I am expressing a strong opinion about whether sleepers should do a reality check before proceeding. John -- John Polstra [EMAIL PROTECTED] John D. Polstra Co., Inc.Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message