Christopher Smith wrote:
No, but you'd probably want that CGI script's process to die and core dump, while the web server goes about its business.

Right. Or, I could use a safe language in a sandbox, and not dump core at all, but rather be able to provide a nice message back to the customer saying what happened, and so on.

Either way works. You're either writing the code in your own process space, or you're writing it in the kernel's process space. I don't see a compellingly good reason to pick one over the other globally.

This is exactly what I described as happening with our systems. Why does everyone think dumping core is the end of the world?

I don't. I just prefer to use a safe language where I'm in charge of dumping core, *and* where when something happens that shouldn't, I know I *do* get an error.

Okay, if the OS can't clean up file handles, sockets, and memory when your process dies, I very much doubt you're going to have much luck doing so in some catch block. Similarly for your database with database transactions. Seriously, process death is about as safe a way as you can find to clean up from an undefined error.

I'm not sure I agree. Of course, if that's the goal, then certainly after you do application-specific cleanup, you can exit your process and do all the stuff you want too.

But if you don't have a way to catch errors in the first place (such as array bounds checking, bounds checking on integers, and so on), it doesn't help nearly as much to know that if you leave the defined subset of the language so far that the kernel catches you, you find out about it.

Yes, because your interpreter/compiler is inherently having to trust all kinds of other code beyond the OS to be implemented correctly.

Like what?  Mine doesn't.

Most of that code doesn't have a good way to work from the "I don't trust any of this stuff" perspective, nor has it been tested in that regard nearly as much.

Um, I'd have to disagree, I think.

If you're relying on a core dump to detect an error, my process will dump core in exactly the same way as yours does if there's an error in stdio or something.

Sure, but there are far more subtle errors that are pretty much only going to show up in the form of your code finding itself in an unexpected state.

Certainly. I've had bad sectors in swap space. Yes? So? So the program dumps core, and the monitor on the other machine stops getting handshakes, and it dials out to see what happened, and then notifies me that the process not only encountered an error, but the error reporting code failed too.

There are some errors you don't bother trying to recover from automatically. (Altho that last one I actually wrote a script to recover from.)

I couldn't agree more. Unexpected errors are very dangerous to try to recover from, particularly because by their very nature you can't be sure whether you really have a recoverable problem or not. Even exiting a process is no guarantee, but it is probably the best you can do.

Yeah. But the error of "division by zero actually means I overwrote my socket descriptors" isn't high on the list of likely suspects.

When you work with a safe language that's been around for decades, you're no more likely to encounter an error that drops you into a completely unknown state than you are to encounter an error in a C compiler that generates code that drops you into a completely unknown state. And you deal with them both the same way.

Why do you need a range-checking language?

Because running outside an array is a huge cause of errors, and if I can have my language or compiler check that instead of doing it by hand, it's both more efficient and safer.
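
To make that concrete, here is a minimal sketch in Python (my pick for illustration; nothing above depends on it, and read_slot and balances are made-up names). The runtime does the range check, so running off the end of the list is a defined error I can report instead of a silent read of whatever sits next in memory. One caveat: Python treats negative indexes as counting from the end, so only indexes outside -len..len-1 are rejected.

```python
def read_slot(buffer, index):
    """Return buffer[index], or a readable message if index is out of range."""
    try:
        return buffer[index]
    except IndexError as exc:
        # The runtime performed the range check; no hand-written
        # comparison against len(buffer) is needed here.
        return f"rejected index {index}: {exc}"


balances = [100, 250, 75]     # hypothetical data
print(read_slot(balances, 1))  # in range
print(read_slot(balances, 7))  # out of range: reported, not undefined
```

The equivalent unchecked read in C would compile and might well run without any complaint at all.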

How about a range checking range?

I'm not sure what that means, but I'd certainly appreciate it if my range checked my ranges and turned down the heat instead of boiling over.

Is there some magic that comes from having a range-checking language that calls down to code in a non-range checking language vs. uses a range-checked iterator that calls down to code written in a non-range checked language?

Is there some magic in using an OO language instead of structures-with-function-pointers? Is there some magic in using Lint? Is there some magic in implementing "automatic pointers"?

Answer: Yes, in the same way there's magic in using device drivers instead of having every application code pokes to the hardware itself.

Because I've been burned by people doing exactly this kind of well-intentioned coding far too often. Their stack has been corrupted, and they don't realize it, so instead of simply failing, they try to "recover" from the problem, only their stack is corrupted, so their "recovery" that is supposed to just clean things up ends up setting someone's account balance to zero, or causes the system that provably can't deadlock to deadlock, etc.

Yes. That's why it really only works well for safe languages, which is the premise I at least was starting with. If you have an unexpected error in an unsafe language, you have no idea what state is screwed up. If you have an unexpected error in a safe language, you can, with high confidence, continue to use the infrastructure of the language, because as far as the *language* is concerned, it's not an error.
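
A rough sketch of what I mean, again in Python as an assumed stand-in for "a safe language" (Account, handle_request, serve and error_log are all hypothetical names). The bug is a missing account object. The language defines what happens when you touch it, so the top-level handler can still log and format a reply using the ordinary machinery.

```python
import traceback

error_log = []  # stand-in for a real logging destination


class Account:
    def __init__(self, balance):
        self.balance = balance


def handle_request(account, amount):
    # If account is None this raises AttributeError: a defined,
    # catchable event, not a jump into arbitrary memory.
    account.balance -= amount
    return f"new balance: {account.balance}"


def serve(account, amount):
    """Top-level wrapper: turn an unexpected error into a polite reply."""
    try:
        return handle_request(account, amount)
    except Exception:
        # The runtime's own state is still intact, so recording the
        # traceback and building a response are safe to do here.
        error_log.append(traceback.format_exc())
        return "Sorry, something went wrong. The error has been logged."


print(serve(Account(100), 30))
print(serve(None, 30))
```

Exiting the process after the handler has run is still an option; the difference is that the customer gets a message first.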

Is invoking a method on a null pointer in Java as bad as invoking a method on a null pointer in C++? No. Why? Because it's not an error to do that in Java.

Of course, Java actually checks that, too, which means that in similar circumstances, C++ might *not* dump core, you might not *notice* the unexpected error, and you might just erase all those database transactions you were worried about, because you've just launched off into some random piece of code that doesn't violate the standards the kernel tries to enforce. You will recognise this as "Pwned" when done intentionally.

I didn't say that good error handling can't be done, merely that if your expectation is that you are recovering from a logical error, I'd sure like the recovery code to be the product of some other development process.

I'm not sure why recovery code at the top level of the stack needs to be held to a higher standard than recovery code lower down.

Once you have transactional rollback in your database, it too covers all kinds of errors, including your application dumping core.

Yes, because the database has "client connection died" as one of its *expected* errors.
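
For illustration only, here is a small sketch of that behaviour using Python's sqlite3 module. SQLite is an assumption on my part (no particular database is named in this thread), and closing the connection without committing merely stands in for a client that died mid-transaction. The table and function names are made up.

```python
import os
import sqlite3
import tempfile


def rows_after_abandoned_insert():
    """Insert without committing, drop the connection, then count rows."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "accounts.db")

        # Create the table and commit so the schema itself persists.
        setup = sqlite3.connect(path)
        setup.execute("CREATE TABLE ledger (amount INTEGER)")
        setup.commit()
        setup.close()

        # The "client": starts a transaction and goes away without commit.
        client = sqlite3.connect(path)
        client.execute("INSERT INTO ledger VALUES (500)")
        client.close()

        # A fresh connection sees only committed work.
        check = sqlite3.connect(path)
        (count,) = check.execute("SELECT COUNT(*) FROM ledger").fetchone()
        check.close()
        return count


print(rows_after_abandoned_insert())
```

The uncommitted insert is not visible to the later connection, which is the property being relied on when the application dumps core partway through its work.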

Yep. And catching an error at the top level is one of my "expected" errors.

--
  Darren New / San Diego, CA, USA (PST)
    His kernel fu is strong.
    He studied at the Shao Linux Temple.

--
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
