Christopher Smith wrote:
No, but you'd probably want that CGI script's process to die and core
dump, while the web server goes about its business.
Right. Or, I could use a safe language in a sandbox, and not dump core
at all, but rather be able to provide a nice message back to the
customer saying what happened, and so on.
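To make that concrete, here is a minimal sketch of the "nice message back to the customer" idea, assuming Python as a stand-in for any safe language. The names `handle_request` and `serve` are made up for illustration and don't come from any real CGI framework.

```python
import traceback

def handle_request(items, index):
    # Hypothetical request handler: a bad index here is a bug,
    # not an input we planned for.
    return f"You asked for: {items[index]}"

def serve(items, index, log):
    """Run one request; turn any unexpected error into a polite reply."""
    try:
        return 200, handle_request(items, index)
    except Exception:
        # The runtime raised an exception instead of scribbling on memory,
        # so the logging and reply code below can still be trusted.
        log.append(traceback.format_exc())
        return 500, "Sorry, something went wrong on our end. It has been logged."

log = []
print(serve(["apple", "pear"], 1, log))   # normal request
print(serve(["apple", "pear"], 7, log))   # out-of-range index, caught at the top
```

The second call fails with an `IndexError` inside the handler, the traceback lands in `log`, and the caller still gets a well-formed reply.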
Either way works. You're either writing the code in your own process
space, or you're writing it in the kernel's process space. I don't see a
compellingly good reason to pick one over the other globally.
This is exactly what I described as happening with our systems. Why does
everyone think dumping core is the end of the world?
I don't. I just prefer to use a safe language where I'm in charge of
dumping core, *and* where when something happens that shouldn't, I know
I *do* get an error.
Okay, if the OS can't clean up file handles, sockets, and memory when
your process dies, I very much doubt you're going to have much luck
doing so in some catch block. Similarly for your database with database
transactions. Seriously, process death is about as safe a way as you can
find to clean up from an undefined error.
I'm not sure I agree. Of course, if that's the goal, then certainly
after you do application-specific cleanup, you can exit your process and
do all the stuff you want too.
But if you don't have a way to catch errors in the first place (such as
array bounds checking, bounds checking on integers, and so on), it
doesn't help nearly as much to know that if you leave the defined subset
of the language so far that the kernel catches you, you find out about it.
Yes, because your interpreter/compiler is inherently having to trust all
kinds of other code beyond the OS to be implemented correctly.
Like what? Mine doesn't.
Most of
that code doesn't have a good way to work from the "I don't trust any of
this stuff" perspective, nor has it been tested in that regard nearly as
much.
Um, I'd have to disagree, I think.
If you're relying on a core dump to detect an error, my process will
dump core in exactly the same way as yours does if there's an error in
stdio or something.
Sure, but there are far more subtle errors that are pretty much only
going to show up in the form of your code finding itself in an
unexpected state.
Certainly. I've had bad sectors in swap space. Yes? So? So the program
dumps core, and the monitor on the other machine stops getting
handshakes, and it dials out to see what happened, and then notifies me
that the process not only encountered an error, but the error reporting
code failed too.
There are some errors you don't bother trying to recover from
automatically. (Although that last one I actually wrote a script to
recover from.)
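The handshake monitor described above could be sketched roughly like this. It's an illustration only: the class name, the timeout value, and passing the clock in as a plain number are all assumptions of mine, and the dial-out and notification steps are left out.

```python
class HandshakeMonitor:
    """Declares the watched process dead after `timeout` seconds of silence."""

    def __init__(self, timeout, now):
        self.timeout = timeout
        self.last_seen = now

    def handshake(self, now):
        # Called each time the watched process checks in.
        self.last_seen = now

    def is_dead(self, now):
        # A core dump needs no special reporting: the handshakes just stop.
        return now - self.last_seen > self.timeout

monitor = HandshakeMonitor(timeout=30, now=0)
monitor.handshake(now=10)
print(monitor.is_dead(now=35))   # 25 seconds of silence: still alive
print(monitor.is_dead(now=45))   # 35 seconds of silence: raise the alarm
```

The point is that the monitor never relies on the failing process to report its own failure, which is why it still works when the error-reporting code is broken too.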
I couldn't agree more. Unexpected errors are very dangerous to try to
recover from, particularly because by their very nature you can't be
sure whether you really have a recoverable problem or not. Even exiting
a process is no guarantee, but it is probably the best you can do.
Yeah. But the error of "division by zero actually means I overwrote my
socket descriptors" isn't high on the list of likely suspects.
When you work with a safe language that's been around for decades,
you're no more likely to encounter an error that drops you into a
completely unknown state than you are to encounter an error in a C
compiler that generates code that drops you into a completely unknown
state. And you deal with them both the same way.
Why do you need a range-checking language?
Because running outside an array is a huge cause of errors, and if I can
have my language or compiler check that instead of doing it by hand,
it's both more efficient and safer.
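As a small illustration of letting the language do the check, assuming Python as the range-checking language (the function name is hypothetical):

```python
def sum_window(values, start, count):
    # No hand-written range check: the language raises IndexError on the
    # first access past the end instead of reading whatever sits there.
    # Python-specific caveat: negative indices count from the end of the
    # list, so only overruns past the end are caught this way.
    total = 0
    for i in range(start, start + count):
        total += values[i]
    return total

readings = [1, 2, 3]
print(sum_window(readings, 1, 2))

try:
    sum_window(readings, 2, 5)
except IndexError as err:
    print("out of range:", err)
```

In C the second call would quietly add up whatever follows the array in memory. Here the overrun is reported at the point where it happens.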
How about a range checking range?
I'm not sure what that means, but I'd certainly appreciate if my range
checked my ranges and turned down the heat instead of boiling over.
Is there some magic that comes from having a range-checking
language that calls down to code in a non-range checking language vs.
uses a range-checked iterator that calls down to code written in a
non-range checked language?
Is there some magic in using an OO language instead of
structures-with-function-pointers? Is there some magic in using Lint? Is
there some magic in implementing "automatic pointers"?
Answer: Yes, in the same way there's magic in using device drivers
instead of having every application code pokes to the hardware itself.
Because I've been burned by people doing exactly this kind of well
intentioned coding far too often. Their stack has been corrupted, and
they don't realize it, so instead of simply failing, they try to
"recover" from the problem, only their stack is corrupted, so their
"recovery" that is supposed to just clean things up ends up setting
someone's account balance to zero, or causes the system that provably
can't deadlock to deadlock, etc.
Yes. That's why it really only works well for safe languages, which is
the premise I at least was starting with. If you have an unexpected
error in an unsafe language, you have no idea what state is screwed up.
If you have an unexpected error in a safe language, you can, with high
confidence, continue to use the infrastructure of the language, because
as far as the *language* is concerned, it's not an error.
Is invoking a method on a null pointer in Java as bad as invoking a
method on a null pointer in C++? No. Why? Because it's not an error to
do that in Java.
Of course, Java actually checks that, too, which means that in similar
circumstances, C++ might *not* dump core, you might not *notice* the
unexpected error, and you might just erase all those database
transactions you were worried about, because you've just launched off
into some random piece of code that doesn't violate the standards the
kernel tries to enforce. You will recognise this as "Pwned" when done
intentionally.
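A rough Python analogue of the Java case, where a method call on a null reference raises NullPointerException: calling a method on `None` is defined behavior that raises an exception. The `Account` class and `find_account` helper are invented for the example.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount

def find_account(accounts, name):
    # Returns None when the account is missing.
    return accounts.get(name)

accounts = {"alice": Account(100)}

try:
    # Calling a method on None is defined: it raises AttributeError.
    find_account(accounts, "bob").withdraw(10)
    outcome = "withdrew"
except AttributeError as err:
    outcome = f"caught: {err}"

print(outcome)
print(accounts["alice"].balance)   # untouched by the failed call
```

The failed call never reaches `withdraw`, so no other account is modified. That is the difference from jumping through a bad pointer into some random piece of code.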
I didn't say that good error handling can't be done, merely that if your
expectation is that you are recovering from a logical error, I'd sure
like the recovery code to be the product of some other development process.
I'm not sure why recovery code at the top level of the stack needs to be
held to a higher standard than recovery code lower down.
Once you have transactional rollback in your database, it too covers
all kinds of errors, including your application dumping core.
Yes, because the database has "client connection died" as one of its
*expected* errors.
Yep. And catching an error at the top level is one of my "expected" errors.
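The transactional-rollback point is easy to see with Python's bundled `sqlite3` module. This is only a sketch: closing the connection without committing stands in for the client process dying, and the table and file names are made up.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "accounts.db")

# Set up some committed state.
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()

# Start a change, then go away before committing.
conn.execute("UPDATE accounts SET balance = 0 WHERE name = 'alice'")
conn.close()   # stands in for the client dying mid-transaction

# A fresh connection sees only the committed state.
conn = sqlite3.connect(path)
balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'"
).fetchone()[0]
conn.close()
print(balance)
```

The uncommitted update is discarded, so the balance read back is the committed one. The database treats "the client went away" as an ordinary, expected event.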
--
Darren New / San Diego, CA, USA (PST)
His kernel fu is strong.
He studied at the Shao Linux Temple.