-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Darren New wrote:
> Christopher Smith wrote:
>> Why not, since you just catch your errors and continue about your
>> business without exiting?
> 
> Wow. OK. We've hit strawman overflow here. Since you don't seem to be
> actually reading my answers, I'm going to stop writing them.

That wasn't a strawman. If you catch an error, log it, and recover
without exiting, all is well as long as you are right that memory hasn't
been corrupted. However, if memory is corrupted, then you may continue
processing your next request (which may very well not be tied to the
same user) and unwittingly share some bits of memory from a previous
request. This is one of the key reasons we do exit when we think there's
a chance we have hit an unexpected error. Even if the odds of it
happening are 1 in 100, those are lousy odds when private customer data
is at stake.
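
To make that policy concrete, here is a rough sketch. The names
(AnticipatedError, serve_one, die_if_fatal) are made up for illustration
and are not from any real code base; the point is only that anticipated
errors are recovered from, while anything else stops the process before
the next request is touched.

```cpp
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical error type for conditions we have anticipated and know
// how to recover from (bad input, timeouts, and so on).
struct AnticipatedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

enum class Outcome { Ok, Recovered, Fatal };

// Runs one request. Anticipated errors are logged and the server moves
// on; anything else is reported as Fatal so the caller can stop serving.
template <typename Handler>
Outcome serve_one(Handler handler, const std::string& request) {
    try {
        handler(request);
        return Outcome::Ok;
    } catch (const AnticipatedError& e) {
        std::cerr << "recovered: " << e.what() << '\n';
        return Outcome::Recovered;
    } catch (...) {
        // Unknown failure: process state can no longer be trusted.
        return Outcome::Fatal;
    }
}

// Production policy: dump core rather than touch another user's request.
inline void die_if_fatal(Outcome o) {
    if (o == Outcome::Fatal) std::abort();
}
```

In real code you may prefer to let the unknown exception escape
uncaught, since on many implementations that leaves the throwing stack
intact in the core file.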

>> I guess it depends on your development process. We try to cover
>> anticipated error conditions, and when an unanticipated one shows up, we
>> normally get a core file, learn about the problem, and either eliminate
>> it or add it to the list of anticipated error conditions. This tends to
>> leave us with hardware problems being the most likely cause fairly early
>> on in the software's lifecycle.
> 
> I do the same thing, except I write my own stack trace to my own logging
> mechanism.

Then how did you reach the conclusion that the odds of there being a
hardware problem were so unlikely?

> I think it's way, way harder to have a C program with no buffer
> overflows than a similarly functional Java program with no buffer
> overflows. Wouldn't you agree?

I'd agree. With C++, it is about as easy as with Java (one of the
oddities of C++ is that while it is often easy to do the wrong thing, it
is also often easy to do the right thing). With C++ it is a heck of a
lot easier than Java to avoid resource leaks for anything besides memory
or synchronization (and when you compare experiences with valgrind versus
various Java heap profilers... I'm not so sure about the memory thing
either).
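
A minimal sketch of the non-memory resource case, using a file handle.
The File class and write_line are hypothetical names for illustration;
the technique is plain RAII, where the destructor releases the resource
on every exit path.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Owns a C FILE*; the destructor closes it on every exit path,
// including when an exception propagates through the caller.
class File {
public:
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (!handle_) throw std::runtime_error("cannot open " + path);
    }
    ~File() { std::fclose(handle_); }

    File(const File&) = delete;             // exactly one owner per handle
    File& operator=(const File&) = delete;

    std::FILE* get() const { return handle_; }

private:
    std::FILE* handle_;
};

// Writes one line; no explicit close or finally block is needed.
void write_line(const std::string& path, const std::string& text) {
    File f(path, "w");
    if (std::fputs(text.c_str(), f.get()) == EOF)
        throw std::runtime_error("write failed");
    std::fputc('\n', f.get());
}
```

The Java equivalent needs a try/finally at every call site; here the
cleanup is written once, in the destructor.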

With C++ it is a heck of a lot easier than Java to do things like
compile-time verification of invariants, which again takes a whole
class of errors off the table. It is not at all clear to me from my
experience that one inherently leads to less error-prone code than the
other once one has mastered the language (no question that C++ is more
error-prone for beginners, particularly if those beginners already know C).
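
As an illustration of a compile-time invariant, here is a sketch that
assumes C++11's static_assert (BOOST_STATIC_ASSERT does the same job on
older compilers). The Ring class is a made-up example, not code from
any project.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical fixed-capacity ring buffer. The index mask trick below is
// only valid when Capacity is a power of two, so that is checked at
// compile time instead of being left to a runtime test.
template <typename T, std::size_t Capacity>
class Ring {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
    static_assert(std::is_trivially_copyable<T>::value,
                  "Ring only stores trivially copyable types");

public:
    // Overwrites the oldest slot once the buffer has wrapped around.
    void push(const T& v) { data_[head_++ & (Capacity - 1)] = v; }

    // Most recently pushed value (a value-initialized T if nothing was pushed).
    const T& newest() const { return data_[(head_ - 1) & (Capacity - 1)]; }

private:
    T data_[Capacity] = {};
    std::size_t head_ = 0;
};

// Ring<int, 6> bad;   // would not compile: 6 is not a power of two
```

A misuse such as Ring<int, 6> never makes it into a binary, so there is
no runtime error path to anticipate for it.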

> No question there either. I'd be experimenting with Haskell right now if
> it supported the kinds of things I need to do, like unicode, web access,
> and so on.

Sounds like you ought to start experimenting with Haskell then. Unicode
and web access are there, as is, I'm guessing, whatever else you might
require.

>>>>> But if you don't have a way to catch errors in the first place
>>>>> (such as
>>>>> array bounds checking, bounds checking on integers, and so on)
>>>> I assure you that C++ is capable of doing such things.
>>> Sure, but you have to get it right Every Single Time.
>>
>> Actually, every language needs to get it right every single time.
> 
> Yep. But not every program needs to get it right every single time. How
> many applications are written, compared to how many STLs or GCCs?

Yes, this is why template libraries are so nice.

>> Perhaps we aren't in disagreement then. I had thought your point was
>> that if you are using C or C++ that you "don't have a way to catch
>> errors in the first place (such as array bounds checking, bounds
>> checking on integers, and so on)".
> 
> Right. You don't. All you can do is reduce the number of places that you
> do those things, and wrap stuff up in layers of checking code, and hope
> you got all those places. By the time you can guarantee that you got all
> those places (including in other people's code that you might not have
> the ability to wrap up), you've written a safe language in C++, and
> you're good to go.

Most of the dangerous stuff is already wrapped for you, and besides, as
all the good metaprogramming books teach you, writing a program *is*
about defining a language.
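
For instance, a sketch of the two checks mentioned above. vector::at()
is the standard library's range-checked accessor; get_or and checked_add
are hypothetical helpers of the write-it-once kind.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Returns the element at index i, or fallback when i is out of range.
// vector::at() does the range check; operator[] would not.
int get_or(const std::vector<int>& v, std::size_t i, int fallback) {
    try {
        return v.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Adds two ints, throwing instead of overflowing. Signed overflow is
// undefined behavior, so the check happens before the addition.
int checked_add(int a, int b) {
    if ((b > 0 && a > std::numeric_limits<int>::max() - b) ||
        (b < 0 && a < std::numeric_limits<int>::min() - b))
        throw std::overflow_error("int overflow");
    return a + b;
}
```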

>> What if an unsafe library corrupts memory without generating a segv?
> 
> Then I'm in exactly the same situation you are with your entire program.
> Except my unsafe libraries are small, and yours are the size of your
> application, except to the extent that you implemented a safe
> subset/library and stick to it.

No, the difference is my programs conservatively assume this is what may
be the cause of an unexpected error and so they exit, while yours seem
to optimistically assume that it'll all work out if you just unwind the
stack enough.

>>> You would be amazed. When's the last time you checked your swap space
>>> for errors?  What if there's a read error one out of say 1000 times?
>>
>> Oh, I know it happens. I'm just saying that it is not exactly subtle.
> 
> In my experience it can be, yes. I mean, if the graphics artist in the
> next cubicle complained Gimp dumped core once a week, would you say to
> yourself "Gee, maybe his swap partition has a weak sector", or would you
> say "Gee, there must be a bug in the Gimp"?

Yeah, we're working with different definitions of subtle. I'm talking about the
kind of subtle where the OS doesn't automatically trigger a core dump.

> The fact that Java arrays are bizarre isn't due to the fact that they're
> range checked, tho. It's due to the fact that Java hates data.

No, if that were the case all data structures in Java would be equally
broken. Arrays get special treatment as being particularly broken. ;-)

>> Fine, you got me. It's manual coding. I manually write it once, work on
>> getting it right once, and then never worry about writing it again.
>> It is also a hell of a lot less code and a hell of a lot less error
>> prone code than I'd write in a number of "safe" languages.
> 
> But the ugliness of those languages isn't due to them being safe. It's
> due to them being crappy languages.

My point is that the "safeness" of a language is actually pretty
meaningless from a practical standpoint once you've encountered an
unknown error.

>> Oh that's impossible. That's why you never find a garbage collecting
>> runtime implemented in C++... oh wait.
> 
> It wasn't an accusation. It was a question. I've asked it many times of
> many C++ experts, and every time I ask it, eventually the person who is
> telling me about how easy it is admits it essentially doesn't work
> without out-of-library (i.e., in the compiler) runtime support. Tell me
> of a runtime GC in C++ that actually isn't just scanning thru memory and
> keeping any page that any integer in some other page might be a pointer
> for. I'd like to read how it's done without using "undefined" behavior
> like indexing a pointer off an array to see what's stored in some other
> instance's memory.

Well, the easiest way is to have reference-counting handles into the
graph as a whole and/or make your inter-node links weak links.
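
A minimal sketch of that arrangement, assuming C++11's std::shared_ptr
and std::weak_ptr (boost::shared_ptr and boost::weak_ptr have the same
shape). Graph and Node are made-up types for illustration.

```cpp
#include <memory>
#include <vector>

struct Node {
    int value;
    std::vector<std::weak_ptr<Node>> links;  // non-owning, so cycles are fine
    explicit Node(int v) : value(v) {}
};

// The graph owns every node. Shared handles to the Graph keep all of it
// alive, and the nodes go away together when the last handle is dropped.
struct Graph {
    std::vector<std::shared_ptr<Node>> nodes;

    // Hands back a weak reference so callers cannot extend a node's life.
    std::weak_ptr<Node> add(int v) {
        nodes.push_back(std::make_shared<Node>(v));
        return nodes.back();
    }
};
```

Because only the Graph holds owning pointers, a cycle between nodes
never keeps anything alive on its own.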

Alternatively, you can use techniques much like those used to implement
VM's for other languages: a custom heap allocator that tags memory as it
allocates it. Even better is to manage your own stacks (you can do
things similar to how libunwind does it). You can override operator new
and delete to tag all your heap allocations, and then combine that with
a base class that tracks all your allocations and destructor calls so
that you can maintain a proper root-set for all your stacks. Then you
can do the usual mark-and-sweep algorithms in a precise way. I've seen
some really hairy solutions that used a combination of templates and
macros to divine the structure of classes at compile time, but I tend to
find ways to break that kind of stuff, so I try to avoid it. ;-)
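
A heavily simplified sketch of the base-class approach. It is not a
production collector: the Heap::make factory stands in for a tagging
operator new, roots are registered by hand instead of being discovered
from managed stacks, and all names (Object, Heap, Cell) are invented for
illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

class Heap;

// Base class for collectable objects. Each subclass reports its outgoing
// pointers in trace(), which is what makes the collector precise.
class Object {
public:
    virtual ~Object() = default;
    virtual void trace(std::vector<Object*>& out) const = 0;
private:
    friend class Heap;
    bool marked_ = false;
};

class Heap {
public:
    ~Heap() { for (Object* o : objects_) delete o; }

    // Allocates a T and records it so the sweep phase can find it later.
    template <typename T, typename... Args>
    T* make(Args&&... args) {
        T* p = new T(std::forward<Args>(args)...);
        objects_.push_back(p);
        return p;
    }

    void add_root(Object* o) { roots_.push_back(o); }
    void remove_root(Object* o) {
        roots_.erase(std::remove(roots_.begin(), roots_.end(), o),
                     roots_.end());
    }

    // Marks everything reachable from the roots, frees the rest, and
    // returns the number of objects freed.
    std::size_t collect() {
        std::vector<Object*> work(roots_);
        while (!work.empty()) {
            Object* o = work.back();
            work.pop_back();
            if (!o || o->marked_) continue;
            o->marked_ = true;
            o->trace(work);
        }
        std::size_t freed = 0;
        std::vector<Object*> live;
        for (Object* o : objects_) {
            if (o->marked_) { o->marked_ = false; live.push_back(o); }
            else { delete o; ++freed; }
        }
        objects_.swap(live);
        return freed;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::vector<Object*> objects_;
    std::vector<Object*> roots_;
};

// Example node type with one outgoing edge.
struct Cell : Object {
    Cell* next = nullptr;
    void trace(std::vector<Object*>& out) const override {
        out.push_back(next);
    }
};
```

Since every edge is reported through trace(), no integer is ever
mistaken for a pointer, and unreachable cycles are collected.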

A college roommate of mine actually got a multi-threaded, generational,
incremental, precise mark-and-sweep automatic memory manager for C++ to
work (search for Warne's GC, there are still some old web pages that
mention it in passing). Unfortunately he was a mess and his work is,
AFAIK, completely lost. However, he did prove that it'd work, although
you had to use his operator-overloaded smart pointers for your pointer
arithmetic to avoid having to fall back to a conservative mode.

None of the solutions are perfect, much like GC tends not to be perfect.

> As far as I've been able to determine, reference counting is the only GC
> mechanism that is supported without modifications to the runtime or
> compiler by C++, and that isn't capable of handling arbitrary graph
> objects. If there's a different GC mechanism that works, point me to it.

There are algorithms for dealing with cyclical references. Not exactly
efficient, but they work.

> The "smart pointers" in STL aren't (as far as I can understand by
> reading them) smart, except to the extent that assignment erases the
> source pointer, yes?

The only smart pointer type in the STL is auto_ptr, which is "smart"
only for the loosest interpretations of "smart". ;-)

> Or, when done with the graph, invoke the GC manually and have it close
> the open files for you.

...and this is how "safe" languages trade one set of problems for a new
set. ;-)

>> Bugs in my code are of course very common. Bugs that produce the kind of
>> unexpected errors like a null pointer where there shouldn't be one...
>> not so much. Bugs that exercise undefined parts of a language? Damn rare.
> 
> Me too. But it takes me a lot more effort to get that right than it does
> using a higher-level language. It's a shame there aren't more
> programmers of our quality out there, or there would be far fewer buffer
> overrun security holes.
> 
>> Object foo = new Object();
>>
>> if (foo == null) {
>>     // if you reach here, dumping and exiting is a good idea
>> }
> 
> If you actually code checks for that sort of thing in your C++ code,
> you're spending a lot of time you could be using more productively,
> methinks. :-)

Well of course not. Instead what happens is you get a null pointer
exception when you try to invoke something on foo. I was just trying to
be as clear as possible about what I meant.

The above example was intended to be Java code, btw.

> (Yes, I know that particular example isn't impossible in C++.)

Well, it is if the implementation is standards compliant and you haven't
already done something "undefined" in another thread.
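
For the C++ side of that, a small sketch (Widget and both function names
are made up for illustration): a plain new-expression either returns a
valid pointer or throws std::bad_alloc, and only the nothrow form can
hand back null.

```cpp
#include <new>

struct Widget { int id = 7; };

// A plain new-expression never yields null on a conforming
// implementation; failure is reported by throwing std::bad_alloc.
bool plain_new_is_nonnull() {
    Widget* w = new Widget;   // a null check here would be dead code
    bool ok = (w != nullptr);
    delete w;
    return ok;
}

// The nothrow form is the one that can return null, so callers of this
// function must check the result.
Widget* try_alloc() {
    return new (std::nothrow) Widget;
}
```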

>> Okay, how about that it will accomplish much the same results as your
>> "safe" languages?
> 
> Sure. If you program very defensively in C++, you can get some code
> that's pretty robust vs programmer error, assuming the programmers stick
> to using the stuff you designed. The latter part is hard.

No harder than to get programmers to stick to your "safe" language.
Arguably easier if they are more familiar with C++.

>> It's pretty tricky to fix an error you don't expect. I'd really like to
>> see an example of how this works for you.
> 
> I think I already gave a few examples. Go visit a web page with broken
> javascript code and watch your browser not dump core.

Sadly, they often do, or worse, they just hang. ;-)
Still, to the extent that they don't... most browsers are implemented in
C++.

> I realized you'd probably just say "well, that was an
> expected error."

Yeah, I'm convinced now that this is a semantic argument.

- --Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGbk0kOagjPOywMBARAiruAKCd1APRXcqg3qjzDe5pW2I1K720oQCePy5l
z8kvm7BWlUJmWj9YAB8bvnE=
=Cl/m
-----END PGP SIGNATURE-----

-- 
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
