Re: [dev] Should assertions abort?

Stephan Bergmann Fri, 12 Feb 2010 01:36:14 -0800

On 02/12/10 09:11, Frank Schoenheit, Sun Microsystems Germany wrote:

issue 109142 (http://www.openoffice.org/issues/show_bug.cgi?id=109142)
requests to change the behavior of assertions (OSL_ASSERT/DBG_ASSERT and
friends) to abort if their condition is not met. The current behavior is
that the assertion text is reported to the user, usually by a message box.
All of the above applies to non-product builds only, in particular, the
request, as I understand it, is *not* to extend assertions to product
builds.

Right, for now I would be fine keeping it the way it is, that assertionsare only active in non-product builds. Mostly on the rationale thatthere might be assertion code in OOo that has negative impact onperformance, so would not be acceptable in product builds. However,contrary to how you read it below, I never intended to suggest thataborting on an assertion was not appropriate in a product build. It ismy firm belief that a program that aborts upon assertions is arguablymore robust than one that carries on (however paradoxical that maysuperficially sound).

Since this is a controversial topic, I think we should discuss it in a
wider audience before actually implementing the change - so, let's get
the party started ...


Actually, /me thinks we should *not* change the behavior as suggested.

An ::std::assert, which also aborts when hit, basically means something
like "My state is that inconsistent and unrecoverable, I can't
legitimately continue execution." In such a case, an abort is justified.
(However, in opposite to what seems to be suggested in the issue, this
case also justifies an abort in the product build, i.e. when running at
the customer's machine. Whether this is something a product like ours
should use more than to a negligible extent, is a separate question,
eligible for an even more controversial discussion).

However, OSL_*, and even more DBG_*, are a little bit different. In
particular, I always understood them (and I know a lot of others used
them that way, too) as "Something didn't go the way it was
intended/expected.". This surely indicates a bug which needs attention,
but it does *not necessarily* imply that an abort is the only legitimate
reaction. Contrary, a lot of code tries to get out of that concrete
situation gracefully, continuing with what still can be done. So, an
abort is not justified here, IMO.

If common understanding indeed is that OSL/DBG asserts flag situationswhere "[s]omething didn't go the way it was intended/expected" *in thesense that it is "a bug,"* then, honestly, for a product like OOo, I seeabsolutely no reason not to abort in such a case. Self healing software(i.e., software that can cope with bugs in the software itself) isprobably great, and there are application domains where the immenseadditional effort to write such software is probably warranted.However, my understanding is that none of us in the OOo developercommunity has deep experience writing such software, that OOo is in anapplication domain where the additional effort is not really warranted,and, ultimately, that any dilettante attempts at it incur severe costsin overall product stability and maintainability (everybody probably hasstories to tell about tracing a bug where some piece of code masked anunrecoverable situation instead of propagating it out).

The thing I would agree on is that there are most probably uses ofOSL/DBG asserts in the OOo code base that do not flag bugs in thesoftware itself, but rather flag inappropriate or otherwise interestingevents, like unexpected user input or a full disk when the user tries tosave a document. These inappropriate asserts have to be changed tothings like OSL_TRACE. My hope would be thatthis---trivial---adjustment would happen over time, whenever somebodystumbles over such an inappropriate assert firing. (And, enablingaborting assertions in product builds should seriously only beconsidered when it is known that there are no obviously inappropriateassertions in the code. One more reason why for now I am fine withkeeping assertions disabled in product builds.)

When asking for the goal of the change, the issue says

  Without such a strict measure, the many assertions occurring in a run
  of OOo (...) would probably never go a away, and the assertion tool
  would remain practically useless (...)

But making assertions abort instead of "just" being reported would
worsen the situation, IMO: We have that many assertions in non-product
builds because a) too few developers actually use non-product builds
(which means they silently introduce new ones) and b) fixing existing
assertions is not a priority [1].
Aborting in case of an assertion would just make a) much worse. Also, it
would put the burden onto the shoulders of those actually *using* the
non-product builds, not of those *causing* the pain. I.e., it would
punish the "wrong" people.


OK, something is obviously missing from my issue description.

I assume that software developers strive to produce quality software,that they are intrinsically motivated, and that they are interested inusing whatever adequate tools are available to help them. Like tests, Iconsider assertions useful tools here. However, as I outlined in theissue, the current state of assertions in OOo is such that they are notuseful, so developers do not consider them an adequate tool, so keepaway. My hope is that once assertions in OOo *are* a useful tool (i.e.,once that backdrop of existing assertion noise is weeded out, and eachassertion a developer sees fire with high likelihood points to an errorin the code she wrote five minutes ago) people will actually use them.

Of course, what I want to reach with that issue is not reached (and theissue not fixed) by simply letting assertions abort. We need a plan howto reach the desired zero assertion state. I consciously left it openfor now how that plan should look like---I simply do not have the timeright now to think about it in the necessary depth.

However, what is my firm belief, is that aborting assertions are theonly viable way to ultimately reach---and then keep---the zero assertionstate. I consider this technical approach a beautiful, highly effectivemeasure. Much more effective than any social approach, which would beprone to be circumvented, and which we have seen fail for too long bynow. (There are analogies to our earlier warning-free code project. Ifcompilers would not technically enforce warning-freedom, we would by nowhave lost it in large parts of the code again.)

Having said all this, I'd say the mentioned issue should be a WONTFIX.


The problem which the issue tries to address, however, is still valid,
and highly legitimate, in my opinion. Assertions are a cheap and
already-existing measure to improve our product quality (and in my
personal opinion, both the 3.1 and the 3.2 release have shown that we
should improve here). Still, too few people use non-product builds,
which means new assertions are introduced, which means more people are
scared away from using non-product builds ... which renders assertions
nearly useless.

So, my suggestions would be the following (admittedly, they're more on a
social level, than on a technical one, and thus harder to enforce):

- Developers should use non-product builds *only*. That's a very
  apparent measure, still, a lot of people don't do. If you ask why,
  often the answer is "it's unusable 'cause of the many assertions".
  Hmm?

- QA should use non-product builds *only*. Yes, I am not kidding about
  this. An assertion which fails indicates a *bug*, that's the very
  original intention of assertions: report bugs. Speaking strictly,
  QA which refuses to use non-product builds refuses to do their job,
  which at least in parts consists of finding bugs.

- Report assertions as you find them, and fix them as you get the issues
  assigned. Give them high priority, as every assertion might be a
  little effort for you to fix, but it is a lot of paint for a lot of
  other people as long as it is not fixed.

- Get *serious* management backup for the three previous items.
  (<insider>That's something which is highly missing since MA left years
  ago :) </insider>)

- Provide non-product builds for release code lines, too. At least do so
  much longer than we did with OOO320 (This item strictly applies to
  Sun-internal workflows.):
  For OOO320, we stopped providing non-product builds at branch-off day,
  which was about half a year before the product was actually released.
  During a significant part of this time, there still happened heavy
  development on that code line, which is prone to introducing new
  assertion failures, going unnoticed in product builds.

- Basically: Understand that assertions are not there to hinder your
  work. They exist to help you (at pretty low costs) finding bugs in
  your code, and thus improving our product - which is something we all
  should be interested in, right?

- Consequently, *use* assertions when writing new code. There's no such
  thing as "too much assertions" in the code - as long as they never
  fire, of course :)

I agree with your points. However, to repeat, I see no easier, moreeffective way to come to and inhabit that wonderful land you describe inyour points than by letting assertions abort.

Frank, assuming we have somehow reached that land (following the plan westill have to work on), would you then still have objections againstaborting assertions?


-Stephan

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [dev] Should assertions abort?

Reply via email to