On 02/12/10 09:11, Frank Schoenheit, Sun Microsystems Germany wrote:
issue 109142 (http://www.openoffice.org/issues/show_bug.cgi?id=109142)
requests to change the behavior of assertions (OSL_ASSERT/DBG_ASSERT and
friends) to abort if their condition is not met. The current behavior is
that the assertion text is reported to the user, usually by a message box.
All of the above applies to non-product builds only, in particular, the
request, as I understand it, is *not* to extend assertions to product
builds.

Right, for now I would be fine keeping it the way it is, that assertions are only active in non-product builds. Mostly on the rationale that there might be assertion code in OOo that has negative impact on performance, so would not be acceptable in product builds. However, contrary to how you read it below, I never intended to suggest that aborting on an assertion was not appropriate in a product build. It is my firm belief that a program that aborts upon assertions is arguably more robust than one that carries on (however paradoxical that may superficially sound).

Since this is a controversial topic, I think we should discuss it in a
wider audience before actually implementing the change - so, let's get
the party started ...


Actually, /me thinks we should *not* change the behavior as suggested.

An ::std::assert, which also aborts when hit, basically means something
like "My state is that inconsistent and unrecoverable, I can't
legitimately continue execution." In such a case, an abort is justified.
(However, in opposite to what seems to be suggested in the issue, this
case also justifies an abort in the product build, i.e. when running at
the customer's machine. Whether this is something a product like ours
should use more than to a negligible extent, is a separate question,
eligible for an even more controversial discussion).

However, OSL_*, and even more DBG_*, are a little bit different. In
particular, I always understood them (and I know a lot of others used
them that way, too) as "Something didn't go the way it was
intended/expected.". This surely indicates a bug which needs attention,
but it does *not necessarily* imply that an abort is the only legitimate
reaction. Contrary, a lot of code tries to get out of that concrete
situation gracefully, continuing with what still can be done. So, an
abort is not justified here, IMO.

If common understanding indeed is that OSL/DBG asserts flag situations where "[s]omething didn't go the way it was intended/expected" *in the sense that it is "a bug,"* then, honestly, for a product like OOo, I see absolutely no reason not to abort in such a case. Self healing software (i.e., software that can cope with bugs in the software itself) is probably great, and there are application domains where the immense additional effort to write such software is probably warranted. However, my understanding is that none of us in the OOo developer community has deep experience writing such software, that OOo is in an application domain where the additional effort is not really warranted, and, ultimately, that any dilettante attempts at it incur severe costs in overall product stability and maintainability (everybody probably has stories to tell about tracing a bug where some piece of code masked an unrecoverable situation instead of propagating it out).

The thing I would agree on is that there are most probably uses of OSL/DBG asserts in the OOo code base that do not flag bugs in the software itself, but rather flag inappropriate or otherwise interesting events, like unexpected user input or a full disk when the user tries to save a document. These inappropriate asserts have to be changed to things like OSL_TRACE. My hope would be that this---trivial---adjustment would happen over time, whenever somebody stumbles over such an inappropriate assert firing. (And, enabling aborting assertions in product builds should seriously only be considered when it is known that there are no obviously inappropriate assertions in the code. One more reason why for now I am fine with keeping assertions disabled in product builds.)

When asking for the goal of the change, the issue says

  Without such a strict measure, the many assertions occurring in a run
  of OOo (...) would probably never go a away, and the assertion tool
  would remain practically useless (...)

But making assertions abort instead of "just" being reported would
worsen the situation, IMO: We have that many assertions in non-product
builds because a) too few developers actually use non-product builds
(which means they silently introduce new ones) and b) fixing existing
assertions is not a priority [1].
Aborting in case of an assertion would just make a) much worse. Also, it
would put the burden onto the shoulders of those actually *using* the
non-product builds, not of those *causing* the pain. I.e., it would
punish the "wrong" people.

OK, something is obviously missing from my issue description.

I assume that software developers strive to produce quality software, that they are intrinsically motivated, and that they are interested in using whatever adequate tools are available to help them. Like tests, I consider assertions useful tools here. However, as I outlined in the issue, the current state of assertions in OOo is such that they are not useful, so developers do not consider them an adequate tool, so keep away. My hope is that once assertions in OOo *are* a useful tool (i.e., once that backdrop of existing assertion noise is weeded out, and each assertion a developer sees fire with high likelihood points to an error in the code she wrote five minutes ago) people will actually use them.

Of course, what I want to reach with that issue is not reached (and the issue not fixed) by simply letting assertions abort. We need a plan how to reach the desired zero assertion state. I consciously left it open for now how that plan should look like---I simply do not have the time right now to think about it in the necessary depth.

However, what is my firm belief, is that aborting assertions are the only viable way to ultimately reach---and then keep---the zero assertion state. I consider this technical approach a beautiful, highly effective measure. Much more effective than any social approach, which would be prone to be circumvented, and which we have seen fail for too long by now. (There are analogies to our earlier warning-free code project. If compilers would not technically enforce warning-freedom, we would by now have lost it in large parts of the code again.)

Having said all this, I'd say the mentioned issue should be a WONTFIX.


The problem which the issue tries to address, however, is still valid,
and highly legitimate, in my opinion. Assertions are a cheap and
already-existing measure to improve our product quality (and in my
personal opinion, both the 3.1 and the 3.2 release have shown that we
should improve here). Still, too few people use non-product builds,
which means new assertions are introduced, which means more people are
scared away from using non-product builds ... which renders assertions
nearly useless.

So, my suggestions would be the following (admittedly, they're more on a
social level, than on a technical one, and thus harder to enforce):

- Developers should use non-product builds *only*. That's a very
  apparent measure, still, a lot of people don't do. If you ask why,
  often the answer is "it's unusable 'cause of the many assertions".
  Hmm?

- QA should use non-product builds *only*. Yes, I am not kidding about
  this. An assertion which fails indicates a *bug*, that's the very
  original intention of assertions: report bugs. Speaking strictly,
  QA which refuses to use non-product builds refuses to do their job,
  which at least in parts consists of finding bugs.

- Report assertions as you find them, and fix them as you get the issues
  assigned. Give them high priority, as every assertion might be a
  little effort for you to fix, but it is a lot of paint for a lot of
  other people as long as it is not fixed.

- Get *serious* management backup for the three previous items.
  (<insider>That's something which is highly missing since MA left years
  ago :) </insider>)

- Provide non-product builds for release code lines, too. At least do so
  much longer than we did with OOO320 (This item strictly applies to
  Sun-internal workflows.):
  For OOO320, we stopped providing non-product builds at branch-off day,
  which was about half a year before the product was actually released.
  During a significant part of this time, there still happened heavy
  development on that code line, which is prone to introducing new
  assertion failures, going unnoticed in product builds.

- Basically: Understand that assertions are not there to hinder your
  work. They exist to help you (at pretty low costs) finding bugs in
  your code, and thus improving our product - which is something we all
  should be interested in, right?

- Consequently, *use* assertions when writing new code. There's no such
  thing as "too much assertions" in the code - as long as they never
  fire, of course :)

I agree with your points. However, to repeat, I see no easier, more effective way to come to and inhabit that wonderful land you describe in your points than by letting assertions abort.

Frank, assuming we have somehow reached that land (following the plan we still have to work on), would you then still have objections against aborting assertions?

-Stephan

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to