Single root for Exceptions WAS Re: [math] UnexpectedNegativeIntegerException

Phil Steitz Tue, 28 Aug 2012 13:37:58 -0700

On 8/28/12 5:36 AM, Luc Maisonobe wrote:
>  
<snip/>
> I thought I had. Perhaps this feature was set up after I gave up
> on this discussion.
>>>> It would be quite easy to change, if it would make your life easier.
>>>>
>>>> The more so that I never saw what is gained from copying the Java hierarchy
>>>> (in the particular case of the exceptions): Because some exception inherits
>>>> from the Java standard one does not bring special benefits to the
>>>> application that has to catch that exception. I mean: Is there any piece of
>>>> code that would behave differently if it caught "IllegalArgumentException"
>>>> vs "IllegalStateException"? If not, it could as well be prepared to catch
>>>> a "MathRuntimeException" (and do the same thing).
>>>> [The various exception types are primarily there to discriminate between
>>>> various _problems_; but are not likely helpful to help the caller devise a
>>>> way to react to the exception once it is raised (other than acknowledge the
>>>> fact that CM could not perform the requested action).]
>>>>
>>>> In CM, the vastly overwhelming majority of exceptions are instances of
>>>> "MathIllegalArgumentException" or one of its subclasses.
>>>>
>>>> We have a "NullArgumentException" but we also agreed that it did not have
>>>> to be a subclass of the standard Java "NullPointerException". So in this
>>>> case, we already depart from the "standard". [But we also speculated that
>>>> the policy could be to never check for "null" and let the JVM do that. This
>>>> behaviour is _not_ consistent throughout CM.]
>>>>
>>>> Number of occurrences of CM exceptions that are subclasses of those Java
>>>> standard exceptions:
>>>>  * IllegalStateException (43)
>>>>  * UnsupportedOperationException (22)
>>>>  * ArithmeticException (54)
>>>>
>>>> In summary, I have no problem with a "MathRuntimeException" base class
>>>> which "MathIllegalArgumentException", "MathIllegalStateException",
>>>> "MathUnsupportedOperationException", "MathArithmeticException" would
>>>> inherit from.
>>>>
>>>> Applications that call CM would be safe (apart from bugs raising "NPE")
>>>> with a unique catch clause intercepting "MathRuntimeException".
>>> I am happy (and surprised) to read that.
>>> I would very much like to go back to a single root exception
>>> hierarchy. This helps top level applications because, depending on
>>> context, they can either pinpoint the exception they want to catch or
>>> they can have a grab all strategy. It is their choice.
>> I like throwing (and catching) standard exceptions instead of
>> inventing variants of them, which is why I favored having MathIAE
>> inherit from IAE, etc.  I would have preferred to just throw IAE
>> directly, but we could not agree on how to do that and preserve
>> localization, so we ended up with the current setup where we have
>> custom variants, but they inherit from the standard exceptions.  I
>> am curious, Luc, about exactly what kinds of use cases will really
>> be easier / better for users if we go back to a single-rooted
>> hierarchy.  I get that instead of "catch Exception" or "catch
>> RuntimeException" you can at the top level "catch MathRTE" and that
>> will catch only the exceptions that come (at least originally) from
>> the [math] code.
> Yes, but this is only one aspect.
>
>> Can you help me understand via an example how that
>> is a big benefit that is worth more than being able to "catch IAE"
>> or "catch IOE" directly?
> One of the problems I encounter occurs when building large applications
> with several component layers. At an intermediate level, say just above
> [math], developers know what they are calling and they may decide to
> catch an exception they know about, if they are able to identify that it
> is thrown (which is not always obvious). They may also decide the exception
> cannot be handled at their level and simply let it propagate upward. As
> you go upward in the software layers, with different development teams,
> you lose this knowledge and people don't even understand anything about
> mathematics. They can however still catch some large scope exceptions,
> one type per component (say a MathRuntimeException, and a MylibException
> if they know these two sub-components are used). They won't do anything
> with the exception but nicely display them in the graphical user
> interface and stop the application. This works well as long as there is
> one single root per library, but it does not scale with 40 different
> exceptions per library.


I think I get your point, but again given transitive / nested
dependencies I would not want to depend on it, even if all of the
components have single-rooted exception hierarchies.  This is
especially true if not all components adhere to the "wrap
everything" rule - i.e., if they can generate and/or propagate RTEs
that do not inherit from their base exception class.  From the
standpoint of the caller, for example, what is the difference
between [math]

0)  throwing IAE
1)  throwing MathIAE derived from IAE
2)  throwing MathIAE derived from MathRTE (base)
assuming that [math] is not signing up to wrap and rethrow every
exception - including IAE - we get from JDK classes?  Will the
caller actually do anything different if the RTE is math-wrapped vs
"naked" but coming out of the [math] code?  I understand that the
try/catch may be several layers removed from the code calling a
[math] API. 
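
To make the three options concrete, here is a minimal sketch of which
handler a caller ends up in for each of them. The class names
SketchMathRte, SketchMathIaeFromJdk and SketchMathIaeFromRoot are
hypothetical stand-ins invented for this illustration; they are not the
actual [math] classes.

```java
// Hypothetical stand-ins for illustration only; not the real [math] classes.
class SketchMathRte extends RuntimeException {
    SketchMathRte(String message) { super(message); }
}

// Option 1: library exception derived from the JDK's IllegalArgumentException.
class SketchMathIaeFromJdk extends IllegalArgumentException {
    SketchMathIaeFromJdk(String message) { super(message); }
}

// Option 2: library exception derived from the single library root.
class SketchMathIaeFromRoot extends SketchMathRte {
    SketchMathIaeFromRoot(String message) { super(message); }
}

class HandlerSketch {
    // Rethrows the given exception and reports which catch clause receives it.
    static String handledBy(RuntimeException thrown) {
        try {
            throw thrown;
        } catch (SketchMathRte e) {
            return "library root handler";
        } catch (IllegalArgumentException e) {
            return "IAE handler";
        } catch (RuntimeException e) {
            return "generic RTE handler";
        }
    }
}
```

With these definitions, options 0 and 1 both land in the IAE handler and
only option 2 reaches the library root handler. A plain IAE that leaks out
of a JDK call made inside the library never reaches the root handler,
whichever option the library picks for its own exceptions.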

Same applies to NPE, which we don't subclass now, but mostly handle
as IAE.

I guess one thing we might consider is trying to design for the
invariant that we never propagate RTEs without wrapping.  But that
would be a lot of work to retrofit and would have a performance cost.
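
For what it is worth, a rough sketch of what that invariant could look
like at a single API boundary follows. SketchWrapRoot and
BoundarySketch.guarded are hypothetical names made up for the example,
not existing [math] code, and a real retrofit would need something like
this around every public entry point.

```java
// Hypothetical root exception for illustration only; not the real [math] class.
class SketchWrapRoot extends RuntimeException {
    SketchWrapRoot(String message, Throwable cause) { super(message, cause); }
}

class BoundarySketch {
    // Runs a computation and guarantees that the only RuntimeException
    // that can escape is a SketchWrapRoot.
    static double guarded(java.util.function.DoubleSupplier computation) {
        try {
            return computation.getAsDouble();
        } catch (SketchWrapRoot e) {
            throw e; // already rooted in the library hierarchy
        } catch (RuntimeException e) {
            // Anything else (IAE, NPE, ...) is wrapped, keeping the cause.
            throw new SketchWrapRoot("wrapped: " + e, e);
        }
    }
}
```

The original exception stays reachable through getCause(), so a caller
that wants to pinpoint it still can.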

> Another problem is maintenance. Suppose the intermediate developer did
> his work really accurately and managed to identify all exceptions thrown
> by the methods he calls in one version of Apache Commons Math. When we
> change an error detection and decide that a method that used to throw
> only MaxCountExceededException should throw NumberIsTooLargeException
> instead (or in addition to the existing one), then the calling code
> would still compile, but the new exception would now go all the way
> upward. The two exceptions have no common ancestor that can be caught,
> except Exception itself. With a single rooted
> hierarchy, users can use some defensive programming: they can catch the
> common root and be safe when we change some internal details.
>
> A single root would also bring two things I find useful.
>
> The first useful thing is that the ExceptionContextProvider could be
> implemented at the root level, so we could retrieve this context (in
> fact, I sometimes even need to retrieve the pattern and the arguments
> from the context, and we also miss getters for that, but they are easy
> to add). It is not possible to catch ExceptionContextProvider because it
> is not a throwable (Throwable is a class, not an interface, so we
> inherit the Throwable nature from the top level class, not from
> implementing the ExceptionContextProvider interface).
>
> The second useful thing is for [math] development itself. With a single
> root, we can temporarily change its parent class from RuntimeException
> to Exception, then fix all missing throws declarations and javadoc, then
> put the parent class back before committing. This would help having more
> up to date declarations. For now, I am sure we have missed a lot of our
> own exceptions and let them propagate upward without anybody knowing it.
> As a test, I have just changed the parent for
> MathIllegalArgumentException to Exception. I got 1384 compilation
> errors. Just going to the first one (a constructor of
> BaseAbstractUnivariateIntegrator), I saw we did not advertise the fact
> it may throw NumberIsTooSmallException and NotStrictlyPositiveException,
> neither in a throws declaration nor in the javadoc. I did not look at
> the 1383 other errors...

This is a good point.
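
A small sketch of the trick as I understand it, using hypothetical
stand-in classes (SketchRootException, SketchTooSmallException and
SketchIntegrator are invented here and are not the real [math] classes):

```java
// Hypothetical stand-ins for illustration only; not the real [math] classes.
// Temporarily replacing "RuntimeException" with "Exception" on the next line
// turns every subclass into a checked exception, so the compiler then reports
// each method or constructor that can throw one without declaring it.
class SketchRootException extends RuntimeException {
    SketchRootException(String message) { super(message); }
}

class SketchTooSmallException extends SketchRootException {
    SketchTooSmallException(String message) { super(message); }
}

class SketchIntegrator {
    private final int minimalIterationCount;

    /**
     * @param minimalIterationCount minimal number of iterations
     * @throws SketchTooSmallException if the count is not strictly positive
     */
    SketchIntegrator(int minimalIterationCount) throws SketchTooSmallException {
        if (minimalIterationCount <= 0) {
            throw new SketchTooSmallException(
                minimalIterationCount + " is not strictly positive");
        }
        this.minimalIterationCount = minimalIterationCount;
    }

    int getMinimalIterationCount() { return minimalIterationCount; }
}
```

While the root is unchecked, the throws clause above is optional and
easy to forget. With the root switched to Exception, leaving it out
becomes a compilation error, which is what would expose the
undocumented cases.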
>
>> What I am missing is how knowing that an
>> a specific RTE came from within [math] makes a difference.  I am
>> skeptical about ever depending on that kind of conclusion because
>> dependencies may bring [math] code in at multiple levels.  Also, is
>> there an implied assumption in your ideal setup that *no* exceptions
>> propagate to [math] clients other than MRTE (i.e. we catch and wrap
>> everything)?
> No, I don't make this assumption. I consider that at upper levels, code
> can receive exceptions from all layers underneath ([math] at the very
> bottom, but also other layers in between). With two or three layers, you
> can still handle a few library-wide exceptions (see my example with
> MathRuntimeException, and MylibException above). However, if at one
> level the development rules state that all exceptions must be caught and
> wrapped (this happens in some critical systems contexts), then a single
> root hierarchy helps a lot.

But if we allow some exceptions to propagate unwrapped, this does
not work, unless I am missing the point here.
>
> My point is that with a single root, we can get the best of two worlds:
> large scope catches and pinpointed catches. The choice remains open for
> users. With a multi-rooted hierarchy, we force users to duplicate the
> same work for all exceptions we may throw, and we also force them to
> recheck everything when we publish a new version, even though we
> ourselves fail to document these exceptions accurately.

We need to fix the documentation.  If going back to a single root
makes automatic detection of gaps possible, that by itself is almost
enough to get me to agree to go back to the single root.  Your
arguments above (which I honestly only partially follow) are enough
to make me +0 for this change.  I think I probably put too much
weight on favoring standard exceptions when we are really only
talking about "reinventing" a handful of them.

Phil
>
> best regards,
> Luc
>
>> Phil
>>> For sure, this is something that can be done only for a major release.
>>>
>>>>>> Client apps cannot do more with checked exceptions, and can be made as
>>>>>> "safe" by wrapping calls in try-blocks.
>>>>>> On the other hand, client source code is much cleaner without unnecessary
>>>>>> "throws" clauses or wrapping of checked exceptions at all levels.
>>>>>> Some Java experts go as far as saying that checked exceptions were a
>>>>>> language design mistake (never repeated in languages invented more
>>>>>> recently).
>>>>>>
>>>>>>> There is a reason that NaNs exist.  It is much cheaper to return a
>>>>>>> NaN than to raise (and force the client to handle) an exception. 
>>>>>>> This is not precise and probably can't be made so, but I have always
>>>>>>> looked at things more or less like this:
>>>>>>>
>>>>>>> 0) IAE (which I see no need to specialize as elaborately as we have
>>>>>>> done in [math]) is for clear violation of the documented API
>>>>>>> contract.  The actual parameters "don't make sense" in the context
>>>>>>> of the API.
>>>>>> The "elaboration" is actually very basic (but that's a matter of taste),
>>>>>> but it was primarily promoted (by me) in order to hide (as much as
>>>>>> possible) the ugliness (another matter of taste) of the
>>>>>> "LocalizedFormats" enum, and its inconsistent use (duplication).
>>>>>> [Cf. discussions in the archive.]
>>>>>>
>>>>>>> 1) NaN can be returned as the result of a computation which, when
>>>>>>> started with legitimate arguments, does not result in a
>>>>>>> representable value.
>>>>>> According to this description, Sébastien's case _must_ be handled by an
>>>>>> exception: the argument is _not_ legitimate.
>>>>>> The usage of NaN I was referring to is to let a computation proceed
>>>>>> ("follow an unexceptional path") in the hope that the final result might
>>>>>> still be meaningful.
>>>>>> If the NaN persists, not checking for it and signalling the problem (i.e.
>>>>>> raising an exception) is a bug. It is to avoid that (and to be robust)
>>>>>> that we do extensive precondition checks in CM. But this has the
>>>>>> unavoidable drawback that the use of NaN as suggested is much less likely
>>>>>> to be feasible when calling CM code. Once one realizes that, it becomes
>>>>>> much less obvious that there is _any_ advantage in letting NaNs
>>>>>> propagate...
>>>>>> [Does anyone have an example of NaN usage? Please let me know.]
>>>>> I use NaN a lot as an indicator that a variable has not been fully
>>>>> initialized yet. This occurs for example in iterative algorithms, where
>>>>> some result is computed deep inside some loop and we don't know when the
>>>>> loop will end. Then I write something along these lines:
>>>>>
>>>>>   while (Double.isNaN(result)) {
>>>>>      // do something that hopefully will change result to non-NaN
>>>>>   }
>>>>>
>>>>>   // now I know result has been computed
>>>>>
>>>>> Another use is to initialize some fields in a class to values I know are
>>>>> not meaningful. I can then use NaN as a marker to do lazy evaluation for
>>>>> values that take time to compute and should be computed only when both
>>>>> really needed and when everything required for their computation is
>>>>> available.
>>>> I should have said "[...] example of NaN usage, beyond singling out
>>>> uninitialized data [...]". The above makes use of NaN as "invalid" because it
>>>> is not initialized (yet).
>>> Yes.
>>>
>>>> I'd assume that if "result" stays NaN after the allowed number of
>>>> iterations, you raise an exception, i.e. you don't propagate NaN as the
>>>> output of a computation that cannot provide a useful result. However, this
>>>> (propagating NaN) is the behaviour of "sqrt(-1)", for example.
>>>> Thus, if you raise an exception, your computation does not behave in the
>>>> same way as the function "sqrt".
>>>>
>>>>> Another use is simply to detect some special cases in computations (like
>>>>> sqrt(-1) or 0/0). I do the computation first and check the NaN
>>>>> afterwards. See for example the detection of NaNs in the linear
>>>>> combinations in MathArrays or in the nth order Brent solver.
>>>> OK, this is a good example, in line with the intended usage of NaN (as it
>>>> avoids inserting control structures in the computation).
>>> Yes. One of the main use cases for this is when a computation involves a
>>> loop and failure is very rare. So we avoid numerous costly if statements
>>> within the loop and do a single check. In the few cases this single
>>> check fails, we go to a different branch to handle the failure. This is
>>> exactly what is done in linear combination.
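
A minimal sketch of that pattern, assuming a made-up sumOfSquareRoots
helper (this is not the [math] linear combination code, only an
illustration of checking once after the loop):

```java
class NanCheckSketch {
    // Sums square roots with no branch inside the hot loop; Math.sqrt returns
    // NaN for negative or NaN input, and NaN then propagates through the sum.
    static double sumOfSquareRoots(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += Math.sqrt(v);
        }
        if (Double.isNaN(sum)) {
            // Rare failure branch: only now pay for locating the culprit.
            for (int i = 0; i < values.length; i++) {
                if (Double.isNaN(Math.sqrt(values[i]))) {
                    throw new IllegalArgumentException(
                        "no real square root at index " + i);
                }
            }
        }
        return sum;
    }
}
```

Here the failure branch chooses to raise an exception; a low-level
routine that knows how to repair the result could instead recompute it
in that branch.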
>>>
>>>>> Another use of NaNs occurs when integrating various code components from
>>>>> different origins in a single application. Data is forwarded between the
>>>>> various components in all directions. Components never share the same
>>>>> exceptions mechanisms. Components either process NaNs specially (which
>>>>> is good) or they let the processor propagate them (it is what the IEEE
>>>>> standard mandates) and at the end you can detect it reliably at
>>>>> application level.
>>>> I'm not sure I understand this. Is it good or bad that a component lets
>>>> NaNs propagate? Are there situations when it's good and others where it's
>>>> bad?
>>> In the cases I encountered, it is always good to have NaNs propagated. A
>>> component that is not an application by itself but only a part (low or
>>> intermediate level) often cannot decide at its level how to handle NaNs
>>> except in rare cases. So it propagates them upward. The previous example
>>> (linear combination in [math]) is of course a counter-example: we are at
>>> low level, we know how to handle the NaN for this operation, so we
>>> detect it and fix it.
>>>
>>>> That's why I was asking (cf. quote from previous post below) what are the
>>>> criteria, so that contributors know how to write code when the feature
>>>> falls in one or the other category.
>>>>
>>>>>>> The problem is that contracts can often be written so that instances
>>>>>>> of 1) are turned into instances of 0).  Gamma(-) is a great
>>>>>>> example.  The singularities at negative integers could be viewed as
>>>>>>> making negative integer arguments "illegal" or "nonsense" from the
>>>>>>> API standpoint,
>>>>>> They are just nonsense (not just from an API standpoint).
>>>>>>
>>>>>>> or legitimate arguments for which no well-defined,
>>>>>>> representable value can be returned.  Personally, I would prefer to
>>>>>>> get NaN back from this function and just point out the existence of
>>>>>>> the singularities in the javadoc.
>>>>>> This is consistent with how basic math functions behave, but not with the
>>>>>> general rule/convention of most of CM code.
>>>>>> It may be fine that we have several ways to deal with exceptional
>>>>>> conditions, but it might be nice, as with formatting, to have rules so
>>>>>> that we know how to write contributions.
>>>>> Too many rules are not a solution, especially when there are no tools to
>>>>> help ensure these rules are obeyed. Relying only on the fact that human
>>>>> proof-reading will enforce them is wishful thinking.
>>>>>
>>>> What is "too many"? ["How long should a person's legs be?" ;-)]
>>>> I don't agree with the "wishful thinking" statement; a "diff" could
>>>> probably show a lot of manual corrections to the code and comment
>>>> formatting. [Mainly in the sources which I touched at some point...]
>>> I'm not sure I understand your point. Mine is that rules that are not
>>> backed by automated tools are a pain to enforce, and hence are not
>>> fulfilled most of the time, except at a tremendous human resource cost.
>>> In fact, even rules which can be associated with tools are broken during
>>> development for some time. We do not use
>>> checkstyle/CLIRR/findbugs/PMD/RAT for all commits for example, but do a
>>> fix pass from time to time.
>>>
>>>> There are other areas where there is only human control, namely the "svn
>>>> log" messages where (no less picky) rules are enforced just because it
>>>> helps _humans_ in their change overview task.
>>>>
>>>> As pointed out by Jared, it's not a big problem to comply with rules once
>>>> you know them.
>>> I fully agree with that, but I also think Phil is right when he says too
>>> many rules may discourage potential contributors. I remember a link he
>>> sent to us months ago to a presentation by Michael Meeks about
>>> interacting with new developers
>>> <http://people.gnome.org/~michael/data/2011-10-13-new-developers.pdf>.
>>> Slide numbers 3 and 4 are a fabulous example. I think we are lucky Jared
>>> has this state of mind and accepts picky rules easily. I'm not sure such
>>> an open mind is widespread among potential contributors.
>>>
>>>> Keeping source code tidy is quite helpful, and potential contributors will
>>>> be happy that they can read any CM source files and immediately recognize
>>>> that they are part of the same library...
>>> Yes, of course. But the entry barrier should not be too high.
>>>
>>> best regards,
>>> Luc
>>>
>>>> Best regards,
>>>> Gilles
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>>> For additional commands, e-mail: dev-h...@commons.apache.org