On Sun, Dec 30, 2012 at 2:27 PM, Pedro Giffuni <p...@apache.org> wrote:
>
>
> ----- Original Message -----
>> From: Rob Weir
> ...
>>>
>>>  BTW, I am considering doing something drastic there, like replacing all
>>>  the probability distributions with boost implementations. Would there be
>>>  any good reason to avoid such an approach?
>>>
>>
>> What is the advantage of changing?
>>
>
> Quality: the precision and performance of the boost implementations
> are notable. Most of our older implementations are also unmaintained. If
> you look at the Gamma implementation, for example, you will notice a
> comment that says it is based on the boost implementation. We surely
> haven't kept up with the bug fixes or improvements they may have made
> since then.
>
> The boost implementations also have the option of specifying math policy,
> which is something our current implementations lack.
>
> Quite honestly, I was hesitant to introduce boost stuff in Calc but we
> already depend on it for other things so it comes for free and the code
> is admittedly very good (with only some small issues in their PRNG).
>
>> Risk of any change is introducing a bug.  From a user's perspective
>> any difference in calculation, even if "correct" is something that may
>> cause them to halt their work until they understand why their complex
>> calculation gives an answer that is 0.1% different than AOO 3.4.1.
>>
>
> 0.1% is a huge number: if the new functions produce results that are 0.1%
> more correct, then I would say it's a bugfix, and bug fixes are GREAT.
>
> I have to say that some of the current math functions are in poor shape,
> hopefully not 0.1% wrong but still pretty shameful.
>
> We have to verify each and every function we replace but I don't think the
> idea is to produce a crappy Spreadsheet that lies, and producing
> inaccuracies in cases where we can do much better is pretty much lying.
>
>> So we need both accuracy and release-to-release consistency.  We may
>> improve accuracy and in the process yield results that differ from
>> earlier versions, but this needs to be tracked and communicated to
>> users carefully, so they understand what happened to their
>> spreadsheets.  I don't think we want "improvements" to be a surprise
>> for the user, especially since at that point bugs and improvements are
>> indistinguishable to casual examination.
>>
>
> Of course, there are "Release notes" for that: there are no surprises
> here. Major release number bumps are useful precisely to indicate
> such changes.
>
> I would also think we will want to keep the legacy implementations
> available in a scaddin. The beauty of opensource is that anyone is free
> to lend a hand and do just that.
>
>> If we don't have a solid test suite to determine whether our
>> calculations are correct or even detect if our calculations differ
>> from release to release then I'm not really in favor of changing the
>> code.  But if we wanted to do a rigorous test of OpenOffice, per the
>> standard, and fix any bugs or inaccuracies that the test suite
>> reveals, then I think we end up with a stronger product, and one where
>> we can safely optimize the routines, knowing that the test cases "have
>> our back".
>>
>
> Please feel free to contribute a spreadsheet that calculates the edge
> cases. Any contribution of that kind is welcome, and that's why this list
> exists.
>

Well, what does boost use for its own testing?  They must do some sort
of testing?   Is there something we can easily convert into a
spreadsheet?  That would help in two ways, since we could test the
existing implementation against the same test cases, to see if there
actually are any issues.   I assume that would be good to know.
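
As a rough sketch of what such a check could look like outside a
spreadsheet: the Python below runs one candidate function and one
reference function over the same inputs and reports every input whose
relative error exceeds a tolerance. Everything in it is an assumption
made for illustration. legacy_norm_cdf is a stand-in based on the
Abramowitz-Stegun 26.2.17 polynomial, not Calc's code, the reference is
built on math.erfc rather than on boost, and the input list is
arbitrary.

```python
import math


def legacy_norm_cdf(x):
    """Hypothetical 'older' normal CDF (Abramowitz-Stegun 26.2.17).

    Stand-in for illustration only; this is NOT the Calc implementation.
    """
    t = 1.0 / (1.0 + 0.2316419 * abs(x))
    poly = t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
               + t * (-1.821255978 + t * 1.330274429))))
    tail = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi) * poly
    return 1.0 - tail if x >= 0 else tail


def reference_norm_cdf(x):
    """Reference normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def relative_error(actual, expected):
    """Relative error, falling back to absolute error when expected is 0."""
    if expected == 0.0:
        return abs(actual)
    return abs(actual - expected) / abs(expected)


def compare(candidate, reference, inputs, rel_tol):
    """Return (x, candidate(x), reference(x), rel_err) for each x over rel_tol."""
    failures = []
    for x in inputs:
        got, want = candidate(x), reference(x)
        err = relative_error(got, want)
        if err > rel_tol:
            failures.append((x, got, want, err))
    return failures


# Arbitrary sample points, including both tails (an assumption, not a real suite).
INPUTS = [-6.0, -3.0, -1.0, 0.0, 0.5, 1.96, 3.0, 6.0]

if __name__ == "__main__":
    for x, got, want, err in compare(legacy_norm_cdf, reference_norm_cdf,
                                     INPUTS, rel_tol=1e-9):
        print(f"x={x:+.2f} candidate={got:.12e} reference={want:.12e} "
              f"rel_err={err:.3e}")
```

The same compare() call could be run once against the existing
implementation and once against a replacement, which would show where
the results differ and by how much. Using relative rather than absolute
error matters in the tails, where a tiny absolute difference can still
be a large fraction of the result.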

> We do have a serious problem in the current Calc: we are depending on
> system libraries for some calculations. This has the disadvantage that the
> results will differ depending on whether you do your calculation on Windows
> or on UNIX, with neither being very accurate. In its current shape I
> wouldn't recommend a tool like Calc for serious scientific use.
>
> For the record, some of the math libraries used in some libc implementations
> are derived from Netlib's Cephes and even though modern standards require
> them, they have been rejected for inclusion in FreeBSD due to their low
> quality.
>
> I am pretty sure I know what I am doing here. I have a patch in my box to fix

More bugs come from overconfidence than from a respectful humility and
realization that the work we do is critical for millions of users and
that we should do everything possible to ensure that our code is
tested by more than just our own personal feelings of satisfaction.
IMHO.

> some of the lower hanging fruit in Calc. You will probably not see any of it
> this year, but it will be coming through bugzilla so that you have time to
> help review the changes with real code :)
>
> Pedro.
>
