[jira] [Commented] (NUMBERS-142) Improve LinearCombination accuracy during summation of the round-off errors

Alex Herbert (Jira) Fri, 23 Apr 2021 07:08:06 -0700


    [ 
https://issues.apache.org/jira/browse/NUMBERS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330797#comment-17330797
 ]


Alex Herbert commented on NUMBERS-142:
--------------------------------------

As discussed on the mailing list regarding geometry and numbers [1] I have 
updated LinearCombination to use the dot2s algorithm. This replaces the current 
2-fold precision algorithm that uses array allocation to maintain round-off 
errors. The affect should be a marginal increase in speed with no major loss of 
robustness in dot products involving cancellation.

Changes are in [PR 90.|https://github.com/apache/commons-numbers/pull/90]

[1] [Re: [geometry] 
1.0-beta2|https://lists.apache.org/thread.html/r103ce078505e618511d0f9f1a7d75a0804abbf115ae6ff376a39ca20%40%3Cdev.commons.apache.org%3E]

> Improve LinearCombination accuracy during summation of the round-off errors
> ---------------------------------------------------------------------------
>
>                 Key: NUMBERS-142
>                 URL: https://issues.apache.org/jira/browse/NUMBERS-142
>             Project: Commons Numbers
>          Issue Type: Improvement
>          Components: arrays
>    Affects Versions: 1.0
>            Reporter: Alex Herbert
>            Assignee: Alex Herbert
>            Priority: Minor
>         Attachments: array_performance.jpg, cond_no.jpg, dot.jpg, 
> error_vs_condition_no.jpg, inline_perfomance.jpg, linear_cached.jpg, 
> linear_inline_vs_array.jpg
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The algorithm in LinearCombination is an implementation of dot2 from [Ogita 
> el al|http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547] 
> (algorithm 5.3). There is a subtle difference in that the original dot2 
> algorithm sums the round-off from the products and the round-off from the 
> product summation together. The method in LinearCombination sums them 
> separately (using an extra variable) and adds them at the end. This actually 
> improves the accuracy under conditions where the round-off is of greater sum 
> than the products, as described below.
> The dot2 algorithm suffers when the total sum is close to 0 but the 
> intermediate sum is far enough from zero that there is a large difference 
> between the exponents of summed terms and the final result. In this case the 
> sum of the round-off is more important than the sum of the products which due 
> to massive cancellation is zero. The case is important for Complex numbers 
> which require a computation of log1p(x^2+y^2-1) when x^2+y^2 is close to 1 
> such that log(1) would be ~zero yet the true logarithm is representable to 
> very high precision.
> This can be protected against by using the dotK algorithm of Ogita et al with 
> K>2. This saves all the round-off parts from each product and the running 
> sum. These are subject to an error free transform that repeatedly adds 
> adjacent pairs to generate a new split pair with a closer upper and lower 
> part. Over time this will order the parts from low to high and these can be 
> summed low first for an error free dot product.
> Using this algorithm with a single pass (K=3 for dot3) removes the 
> cancellation error observed in the mentioned use case. Adding a single pass 
> over the parts changes the algorithm from 25n floating point operations 
> (flops) to 31n flops for the sum of n products.
> A second change for the algorithm is to switch to using 
> [Dekker's|https://doi.org/10.1007/BF01397083] algorithm (Dekker, 1971) to 
> split the number. This extracts two 26-bit mantissas from a 53-bit mantis (in 
> IEEE 754 the leading bit in front of the of the 52-bit mantissa is assumed to 
> be 1). This is done by multiplication by 2^s+1 with s = ceil(53/2) = 27:
> big = (2^s+1) * a
> a_hi = (big - (big - a))
> The extra bit of precision is carried into the sign of the low part of the 
> split number.
> This is in contrast to the method in LinearCombination that uses a simple 
> mask on the long representation to obtain the a_hi part in 26-bits and the 
> lower part will be 27 bits.
> The advantage of Dekker's method is it produces 2 parts with 26 bits in the 
> mantissa that can be multiplied exactly. The disadvantage is the potential 
> for overflow requiring a branch condition to check for extreme values.
> It also appropriately handles very small sub-normal numbers that would be 
> masked to create a 0 high part with all the non-zero bits left in the low 
> part using the current method. This will have knock on effects on split 
> multiplication which requires the high part to be larger.
> A simple change to the current implementation to use Dekker's split improves 
> the precision on a wide range of test data (not shown).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (NUMBERS-142) Improve LinearCombination accuracy during summation of the round-off errors

Reply via email to