On Mon, 19 Apr 2021 at 22:30, Gilles Sadowski <gillese...@gmail.com> wrote:

> Le lun. 19 avr. 2021 à 20:26, Matt Juntunen
>
> > What needs to be done on numbers before we're ready for
> > 1.0 (aside from moving over some code from geometry)?
>
> The most basic utilities haven't fundamentally changed.  It
> would be nice to increase the visibility of the many consistency
> and performance improvements.
> Modules to perhaps be left out are also TBD (in another thread).
>

I did a lot of work investigating improving the accuracy of
LinearCombination (see NUMBERS-142 [1]). This ended up improving the
accuracy of the linear combination used inside the Complex class in a very
specific use case: summing terms close to zero (i.e. where large
cancellation of terms is significant). However that code is private to the
class. The main LinearCombination class remains unchanged; it uses an
approximately quad-precision (128-bit) sum of the linear terms. This could
be changed to improve accuracy at the expense of runtime speed.
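For anyone unfamiliar with the twofold ("double-double") approach, it is
built on error-free transforms such as Knuth's branch-free two-sum, which
recovers the rounding error of a single addition exactly. A minimal sketch
of that building block (illustrative only, not the Commons source):

```java
/** Illustration of Knuth's branch-free two-sum error-free transform.
 *  Not the Commons Numbers source; just the building block used by
 *  twofold-precision summation. */
public class TwoSum {
    /** Returns the rounding error of sum = a + b, so that
     *  a + b == sum + error exactly (barring overflow). */
    static double twoSumLow(double a, double b, double sum) {
        final double bVirtual = sum - a;
        // (a - (sum - bVirtual)) recovers the part of a lost in the sum;
        // (b - bVirtual) recovers the part of b lost in the sum.
        return (a - (sum - bVirtual)) + (b - bVirtual);
    }

    public static void main(String[] args) {
        final double a = 1.0;
        final double b = 1e-17;      // below 0.5 ulp of 1.0, lost entirely
        final double sum = a + b;    // rounds to 1.0
        System.out.println(sum);                   // 1.0
        System.out.println(twoSumLow(a, b, sum));  // 1.0E-17 (recovered)
    }
}
```

Carrying the (sum, error) pair through the computation is what gives the
approximately 128-bit accumulator without any extended types.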

Even if we keep the quad-level accuracy, the method used to split the upper
and lower parts of each number before multiplication can be improved to
retain an extra bit of information and to support sub-normal numbers. All
the code for the variations and the performance benchmarks is in
'o.a.c.numbers.examples.jmh.arrays'. The main LinearCombinations class in
that package contains all the variants.

Currently the LinearCombination class implements a variant of the dot2s
method of Ogita et al (2005). The method for array inputs allocates an
array during the computation. It also splits the 53-bit double mantissa
(including the implicit leading 1-bit) into two 26-bit doubles. This can be
changed to the reference dot2s method, which avoids array allocation and
uses Dekker's split to extract a 27-bit double and a 26-bit double with
support for sub-normal numbers. This would be the implementation in the
LinearCombinations.Dot2s class.
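For reference, Dekker's split uses the multiplier 2^27 + 1 to partition a
53-bit significand so that the two parts sum back to the original exactly.
A minimal sketch (illustrative only; the Commons variant adds scaling for
large values and sub-normal support, which this omits):

```java
/** Sketch of Dekker's split (multiplier 2^27 + 1). The two parts sum
 *  back to the input exactly, so products of the parts are exact.
 *  Illustrative only; no handling of overflow for very large inputs. */
public class DekkerSplit {
    /** 2^27 + 1 = 134217729. */
    private static final double MULTIPLIER = 0x1.0p27 + 1;

    /** Returns {hi, lo} with hi + lo == a exactly (barring overflow). */
    static double[] split(double a) {
        final double c = MULTIPLIER * a;
        final double hi = c - (c - a);  // upper part of the significand
        final double lo = a - hi;       // exact remainder
        return new double[] {hi, lo};
    }

    public static void main(String[] args) {
        final double[] parts = split(Math.PI);
        // The split is exact, so the parts reconstruct the input.
        System.out.println(parts[0] + parts[1] == Math.PI);  // true
    }
}
```

Because each part has at most 27 significant bits, products such as
hi * hi and hi * lo fit in a double without rounding, which is what makes
the error-free two-product in dot2s work.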

Alternatively there is a version which computes an exact summation without
using BigDecimal (LinearCombinations.ExtendedPrecision, using ExactSum for
the summation). It is slower than the current 2-fold precision method but
much faster than using BigDecimal. Changing to this implementation may have
a big impact on the performance of Geometry, which uses LinearCombination
extensively.
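I am not claiming this is the ExactSum algorithm, but for anyone unfamiliar
with the idea: exact summation without BigDecimal can be done by carrying a
list of non-overlapping partials (Shewchuk-style, as used by Python's
math.fsum). A rough sketch under that assumption:

```java
/** Rough sketch of exact summation via non-overlapping partials
 *  (Shewchuk-style, as in Python's math.fsum). Illustrative only;
 *  not necessarily the algorithm used by ExactSum. */
public class PartialsSum {
    /** Rounding error of sum = a + b (Knuth's branch-free two-sum). */
    static double twoSumLow(double a, double b, double sum) {
        final double bVirtual = sum - a;
        return (a - (sum - bVirtual)) + (b - bVirtual);
    }

    static double sum(double[] values) {
        // At most one net partial is added per input, so this bound holds.
        final double[] partials = new double[values.length];
        int n = 0;
        for (double x : values) {
            int m = 0;
            for (int i = 0; i < n; i++) {
                final double y = partials[i];
                final double s = x + y;
                final double err = twoSumLow(x, y, s);
                if (err != 0) {
                    partials[m++] = err;  // keep the exact rounding error
                }
                x = s;
            }
            partials[m] = x;
            n = m + 1;
        }
        // The partials sum to the exact result; reduce them to a double.
        double total = 0;
        for (int i = 0; i < n; i++) {
            total += partials[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Catastrophic cancellation: a plain loop sum returns 0.0 here.
        System.out.println(sum(new double[] {1.0, 1e100, 1.0, -1e100}));
        // Prints 2.0
    }
}
```

The cost is the inner pass over the partials per term, which is why this
style is slower than the 2-fold method but still far cheaper than
converting every term to BigDecimal.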

The Jira ticket has a lot of performance information. This post [2] seems
to summarise the speed relative to the current implementation as:

Value    Method    Relative
twoD     dot2s     0.73
twoD     extended  1.13
threeD   dot2s     0.75
threeD   extended  1.48
fourD    dot2s     0.75
fourD    extended  1.68
nD       dot2s     0.56
nD       extended  2.95

So the exact dot2s implementation is faster, and the extended-precision
implementation is about 3x slower on long arrays but not much slower on
small combinations. Here the extended-precision sum is accurate to 1 ULP.
Making it exact to 0 ULP (extended2) takes a bit more time and the
performance is summarised later as:

"Using the extended2 method versus the fast dot2s will have a performance
impact dependent on the condition number of the dot product and the length
of the dot product. For short lengths (2 to 4) the method is about 1.2 to
2.5x slower. Long dot products are much slower (8x). The runtime is
insignificant compared to using BigDecimal which is a magnitude slower."

Any opinions on how changing LinearCombination may affect Geometry? Either
we clean up the implementation to use the fast dot2s algorithm with correct
support for splitting 53-bit mantissas, or we switch to the
extended-precision version. But I do not think we should leave the current
implementation in place, since it has disadvantages compared with either of
the alternatives.

Alex


[1] https://issues.apache.org/jira/browse/NUMBERS-142
[2]
https://issues.apache.org/jira/browse/NUMBERS-142?focusedCommentId=17040111&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17040111
