On Mon, 19 Apr 2021 at 22:30, Gilles Sadowski <gillese...@gmail.com> wrote:
> Le lun. 19 avr. 2021 à 20:26, Matt Juntunen
> > > What needs to be done on numbers before we're ready for
> > > 1.0 (aside from moving over some code from geometry)?
>
> The most basic utilities haven't fundamentally changed. It
> will be nice to increase the visibility of the many consistency
> and performance improvements.
> Modules to perhaps be left out are also TBD (in another thread).

I did a lot of work investigating improving the accuracy of
LinearCombination (see NUMBERS-142 [1]). This ended up improving the
accuracy of the linear combination used inside the Complex class in a
very specific use case: summing terms close to zero (i.e. where large
cancellation of terms is significant). However, that is private to the
class.

The main LinearCombination class remains the same. It uses an
approximate quad-precision (128-bit) sum of the linear terms. This
could be changed to improve accuracy at the expense of runtime speed.
Even if we keep the quad-level accuracy, the method used to split the
upper and lower parts of each number before multiplication can be
improved to retain an extra bit of information and to support
sub-normal numbers.

All the code for the variations and the performance benchmarks is in
'o.a.c.numbers.examples.jmh.arrays'. The main LinearCombinations class
in that package has all the variants.

Currently the LinearCombination class implements a variant of the dot2s
method of Ogita et al. (2005). The method for array inputs uses array
allocation in the computation. It also splits the 53-bit double
mantissa (including the leading 1-bit) into two 26-bit doubles. This
can be changed to the reference dot2s method, which avoids array
allocation and uses Dekker's split to extract a 27-bit double and a
26-bit double, with support for sub-normal numbers. This would be the
implementation in the LinearCombinations.Dot2s class.
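For anyone not familiar with it, Dekker's split is a short error-free
transformation. A minimal sketch (my own illustration with made-up
names, not the Commons Numbers code; note the multiplication by the
splitting constant can overflow for very large inputs, which a real
implementation has to guard against):

```java
// Sketch of Dekker's split (illustration only, not the library code).
// Splits a double so that value == hi + lo exactly, with hi carrying
// the upper ~27 bits of the 53-bit significand and lo the rest.
public final class DekkerSplit {
    /** Dekker's splitting constant: 2^27 + 1. */
    private static final double MULTIPLIER = 1.0 + 0x1.0p27;

    /** Return the high part of the split; the low part is value - hi. */
    static double highPart(double value) {
        final double c = MULTIPLIER * value;
        return c - (c - value);
    }

    public static void main(String[] args) {
        final double x = 1.0 + 0x1.0p-30;
        final double hi = highPart(x);
        final double lo = x - hi;
        // The two parts recover the original exactly (error-free).
        System.out.println(hi == 1.0 && lo == 0x1.0p-30 && hi + lo == x);
    }
}
```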
Alternatively, there is a version that computes an exact summation
without using BigDecimal (LinearCombinations.ExtendedPrecision, using
ExactSum for the summation). It is slower than the current two-fold
precision method but much faster than using BigDecimal. Changing to
this implementation may have a big impact on the performance of
Geometry, which uses LinearCombination extensively.

The Jira ticket has a lot of performance information. This post [2]
summarises the speed relative to the current implementation as:

Value    Method    Relative
twoD     dot2s     0.73
twoD     extended  1.13
threeD   dot2s     0.75
threeD   extended  1.48
fourD    dot2s     0.75
fourD    extended  1.68
nD       dot2s     0.56
nD       extended  2.95

So an exact dot2s implementation is faster, and the extended precision
implementation is about 3x slower on long arrays but not much slower on
small combinations. Here the extended precision sum is accurate to
1 ULP. Making it exact to 0 ULP (extended2) takes a bit more time; the
performance is summarised later as:

"Using the extended2 method versus the fast dot2s will have a
performance impact dependent on the condition number of the dot product
and the length of the dot product. For short lengths (2 to 4) the
method is about 1.2 to 2.5x slower. Long dot products are much slower
(8x). The runtime is insignificant compared to using BigDecimal which
is a magnitude slower."

Any opinions on how changing LinearCombination may affect Geometry?

Either we clean up the implementation to use the fast dot2s algorithm
with correct support for splitting 53-bit mantissas, or we switch to
the extended precision version. But I do not think we should leave the
current implementation as it is, since it has disadvantages against
either of the alternatives.

Alex

[1] https://issues.apache.org/jira/browse/NUMBERS-142
[2] https://issues.apache.org/jira/browse/NUMBERS-142?focusedCommentId=17040111&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17040111
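P.S. For context on what the extended-precision route builds on: these
methods accumulate the exact round-off of each addition via an
error-free transformation. A minimal sketch of Knuth's two-sum (again
my own illustration with made-up names, not the ExactSum code from the
JMH examples package):

```java
// Sketch of Knuth's two-sum error-free transformation (illustration
// only; not the ExactSum implementation from the JMH examples).
public final class TwoSumDemo {
    /**
     * Compute the round-off of the floating-point sum a + b.
     * The pair ((a + b), error) represents a + b exactly.
     */
    static double twoSumError(double a, double b) {
        final double sum = a + b;
        final double bVirtual = sum - a;
        return (a - (sum - bVirtual)) + (b - bVirtual);
    }

    public static void main(String[] args) {
        // 1e100 + 1.0 rounds to 1e100; the error term recovers the
        // 1.0 lost to rounding.
        System.out.println(twoSumError(1e100, 1.0));
    }
}
```

Carrying these error terms through the summation is what lets the
extended methods reach 1 ULP (or, with extra effort, 0 ULP) without
BigDecimal.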