Re: [julia-users] ANN: Float128 and Float{256,512,1024} for common math

Fredrik Johansson Sun, 03 Jan 2016 12:10:54 -0800

On Saturday, January 2, 2016 at 5:50:58 PM UTC+1, Scott Jones wrote:
>
> This is very interesting!  I'm curious as to how it will compare to Unums, 
> as it seems both Fredrik Johannson's Arb and Unums are trying to fix a lot 
> of the same problems with current floating point.
>


Arb is a library for arbitrary-precision interval arithmetic, the most 
significant difference compared to e.g. MPFI being that it uses a mid-rad 
representation instead of an inf-sup representation.

The mid-rad representation is worse for representing "wide" intervals, but 
superior for "narrow" intervals. It goes very well together with 
high-precision arithmetic. If you're doing floating-point arithmetic with 
128+ bits of precision, you may as well carry along a low-precision error 
bound, because updating the error bound costs much less than the 
high-precision midpoint arithmetic.

Mid-rad arithmetic an old idea (there are earlier implementations in 
Mathemagix, iRRAM, and probably others). There is a nice paper by Joris van 
der Hoeven on "ball arithmetic" [1]. It should be noted that Arb does "ball 
arithmetic" for real numbers, and builds everything else out of those real 
numbers, rather than using higher-level "balls"; you get rectangular 
enclosures (and not mid-rad disks) for complex numbers, entrywise error 
bounds (and not matrix-norm "matricial balls") for matrices, and so on. 
This has pros and cons.

A few the technical details in the blog post that Stefan Karpinski linked 
to are outdated now; I've rewritten the underlying floating-point 
arithmetic in Arb to improve efficiency. It now uses a custom 
arbitrary-precision floating-point type (arf_t) for midpoints and a 
fixed-precision unsigned floating-point type (mag_t) for error bounds. The 
reason why Arb uses custom floating-point types instead of MPFR has a 
little to do with efficiency, and a little to do with aesthetics.

Concerning efficiency, the arf_t mantissa is allocated dynamically to the 
number of used bits rather than to the full working precision (MPFR always 
zero-pads to full precision); this helps when working with integer-valued 
coefficients and mixed-precision floating-point values. The arf_t type also 
avoids separate memory allocations completely up to 128 bits of precision; 
the 256-bit arf_t struct then stores the whole value directly without a 
separate heap allocation for the mantissa. The mag_t type, of course, is a 
lot faster than an arbitrary-precision type.

As a result, Arb sometimes uses less memory than MPFR, and less than half 
as much memory as MPFI, and has less overhead for temporary allocations and 
deallocations. However, arithmetic operations are not currently as well 
optimized as MPFR in all cases; additions are a bit slower, for example. I 
hope to work more on this in the future (for example, by always throwing 
away bits that are insignificant compared to the error bound).

Concerning aesthetics, I wanted bignum exponents and I wanted to avoid 
having any kind of global (or thread-local) state for default precision, 
exponent ranges, and exception flags.

Anyway, the number representation makes some overall difference in 
efficiency, but the biggest difference comes from algorithms. The selling 
point of Arb is that I've spent a lot of time optimizing high-precision 
computation of transcendental functions (this was the subject of my PhD 
thesis). You can find more details in the docs, in my papers, and on my 
blog. The implementation of elementary functions in Arb for precisions up 
to a few thousands of bits is described in [2].

Obviously, Arb is not meant to be competitive with double precision 
software. It seems to be competitive with some of the existing software for 
quad precision (~100 bits) arithmetic. At even higher precision, it should 
do well overall, if you know its limitations.

Fundamentally, interval arithmetic suffers from the dependency problem. It 
works perfectly in some instances, not at all in others. I wrote Arb mainly 
for doing computational number theory, where it typically happens to work 
very well.

It turns out to work quite nicely as a black box for isolated parts of 
floating-point computations, too -- you need to evaluate, say, some complex 
Bessel functions; you feed it a floating-point number, you get an interval 
back and convert it to a floating-point approximation that is guaranteed to 
be accurate, and you go on doing floating-point arithmetic, being confident 
that at least the part with the Bessel function isn't going to cause 
trouble. I recently wrote a simple Arb wrapper implementing transcendental 
functions for C99 complex doubles this way [3].

I'm not an expert on unums, though I've skimmed some of the online 
discussions and I did attend John Gustafson's talk at ARITH 22.

The most interesting selling point for unums seems to be that computers 
nowadays spend much more energy storing floating-point numbers in memory 
and moving them between different levels of memory than actually doing 
arithmetic with them. With a variable bit-length encoding, you trade some 
increase in CPU/GPU complexity (encoding/decoding, bitwise addressing, 
garbage collection) for a decrease in memory and cache usage, which 
supposedly leads to an overall improvement in efficiency (this is yet to be 
proved).

Unums need to be implemented in hardware to make any sense compared to 
traditional interval arithmetic based on pairs of IEEE 754 doubles (or 
floats, when sufficient), let alone compared to floating-point arithmetic 
as it is typically used for non-rigorous numerical computing. When you're 
paying a factor 100+ overhead manipulating bits in software, saving a 
factor 2-4 (optimistically) in memory overhead just isn't worth it.

I'm skeptical about the use of a single bit for error bounds. Simply 
summing equal-magnitude inexact numbers sequentially will blow up the error 
bound exponentially. You can get around this problem with pairwise 
summation, and fused operations help for various more complex operations, 
but there are limits to the effectiveness of such solutions. Interval 
blowup is not so much of a problem when you have thousands of spare bits of 
precision at your disposal, but it's a problem when you want to compete 
with 16-bit, 32-bit or 64-bit floats.

Fusing specific operations to reduce the dependency problem is something 
that perfectly well can be done on top of traditional interval arithmetic 
(and indeed is done in practice, in particular for transcendental 
functions); selling this as an advantage over traditional interval 
arithmetic is perhaps a bit disingenuous. More charitably, Gustafson can be 
given credit for recognizing the importance of fused operations and making 
them a part of the unum package. The other claimed advantages of unums 
(e.g. open endpoints) mostly strike me as gimmicks.

John Gustafson makes a very good point that (paraphrasing) we should spend 
computing power on computing more correctly instead of simply reducing the 
unit cost for computing nonsense. How to achieve this goal in practice is 
not obvious. I'm a fan of interval arithmetic and believe that it's a 
severely underused tool (much like automatic differentiation). It's 
especially nice for computer algebra, computational geometry, computational 
number theory, and the like. It can almost certainly be used in scientific 
computing to a much greater extent than today, e.g. for rigorously solving 
ODEs and PDEs. Whether in the shape of unums or something more traditional, 
it makes less sense for computer graphics, machine learning, or 
representing any kind of "noisy" data (there's a whole separate field of 
"fuzzy" interval arithmetic, which I know nothing about), so it's not 
likely to replace floating-point arithmetic as the backbone of numerical 
computing anytime soon.

[1] http://www.texmacs.org/joris/ball/ball-abs.html

[2] https://hal.inria.fr/hal-01079834/

[3] https://github.com/fredrik-johansson/arbcmath
 

> I'm curious about the Float128 type - this is based on Arb, and doesn't 
> seem to match the IEEE 754 binary128 floating point standard (which has 112 
> bits of fraction (plus "hidden" 1-bit), 15 bits of exponent + 1 bit sign).
> Should these maybe be called something else, so as not to cause confusion 
> with the IEEE standard binary floating point types?
>

I agree that they probably should be called something else.

Fredrik

Re: [julia-users] ANN: Float128 and Float{256,512,1024} for common math

Reply via email to