Hi Raffaello, I am the author of a recent publication on double to string conversion [1] - the Ryu algorithm. I've been aware of the problems with the Jdk for several years, and am very much looking forward to improvements in correctness and performance in this area.
I have done some testing against my Java implementation of the Ryu algorithm described in the linked paper. Interestingly, I've found a few cases where they output different results. In particular: 1.0E-323 is printed as 9.9E-324 1.0E-322 is printed as 9.9E-323 It's likely that there are more such cases - I only ran a sample of double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit output and Ryu is incorrect here. That's what you get when you have a special case for Java without a correctness proof. :-( In terms of performance, this algorithm performs almost exactly the same as my Java implementation of Ryu, although I'd like to point out that my C implementation of Ryu is quite a bit faster (though note that it generates different output, in particular, it only outputs a single digit of precision in the above cases, rather than two), and I didn't backport all the performance improvements from the Java version, yet. It looks like this is not coincidence - as far as I can see so far, it's algorithmically very similar, although it manages to avoid the loop I'm using in Ryu to find the shortest representation. I have a few comments: * <li> It rounds to {@code v} according to the usual round-to-closest * rule of IEEE 754 floating-point arithmetic. - Since you're spelling out the rounding rules just below, this is duplicated, and by itself, it's unclear since it doesn't specify the specific sub-type (round half even). - Naming: I'd strongly suggest to use variable names that relate to what's stored, e.g., m for mantissa, e for exponent, etc. - What's not clear to me is how the algorithm determines how many digits to print. - Also, it might be nicer to move the long multiplications to a helper method - at least from a short look, it looks like the computations of vn, vnl, and vnr are identical. - I looked through the spec, and it looks like all cases are well-defined. Yay! I will need some more time to do a more thorough review of the code and more testing for differences. Unfortunately, I'm also traveling the next two weeks, so this might take a bit of time. I'm not a contributor to the Jdk, and this isn't my full-time job. I was lurking here because I was going to send a patch for the double to string conversion code myself (based on Ryu). Thanks, -- Ulf [1] https://dl.acm.org/citation.cfm?id=3192369 [2] https://github.com/google/double-conversion [3] https://en.wikipedia.org/wiki/Rounding On Thu, Sep 27, 2018 at 11:03 AM Andrew Haley <a...@redhat.com> wrote: > On 09/26/2018 06:39 PM, raffaello.giulie...@gmail.com wrote: > > > The submitted code contains both the changes to the current > > implementation and extensive jtreg tests. > > > > While I've struggled to keep the code within the 80 chars/line limit, > > mercurial still generates longer lines. Thus, to avoid possible problems > > with the email systems, the code is submitted both inline and as an > > attachment. Hope at least one copy makes its way without errors. > > Overall, the commenting is much too light. There are many places > where I think I know what you're doing but you don't explain it. > > Here, for example: > > > + > > + // pow5 = pow51*2^63 + pow50 > > + long pow51 = ceilPow5dHigh(-k); > > + long pow50 = ceilPow5dLow(-k); > > + > > + // p = p2*2^126 + p1*2^63 + p0 and p = pow5 * cb > > + long x0 = pow50 * cb; > > + long x1 = multiplyHigh(pow50, cb); > > + long y0 = pow51 * cb; > > + long y1 = multiplyHigh(pow51, cb); > > + long z = (x1 << 1 | x0 >>> 63) + (y0 & MASK_63); > > + long p0 = x0 & MASK_63; > > + long p1 = z & MASK_63; > > + long p2 = (y1 << 1 | y0 >>> 63) + (z >>> 63); > > + long vn = p2 << 1 + ord2alpha | p1 >>> 62 - ord2alpha; > > + if ((p1 & mask) != 0 || p0 >= threshold) { > > + vn |= 1; > > + } > > ... etc. I think I can figure out what you're doing, but you could > explain it. > > If you write the comments now while the code is still fresh in your > mind it'll be easy. > > > + private static final long[] ceilPow5d = { > > + /* -292 */ 0x7FBB_D8FE_5F5E_6E27L, 0x497A_3A27_04EE_C3DFL, > > + /* -291 */ 0x4FD5_679E_FB9B_04D8L, 0x5DEC_6458_6315_3A6CL, > > + /* -290 */ 0x63CA_C186_BA81_C60EL, 0x7567_7D6E_7BDA_8906L, > > + /* -289 */ 0x7CBD_71E8_6922_3792L, 0x52C1_5CCA_1AD1_2B48L, > > What exactly is this table anyway? How was it generated? Please say. > > There are many more places in the code. What you've done is nice, but > it could be exemplary. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. <https://www.redhat.com> > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 >