On 05/15/2015 08:10 PM, Luc Maisonobe wrote:
> On 14/05/2015 01:29, Gilles wrote:
>> On Wed, 13 May 2015 19:43:09 +0200, Luc Maisonobe wrote:
>>> On 13/05/2015 11:35, Gilles wrote:
>>>> Hello.
>>>>
>>>> On Wed, 13 May 2015 11:12:16 +0200, Luc Maisonobe wrote:
>>>>> Hi all,
>>>>>
>>>>> As Jenkins failures appeared again, I restarted work on FastMath.pow.
>>>>> I'll keep you informed as I make progress.
>>>>
>>>> Did anyone see those failures anywhere other than on the
>>>> identified machine used for CI?
>>>
>>> I did not reproduce them myself, and I guess most of us didn't succeed
>>> either (let's remember last year, when Thomas and Sebb tried to fix the
>>> exp function).
>>>
>>> However, the configuration is not a rare one (Ubuntu, Java 7, Sun JVM),
>>> so I fear this bug, even if rare, can hit several people.
>>
>> How much damage can it do, apart from a false positive on a unit test?
>> Isn't it rather the configuration that is faulty, rather than the CM code?
>
> There are at least some (small) problems in the code.
> I reimplemented the method, relying more heavily on the bit
> representation to detect special cases. The new implementation also
> delegates all integer powers to pow(double, long), which handles these
> cases directly with high-accuracy multiplication and reciprocal.
>
> This showed that our previous implementation was sometimes slightly
> wrong. For example, during one test in the gamma distribution suite, we
> compute 147.2421875^-142. Our former implementation returned
> 1.3785386632060381807...e-308, and the new implementation returns
> 1.3785386632060391688...e-308, whereas the exact result should be
> 1.3785386632060390905...e-308. These are denormalized numbers, i.e.
> their mantissas have fewer than 53 significant bits (here, they have 52
> bits). The mantissas are:
>
>   former implementation  0x9E9AA8184555B
>   new implementation     0x9E9AA8184555D
>   exact result           0x9E9AA8184555C.D7655353C...
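[Editor's note: to make the figures above concrete, here is a minimal sketch (not Commons Math code) showing how the mantissa bits of these denormalized results can be inspected. The literals are the truncated decimals quoted in the message; the digits given are more than enough to pin down the exact doubles, and their bit patterns differ by exactly 2, matching the roughly 2 ulp spread described.]

```java
public class SubnormalInspect {
    public static void main(String[] args) {
        // Truncated decimal values quoted in the message above.
        double former = 1.3785386632060381807e-308;
        double newer  = 1.3785386632060391688e-308;

        // Both are subnormal: positive but below the smallest normal double.
        if (!(former > 0 && former < Double.MIN_NORMAL)) throw new AssertionError();

        // For subnormals the 11-bit exponent field is all zeros, so the low
        // 52 bits of the representation are the complete (unnormalized) mantissa.
        long bitsFormer = Double.doubleToRawLongBits(former);
        long bitsNewer  = Double.doubleToRawLongBits(newer);
        System.out.println("former mantissa 0x"
                + Long.toHexString(bitsFormer & 0xFFFFFFFFFFFFFL).toUpperCase());
        System.out.println("newer  mantissa 0x"
                + Long.toHexString(bitsNewer & 0xFFFFFFFFFFFFFL).toUpperCase());

        // Adjacent subnormals differ by Double.MIN_VALUE, so the bit-pattern
        // difference counts ulps directly: these two results are 2 ulps apart.
        System.out.println("ulps apart: " + (bitsNewer - bitsFormer));
    }
}
```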
>
> So we had more than 1.8 ulp of error in this case, and now we have
> 0.16 ulp of error.
>
> Comparing all the calls made in our test suite for the full library, the
> differences between the implementations are always in the integer-power
> cases (which is expected if you look at the code); they are quite rare and
> generally only 1 ulp away. The case above with a 2 ulp difference (from
> -1.8 to +0.2) is rare.
>
> Performance is equivalent to the previous implementation: the new code is
> neither faster nor slower. On my 7-year-old desktop computer running
> Linux, FastMath.pow takes about 0.69 times as long as StrictMath.pow, as
> per Gilles' performance benchmarks.
>
> Two JUnit tests did not pass anymore, and it took me a long time to
> understand why. The errors were really related to those 1 or 2 ulp
> differences.
>
> The first test was for the BOBYQA optimizer, with the DiffPow
> function, which computes:
>
>   f(x) = abs(x0)^2 + abs(x1)^4 + abs(x2)^6 +
>          abs(x3)^8 + abs(x4)^10 + abs(x5)^12
>
> The last term (power 12) is sometimes 1 ulp away from the previous
> implementation (and, I think, closer to the exact result). The optimizer
> does converge to the expected result (which is obviously all xi = 0),
> but the number of iterations is extremely large and highly dependent
> on the accuracy. With the former implementation, it took about 8000
> iterations on my computer, for a limit set to 12000 for this test. With
> the new implementation, I needed to raise the limit to 21000 to avoid
> failing with a max-number-of-iterations error. This test really seems to
> be a stress test for this optimizer. The values computed are really
> small, down in the denormalized range.
>
> The second test was for the gamma distribution, corresponding to a shape
> parameter of 142. There was only one evaluation of pow that was
> different; it is the one shown above.
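[Editor's note: as a sketch, the DiffPow objective described above looks like the following; the actual test class in the BOBYQA suite may be written differently.]

```java
public class DiffPow {
    // f(x) = |x0|^2 + |x1|^4 + |x2|^6 + |x3|^8 + |x4|^10 + |x5|^12,
    // as given in the message; the minimum is at x = 0.
    static double value(double[] x) {
        double f = 0;
        for (int i = 0; i < x.length; i++) {
            f += Math.pow(Math.abs(x[i]), 2 * (i + 1)); // even powers 2, 4, ..., 12
        }
        return f;
    }

    public static void main(String[] args) {
        // Near the optimum, the high even powers push terms deep into the
        // subnormal range (e.g. (1e-26)^12 = 1e-312), which is why a 1 ulp
        // change in pow can move the iteration count so much.
        double[] nearOptimum = {1e-26, 1e-26, 1e-26, 1e-26, 1e-26, 1e-26};
        System.out.println(value(nearOptimum));
    }
}
```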
> The two ulp difference
> implied that the error in the non-overflow part of the distribution (with
> x from 0 to 150) was between 1 and 6 ulps, with a mean of about 3.54 ulps,
> whereas with the previous implementation it was always 0 or 1 ulp, with a
> mean slightly below 0.5 ulp. So I had to raise the test tolerance to make
> it pass. This seems strange to me. I tried to recompute the reference
> files using the Maxima script, increasing the accuracy from 64 digits to
> 96 digits, but got similar behaviour. The tests for the other shape
> parameters all pass (values 1, 8, 10, 100 and 1000). Only value 142 fails.
> I don't even know why this 142 value was used (if it were 42, I would have
> taken it as a well-known humorous reference). Perhaps it was already a
> borderline case.
>
> So here we are. A new implementation is available in the h10-builds
> branch. It is as fast as the previous one and is probably slightly more
> accurate for integer powers, but it makes two tests fail unless we adjust
> some test parameters or tolerances. The H10 host seems to no longer
> trigger the JIT optimization bug with this implementation.
>
> What do we do? Do we put this back into the master branch?
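[Editor's note: the ulp error figures discussed above can be computed in the usual bit-pattern way; here is a minimal illustrative helper (not the actual test code), valid for finite doubles of the same sign.]

```java
public class UlpDistance {
    // For two finite doubles of the same sign, the difference of their raw
    // bit patterns counts the number of representable doubles between them,
    // i.e. their distance in ulps.
    static long ulpDistance(double a, double b) {
        return Math.abs(Double.doubleToRawLongBits(a)
                - Double.doubleToRawLongBits(b));
    }

    public static void main(String[] args) {
        double x = 1.5;
        // Adjacent doubles are exactly 1 ulp apart.
        System.out.println(ulpDistance(x, Math.nextUp(x))); // prints 1
    }
}
```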
+1 to merge the changes back in.

Thomas
