When I think about it, I think the [NoOptimizations] might just be forcing some values to be saved to the stack as single-precision floats. I think it may work only for certain methods. The casting issue isn't fixed using NoOptimizations, in either my test program (which is just simple floating point math) or if I add it to several methods using in TestSimpleExplanations.TestDMQ8 (DisjunctionMaxScorer.Score, QueryUtils.CollectorAnonymousInnerClassHelper.Collect, and others).
I actually liked the NoOptimizations thing better, because it was more explicit than casting. At least when I see NoOptimizations in source, I usually assume I'm looking at a workaround for some jit issue. Plus, performance impact could be lessened if the methods where these issues happen are made small enough that NoOptimizations doesn't make much of a difference. -Christopher On Sun, May 31, 2015 at 7:08 PM, Christopher Currens < [email protected]> wrote: > As I finished writing this, I noticed your reponses above. I think the > NoOptimization is probably forcing float truncation which can be a good > thing. I wonder if it adversely affects performance. > > Anyway, more information on exactly what's happening. > > ======================= > > One last thing. I was able to reproduce this issue in a test project, and > after stepping through the native code, I can confirm that the issue is > limited to 32-bit processes and is a result of the use of the x87 > floating point coprocessor. It is *not* an issue with float to double > conversion, but is caused by the way the jitter might generate the code. > In short, it's not a bug, it's just some unfortunate behavior. I can put > the code in a gist if you want to see it. > > Anyway, the issue is that the returned value from Score() is stored in > the FPU register at 80-bit double-extended precision, thanks to the x87 > coprocessor. The first call scorer_.Score() which is stored in skipToScore > is saved onto the stack using `fstp dword ptr [addr]`. The dword ptr forces > `fstp` to store it as a single precision. Then, the inline call to > scorer_.Score() inside of the Assert.AreEqual statement is not actually > converted to a single before converted to a double. Instead, the return > value from Score() is stored using `fstp qword ptr [addr]`. Because it's > stored with a qword ptr, `fstp` treats it as a double precision, which > produces a much different value. > > When I ran through debugging this, here are the values I saw. After > calculating the first Score(): > > st0=1.60327445312500e+005 > > Storing this value into skipToScore uses instructions that stores it on > the stack here with this value: > > 160327.44 > > When calling Assert.Equals, it is pulled back into the st0 register as: > > st0=1.603274375000000000e+0005 > > with the expected loss of precision. It is compared against the original > value (since the second call to Store() produces that) and we get the > failure. > > I did figure out a way to fix it, although I'm not sure any of it is > ideal. If we explicitly cast to a float, it will truncate the value before > returning it. Casting in the Score() method is easy, since we can wrap > the statement in parenthesis and prepend it with a cast. Alternatively, > casting can be done on in QueryUtils.cs and you can cast the values in > Assert.AreEquals to float. The downside is resharper complains that the > casts aren't necessary, when they actually do make a difference in the > outcome. > > -Christopher > > On Sun, May 31, 2015 at 6:32 PM, Laimonas Simutis <[email protected]> > wrote: > >> Just tried something with TestFuzzyQuery.TestTieBreaker failure that I >> described in the previous email. Took it out of nunit and built a console >> app that does what the test is doing. Ran it compiled in Release mode on >> 32 >> bit machine, total hits was 2 (incorrect). Ran it on 64 bit machine, total >> hits was 5 (correct). Then took the method that is giving issues with >> rounding (CalculateMaxBoost) and marked it with >> [MethodImpl(MethodImplOptions.NoOptimization)] attribute and now the code >> returns correct results on both platforms. >> >> >> On Sun, May 31, 2015 at 8:36 PM, Laimonas Simutis <[email protected]> >> wrote: >> >> > Christopher, >> > >> > Thanks for confirming that you are seeing the same thing and for the >> > background info as to what potentially is going on. Really helpful >> > information. >> > >> > This test can pass at times because of random selection of values. The >> > better test that always fails and contains no randomness component to >> it is >> > this one: >> > >> http://teamcity.codebetter.com/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=192345#_focus=5721 >> > >> > In the test, this line in particular is the issue: >> > >> > >> > >> https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Core/Search/FuzzyTermsEnum.cs#L243 >> > >> > There is a code path where MaxEdits > 0 is true, termAfter is false and >> > "Bottom > CalculateMaxBoost(MaxEdits)" gets evaluated as true even >> though >> > the values should evaluate as equal. I confirm this with the same >> technique >> > by printing the numbers inside the loop. >> > >> > There is no conversion to double going on and I can get the test to fail >> > less frequently by precalculating max boost outside of the "while" >> > condition but even that just reduces the frequency of failures but does >> not >> > totally eliminate it. >> > >> > Will continue to investigate / look for solutions on this. In the >> meantime >> > I am open to any suggestions :) >> > >> > On Sun, May 31, 2015 at 2:33 AM, Christopher Currens < >> > [email protected]> wrote: >> > >> >> I was able to confirm that the 32-bit and 64-bit JVMs both emit code >> using >> >> SSE. So maybe there is something there, or maybe not. >> >> >> >> It's weird though, because if I run the test over and over (using the >> >> NUnit >> >> adapter in visual studio, so x86) it sometimes passes, and I'm not sure >> >> why. You are right, though, it is something related to the conversion >> >> between float and double. Every time it fails, I output the roundtrip >> >> string for both skipToScore and scorer_.Score() as floats and then >> casted >> >> as double. Every single time when it fails, the float values are >> exactly >> >> the same and those same float values casted to doubles produce >> different >> >> numbers. I mean, this is what you saw yourself in the tests, I'm just >> here >> >> to confirm I'm seeing the same thing (and it's puzzling). >> >> >> >> I feel like this one is out of our control (maybe a .NET bug?) and >> maybe >> >> the best fix is to to do what you've already done and avoid the >> conversion >> >> to double altogether via Assert.IsTrue. >> >> >> >> -Christopher >> >> >> >> >> >> On Sat, May 30, 2015 at 9:03 PM, Christopher Currens < >> >> [email protected]> wrote: >> >> >> >> > The .NET jitter emits different code to handle floating point >> >> instructions >> >> > in x86 vs x64. At least on my machine, I noticed that the native >> >> assembly >> >> > code generated by the jitter when running in x86 uses the x87 >> extensions >> >> > for floating point and in x64 it uses SSE. I believe that this is >> only >> >> an >> >> > issue when dealing with single-precision floating point numbers, >> which >> >> are >> >> > used pretty much everywhere in search. The reason is because the x87 >> >> > extensions, by default, use 80-bit double-extended precision >> internally >> >> > (thanks, Wikipedia!) whereas x64 uses single-precision instructions >> (and >> >> > thus the mantissa is truncated) which means we'll get different >> results >> >> > between the two architectures. >> >> > >> >> > Resharper defaults to x64. If I use the NUnit Test Adapter and run >> the >> >> > unit tests using visual studio directly, which runs in 32-bit mode, I >> >> can >> >> > get the tests to fail almost all the time. >> >> > >> >> > This is a good catch. I'm not sure if we should change nunit to be >> x64 >> >> > necessarily. It's possible that this is exposing a real code issue >> >> > somewhere, or at least an inconsistency in behavior between .NET and >> >> Java. >> >> > I think I might pull down the java code and see if there's a >> difference >> >> in >> >> > this test between a 32-bit and 64-bit JVM. I don't know what kind of >> >> > assembly instructions that are emitted by Java's jitter. >> >> > >> >> > -Christopher >> >> > >> >> > On Sat, May 30, 2015 at 6:47 PM, Laimonas Simutis <[email protected]> >> >> > wrote: >> >> > >> >> >> FINALLY I am able to reproduce it locally. Looking through TC build >> I >> >> >> noticed this: >> >> >> >> >> >> Running NUnit-2.6.3 tests under .NET Framework v4.0 x86 >> >> >> >> >> >> Note x86... So instead of running test via Resharper and built in >> >> NUnit, I >> >> >> ran it with nunit 2.6.3 via command line. Tests fail with the odd >> >> float >> >> >> issues if I run it with nunit-x86, and pass if I run it with >> nunit.exe >> >> >> (both version 2.6.3). I am on a 64 bit machine, and so are the TC >> build >> >> >> agents it seems. >> >> >> >> >> >> I am still not sure why this causes the failures to occur, but do we >> >> need >> >> >> to adjust what nunit build we use to run the tests? >> >> >> >> >> >> On Sat, May 30, 2015 at 4:28 PM, Laimonas Simutis <[email protected] >> > >> >> >> wrote: >> >> >> >> >> >> > >> >> >> > >> >> >> > On Sat, May 30, 2015 at 4:01 PM, Itamar Syn-Hershko < >> >> [email protected] >> >> >> > >> >> >> > wrote: >> >> >> > >> >> >> >> And when you refactor _scorer.Score() to be in a different line >> it >> >> >> passes >> >> >> >> 100% of the time on all platforms? that doesn't sound right. >> >> >> >> >> >> >> > >> >> >> > It continues to pass on mine (I can never get those to fail >> locally), >> >> >> and >> >> >> > ran the test several times on TC and it passed. I know, it sounds >> >> odd, >> >> >> I am >> >> >> > at a loss to explain it. >> >> >> > >> >> >> > >> >> >> >> >> >> >> >> Also, not in front of VS now, but AreEquals should already be >> doing >> >> >> this >> >> >> >> epsilon thing no? >> >> >> >> >> >> >> > >> >> >> > That's what I thought too. The only odd thing there is no "float" >> >> >> overload >> >> >> > and only "double" so not sure if conversion from float to double >> >> might >> >> >> be >> >> >> > introducing rounding issues here too. That's why I replaced it >> with >> >> >> epsilon >> >> >> > just to see what would happen and it still failed so then I went >> with >> >> >> > precalculating scorer_.Score() before comparison just to see what >> >> would >> >> >> > happen. >> >> >> > >> >> >> > And check this out. I put the comparison back like it used to be >> >> >> > (Assert.AreEquals) and wrapped in catch to output to console the >> >> values: >> >> >> > >> >> >> > float skipToScore = scorer_.Score(); >> >> >> > try >> >> >> > { >> >> >> > Assert.AreEqual(skipToScore, scorer_.Score(), MaxDiff, >> "unstable >> >> >> > skipTo(" + i + ") score!"); >> >> >> > } >> >> >> > catch (AssertionException ex) >> >> >> > { >> >> >> > Console.WriteLine("Failed, these two were deemed not equal:"); >> >> >> > Console.WriteLine(skipToScore.ToString("R")); >> >> >> > Console.WriteLine(scorer_.Score().ToString("R")); >> >> >> > throw; >> >> >> > } >> >> >> > >> >> >> > Look at the output on TC: >> >> >> > >> >> >> > Test(s) failed. unstable skipTo(3) score! >> >> >> > Expected: 115019.984375d +/- 0.0010000000474974513d >> >> >> > But was: 115019.98828125d >> >> >> > >> >> >> > ------- Stderr: ------- >> >> >> > Failed, these two were deemed not equal: >> >> >> > 115019.984 >> >> >> > 115019.984 >> >> >> > >> >> >> > You can see how the floats were converted to doubles and >> furthermore >> >> how >> >> >> > when I call Score() in catch section, it returns 115019.984 yet >> when >> >> it >> >> >> was >> >> >> > called in Assert it is outputting 115019.98828125d. and 0.988 and >> is >> >> off >> >> >> > from 0.984 by more than 0.001 (which is the value of MaxDiff). >> >> >> > >> >> >> > >> >> >> > >> >> >> > >> >> >> >> >> >> >> >> -- >> >> >> >> >> >> >> >> Itamar Syn-Hershko >> >> >> >> http://code972.com | @synhershko <https://twitter.com/synhershko >> > >> >> >> >> Freelance Developer & Consultant >> >> >> >> Lucene.NET committer and PMC member >> >> >> >> >> >> >> >> On Sat, May 30, 2015 at 10:46 PM, Laimonas Simutis < >> >> [email protected]> >> >> >> >> wrote: >> >> >> >> >> >> >> >> > Itamar, >> >> >> >> > >> >> >> >> > These float comparison are killing me :) I am pretty sure all >> the >> >> >> >> remaining >> >> >> >> > failures in core tests are related to float issues. >> >> >> >> > >> >> >> >> > I am trying to use epsilon here by changing >> >> >> >> > >> >> >> >> > AreEqual(skipToScore, scorer_.Score(), MaxDiff) to >> >> >> >> > IsTrue(Math.Abs(skipToScore - scorer_.Score()) < MaxDiff). >> >> >> >> > >> >> >> >> > It is similar to the link you provided except I am not >> >> >> >> > handling infinite and values close to 0, which are not expected >> >> and >> >> >> do >> >> >> >> not >> >> >> >> > occur in this test. >> >> >> >> > >> >> >> >> > I can get this test to pass by taking out scorer_.Score() >> >> calculation >> >> >> >> and >> >> >> >> > calculating it separately and then comparing, like this: >> >> >> >> > >> >> >> >> > var secondScore = scorer_.Score(); >> >> >> >> > IsTrue(Math.Abs(skipToScore - secondScore) < MaxDiff). >> >> >> >> > >> >> >> >> > In this case, the scorer_.Score() is doing a bunch of float >> adds >> >> / >> >> >> >> > multiplies ( >> >> >> >> > >> >> >> >> > >> >> >> >> >> >> >> >> >> >> https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Core/Search/DisjunctionMaxScorer.cs#L58 >> >> >> >> > ) >> >> >> >> > so I can see where rounding error could come in but still >> cannot >> >> >> explain >> >> >> >> > how it consistently fails on some env and not the others. Also >> >> have >> >> >> no >> >> >> >> idea >> >> >> >> > how to proceed with this issue besides changing the order of >> >> >> >> calculations, >> >> >> >> > like I did with the above to get it to pass. Just don't feel >> >> >> confident >> >> >> >> that >> >> >> >> > there is no bigger issue somewhere else. >> >> >> >> > >> >> >> >> > >> >> >> >> > Laimis >> >> >> >> > >> >> >> >> > >> >> >> >> > >> >> >> >> > On Sat, May 30, 2015 at 2:56 PM, Itamar Syn-Hershko < >> >> >> [email protected] >> >> >> >> > >> >> >> >> > wrote: >> >> >> >> > >> >> >> >> > > Float comparison is not as trivial - you should probably use >> >> >> epsilon >> >> >> >> -- >> >> >> >> > see >> >> >> >> > > http://stackoverflow.com/a/3875619/135701 for example >> >> >> >> > > >> >> >> >> > > -- >> >> >> >> > > >> >> >> >> > > Itamar Syn-Hershko >> >> >> >> > > http://code972.com | @synhershko < >> >> https://twitter.com/synhershko> >> >> >> >> > > Freelance Developer & Consultant >> >> >> >> > > Lucene.NET committer and PMC member >> >> >> >> > > >> >> >> >> > > On Sat, May 30, 2015 at 9:50 PM, <[email protected]> wrote: >> >> >> >> > > >> >> >> >> > > > Repository: lucenenet >> >> >> >> > > > Updated Branches: >> >> >> >> > > > refs/heads/failingtests bdf2899a0 -> 6a81f8606 >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> > > > use proper float comparison >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> > > > Project: >> >> http://git-wip-us.apache.org/repos/asf/lucenenet/repo >> >> >> >> > > > Commit: >> >> >> >> > >> http://git-wip-us.apache.org/repos/asf/lucenenet/commit/6a81f860 >> >> >> >> > > > Tree: >> >> >> >> http://git-wip-us.apache.org/repos/asf/lucenenet/tree/6a81f860 >> >> >> >> > > > Diff: >> >> >> >> http://git-wip-us.apache.org/repos/asf/lucenenet/diff/6a81f860 >> >> >> >> > > > >> >> >> >> > > > Branch: refs/heads/failingtests >> >> >> >> > > > Commit: 6a81f860671ab98fb7cd595af317b3d8521acc21 >> >> >> >> > > > Parents: bdf2899 >> >> >> >> > > > Author: Laimonas Simutis <[email protected]> >> >> >> >> > > > Authored: Sat May 30 14:49:35 2015 -0400 >> >> >> >> > > > Committer: Laimonas Simutis <[email protected]> >> >> >> >> > > > Committed: Sat May 30 14:49:35 2015 -0400 >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> >> >> ---------------------------------------------------------------------- >> >> >> >> > > > src/Lucene.Net.TestFramework/Search/QueryUtils.cs | 4 ++-- >> >> >> >> > > > 1 file changed, 2 insertions(+), 2 deletions(-) >> >> >> >> > > > >> >> >> >> >> >> ---------------------------------------------------------------------- >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> > > >> >> >> >> > >> >> >> >> >> >> >> >> >> >> http://git-wip-us.apache.org/repos/asf/lucenenet/blob/6a81f860/src/Lucene.Net.TestFramework/Search/QueryUtils.cs >> >> >> >> > > > >> >> >> >> >> >> ---------------------------------------------------------------------- >> >> >> >> > > > diff --git >> a/src/Lucene.Net.TestFramework/Search/QueryUtils.cs >> >> >> >> > > > b/src/Lucene.Net.TestFramework/Search/QueryUtils.cs >> >> >> >> > > > index 1156eee..6615d4c 100644 >> >> >> >> > > > --- a/src/Lucene.Net.TestFramework/Search/QueryUtils.cs >> >> >> >> > > > +++ b/src/Lucene.Net.TestFramework/Search/QueryUtils.cs >> >> >> >> > > > @@ -478,8 +478,8 @@ namespace Lucene.Net.Search >> >> >> >> > > > Assert.IsTrue(scorer_.Advance(i) >> != >> >> >> >> > > > DocIdSetIterator.NO_MORE_DOCS, "query collected " + doc + " >> >> but >> >> >> >> > skipTo(" >> >> >> >> > > + >> >> >> >> > > > i + ") says no more docs!"); >> >> >> >> > > > Assert.AreEqual(doc, >> scorer_.DocID(), >> >> >> >> "query >> >> >> >> > > > collected " + doc + " but skipTo(" + i + ") got to " + >> >> >> >> > scorer_.DocID()); >> >> >> >> > > > float skipToScore = >> scorer_.Score(); >> >> >> >> > > > - Assert.AreEqual(skipToScore, >> >> >> >> scorer_.Score(), >> >> >> >> > > > MaxDiff, "unstable skipTo(" + i + ") score!"); >> >> >> >> > > > - Assert.AreEqual(score, >> skipToScore, >> >> >> >> MaxDiff, >> >> >> >> > > > "query assigned doc " + doc + " a score of <" + score + "> >> but >> >> >> >> skipTo(" >> >> >> >> > > + i >> >> >> >> > > > + ") has <" + skipToScore + ">!"); >> >> >> >> > > > + >> Assert.IsTrue(Math.Abs(skipToScore - >> >> >> >> > > > scorer_.Score()) < MaxDiff, "unstable skipTo(" + i + ") >> >> score!"); >> >> >> >> > > > + Assert.AreEqual(Math.Abs(score - >> >> >> >> skipToScore) >> >> >> >> > < >> >> >> >> > > > MaxDiff, "query assigned doc " + doc + " a score of <" + >> >> score + >> >> >> "> >> >> >> >> but >> >> >> >> > > > skipTo(" + i + ") has <" + skipToScore + ">!"); >> >> >> >> > > > >> >> >> >> > > > // Hurry things along if they are >> >> going >> >> >> >> slow >> >> >> >> > (eg >> >> >> >> > > > // if you got SimpleText codec >> this >> >> will >> >> >> >> kick >> >> >> >> > > in): >> >> >> >> > > > >> >> >> >> > > > >> >> >> >> > > >> >> >> >> > >> >> >> >> >> >> >> > >> >> >> > >> >> >> >> >> > >> >> > >> >> >> > >> > >> > >
