Re: lucenenet git commit: use proper float comparison

Christopher Currens Sun, 31 May 2015 19:28:51 -0700

When I think about it, I think the [NoOptimizations] might just be forcing
some values to be saved to the stack as single-precision floats. I think it
may work only for certain methods. The casting issue isn't fixed using
NoOptimizations, in either my test program (which is just simple floating
point math) or if I add it to several methods using in
TestSimpleExplanations.TestDMQ8 (DisjunctionMaxScorer.Score,
QueryUtils.CollectorAnonymousInnerClassHelper.Collect, and others).


I actually liked the NoOptimizations thing better, because it was more
explicit than casting. At least when I see NoOptimizations in source, I
usually assume I'm looking at a workaround for some jit issue. Plus,
performance impact could be lessened if the methods where these issues
happen are made small enough that NoOptimizations doesn't make much of a
difference.

-Christopher

On Sun, May 31, 2015 at 7:08 PM, Christopher Currens <
[email protected]> wrote:

> As I finished writing this, I noticed your reponses above. I think the
> NoOptimization is probably forcing float truncation which can be a good
> thing. I wonder if it adversely affects performance.
>
> Anyway, more information on exactly what's happening.
>
> =======================
>
> One last thing. I was able to reproduce this issue in a test project, and
> after stepping through the native code, I can confirm that the issue is
> limited to 32-bit processes and is a result of the use of the x87
> floating point coprocessor. It is *not* an issue with float to double
> conversion, but is caused by the way the jitter might generate the code.
> In short, it's not a bug, it's just some unfortunate behavior. I can put
> the code in a gist if you want to see it.
>
> Anyway, the issue is that the returned value from Score() is stored in
> the FPU register at 80-bit double-extended precision, thanks to the x87
> coprocessor. The first call scorer_.Score() which is stored in skipToScore
> is saved onto the stack using `fstp dword ptr [addr]`. The dword ptr forces
> `fstp` to store it as a single precision. Then, the inline call to
> scorer_.Score() inside of the Assert.AreEqual statement is not actually
> converted to a single before converted to a double. Instead, the return
> value from Score() is stored using `fstp qword ptr [addr]`. Because it's
> stored with a qword ptr, `fstp` treats it as a double precision, which
> produces a much different value.
>
> When I ran through debugging this, here are the values I saw.  After
> calculating the first Score():
>
> st0=1.60327445312500e+005
>
> Storing this value into skipToScore uses instructions that stores it on
> the stack here with this value:
>
> 160327.44
>
> When calling Assert.Equals, it is pulled back into the st0 register as:
>
> st0=1.603274375000000000e+0005
>
> with the expected loss of precision. It is compared against the original
> value (since the second call to Store() produces that) and we get the
> failure.
>
> I did figure out a way to fix it, although I'm not sure any of it is
> ideal. If we explicitly cast to a float, it will truncate the value before
> returning it. Casting in the Score() method is easy, since we can wrap
> the statement in parenthesis and prepend it with a cast. Alternatively,
> casting can be done on in QueryUtils.cs and you can cast the values in
> Assert.AreEquals to float. The downside is resharper complains that the
> casts aren't necessary, when they actually do make a difference in the
> outcome.
>
> -Christopher
>
> On Sun, May 31, 2015 at 6:32 PM, Laimonas Simutis <[email protected]>
> wrote:
>
>> Just tried something with TestFuzzyQuery.TestTieBreaker failure that I
>> described in the previous email. Took it out of nunit and built a console
>> app that does what the test is doing. Ran it compiled in Release mode on
>> 32
>> bit machine, total hits was 2 (incorrect). Ran it on 64 bit machine, total
>> hits was 5 (correct). Then took the method that is giving issues with
>> rounding (CalculateMaxBoost) and marked it with
>> [MethodImpl(MethodImplOptions.NoOptimization)] attribute and now the code
>> returns correct results on both platforms.
>>
>>
>> On Sun, May 31, 2015 at 8:36 PM, Laimonas Simutis <[email protected]>
>> wrote:
>>
>> > Christopher,
>> >
>> > Thanks for confirming that you are seeing the same thing and for the
>> > background info as to what potentially is going on. Really helpful
>> > information.
>> >
>> > This test can pass at times because of random selection of values. The
>> > better test that always fails and contains no randomness component to
>> it is
>> > this one:
>> >
>> http://teamcity.codebetter.com/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=192345#_focus=5721
>> >
>> > In the test, this line in particular is the issue:
>> >
>> >
>> >
>> https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Core/Search/FuzzyTermsEnum.cs#L243
>> >
>> > There is a code path where MaxEdits > 0 is true, termAfter is false and
>> > "Bottom > CalculateMaxBoost(MaxEdits)" gets evaluated as true even
>> though
>> > the values should evaluate as equal. I confirm this with the same
>> technique
>> > by printing the numbers inside the loop.
>> >
>> > There is no conversion to double going on and I can get the test to fail
>> > less frequently by precalculating max boost outside of the "while"
>> > condition but even that just reduces the frequency of failures but does
>> not
>> > totally eliminate it.
>> >
>> > Will continue to investigate / look for solutions on this. In the
>> meantime
>> > I am open to any suggestions :)
>> >
>> > On Sun, May 31, 2015 at 2:33 AM, Christopher Currens <
>> > [email protected]> wrote:
>> >
>> >> I was able to confirm that the 32-bit and 64-bit JVMs both emit code
>> using
>> >> SSE. So maybe there is something there, or maybe not.
>> >>
>> >> It's weird though, because if I run the test over and over (using the
>> >> NUnit
>> >> adapter in visual studio, so x86) it sometimes passes, and I'm not sure
>> >> why. You are right, though, it is something related to the conversion
>> >> between float and double. Every time it fails, I output the roundtrip
>> >> string for both skipToScore and scorer_.Score() as floats and then
>> casted
>> >> as double. Every single time when it fails, the float values are
>> exactly
>> >> the same and those same float values casted to doubles produce
>> different
>> >> numbers. I mean, this is what you saw yourself in the tests, I'm just
>> here
>> >> to confirm I'm seeing the same thing (and it's puzzling).
>> >>
>> >> I feel like this one is out of our control (maybe a .NET bug?) and
>> maybe
>> >> the best fix is to to do what you've already done and avoid the
>> conversion
>> >> to double altogether via Assert.IsTrue.
>> >>
>> >> -Christopher
>> >>
>> >>
>> >> On Sat, May 30, 2015 at 9:03 PM, Christopher Currens <
>> >> [email protected]> wrote:
>> >>
>> >> > The .NET jitter emits different code to handle floating point
>> >> instructions
>> >> > in x86 vs x64. At least on my machine, I noticed that the native
>> >> assembly
>> >> > code generated by the jitter when running in x86 uses the x87
>> extensions
>> >> > for floating point and in x64 it uses SSE. I believe that this is
>> only
>> >> an
>> >> > issue when dealing with single-precision floating point numbers,
>> which
>> >> are
>> >> > used pretty much everywhere in search. The reason is because the x87
>> >> > extensions, by default, use 80-bit double-extended precision
>> internally
>> >> > (thanks, Wikipedia!) whereas x64 uses single-precision instructions
>> (and
>> >> > thus the mantissa is truncated) which means we'll get different
>> results
>> >> > between the two architectures.
>> >> >
>> >> > Resharper defaults to x64. If I use the NUnit Test Adapter and run
>> the
>> >> > unit tests using visual studio directly, which runs in 32-bit mode, I
>> >> can
>> >> > get the tests to fail almost all the time.
>> >> >
>> >> > This is a good catch. I'm not sure if we should change nunit to be
>> x64
>> >> > necessarily. It's possible that this is exposing a real code issue
>> >> > somewhere, or at least an inconsistency in behavior between .NET and
>> >> Java.
>> >> > I think I might pull down the java code and see if there's a
>> difference
>> >> in
>> >> > this test between a 32-bit and 64-bit JVM. I don't know what kind of
>> >> > assembly instructions that are emitted by Java's jitter.
>> >> >
>> >> > -Christopher
>> >> >
>> >> > On Sat, May 30, 2015 at 6:47 PM, Laimonas Simutis <[email protected]>
>> >> > wrote:
>> >> >
>> >> >> FINALLY I am able to reproduce it locally. Looking through TC build
>> I
>> >> >> noticed this:
>> >> >>
>> >> >> Running NUnit-2.6.3 tests under .NET Framework v4.0 x86
>> >> >>
>> >> >> Note x86... So instead of running test via Resharper and built in
>> >> NUnit, I
>> >> >> ran it  with nunit 2.6.3 via command line. Tests fail with the odd
>> >> float
>> >> >> issues if I run it with nunit-x86, and pass if I run it with
>> nunit.exe
>> >> >> (both version 2.6.3). I am on a 64 bit machine, and so are the TC
>> build
>> >> >> agents it seems.
>> >> >>
>> >> >> I am still not sure why this causes the failures to occur, but do we
>> >> need
>> >> >> to adjust what nunit build we use to run the tests?
>> >> >>
>> >> >> On Sat, May 30, 2015 at 4:28 PM, Laimonas Simutis <[email protected]
>> >
>> >> >> wrote:
>> >> >>
>> >> >> >
>> >> >> >
>> >> >> > On Sat, May 30, 2015 at 4:01 PM, Itamar Syn-Hershko <
>> >> [email protected]
>> >> >> >
>> >> >> > wrote:
>> >> >> >
>> >> >> >> And when you refactor _scorer.Score() to be in a different line
>> it
>> >> >> passes
>> >> >> >> 100% of the time on all platforms? that doesn't sound right.
>> >> >> >>
>> >> >> >
>> >> >> > It continues to pass on mine (I can never get those to fail
>> locally),
>> >> >> and
>> >> >> > ran the test several times on TC and it passed. I know, it sounds
>> >> odd,
>> >> >> I am
>> >> >> > at a loss to explain it.
>> >> >> >
>> >> >> >
>> >> >> >>
>> >> >> >> Also, not in front of VS now, but AreEquals should already be
>> doing
>> >> >> this
>> >> >> >> epsilon thing no?
>> >> >> >>
>> >> >> >
>> >> >> > That's what I thought too. The only odd thing there is no "float"
>> >> >> overload
>> >> >> > and only "double" so not sure if conversion from float to double
>> >> might
>> >> >> be
>> >> >> > introducing rounding issues here too. That's why I replaced it
>> with
>> >> >> epsilon
>> >> >> > just to see what would happen and it still failed so then I went
>> with
>> >> >> > precalculating scorer_.Score() before comparison just to see what
>> >> would
>> >> >> > happen.
>> >> >> >
>> >> >> > And check this out. I put the comparison back like it used to be
>> >> >> > (Assert.AreEquals) and wrapped in catch to output to console the
>> >> values:
>> >> >> >
>> >> >> > float skipToScore = scorer_.Score();
>> >> >> > try
>> >> >> > {
>> >> >> >     Assert.AreEqual(skipToScore, scorer_.Score(), MaxDiff,
>> "unstable
>> >> >> > skipTo(" + i + ") score!");
>> >> >> > }
>> >> >> > catch (AssertionException ex)
>> >> >> > {
>> >> >> >     Console.WriteLine("Failed, these two were deemed not equal:");
>> >> >> >     Console.WriteLine(skipToScore.ToString("R"));
>> >> >> >     Console.WriteLine(scorer_.Score().ToString("R"));
>> >> >> >     throw;
>> >> >> > }
>> >> >> >
>> >> >> > Look at the output on TC:
>> >> >> >
>> >> >> > Test(s) failed.   unstable skipTo(3) score!
>> >> >> >   Expected: 115019.984375d +/- 0.0010000000474974513d
>> >> >> >   But was:  115019.98828125d
>> >> >> >
>> >> >> > ------- Stderr: -------
>> >> >> > Failed, these two were deemed not equal:
>> >> >> > 115019.984
>> >> >> > 115019.984
>> >> >> >
>> >> >> > You can see how the floats were converted to doubles and
>> furthermore
>> >> how
>> >> >> > when I call Score() in catch section, it returns 115019.984 yet
>> when
>> >> it
>> >> >> was
>> >> >> > called in Assert it is outputting 115019.98828125d. and 0.988 and
>> is
>> >> off
>> >> >> > from 0.984 by more than 0.001 (which is the value of MaxDiff).
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >>
>> >> >> >> --
>> >> >> >>
>> >> >> >> Itamar Syn-Hershko
>> >> >> >> http://code972.com | @synhershko <https://twitter.com/synhershko
>> >
>> >> >> >> Freelance Developer & Consultant
>> >> >> >> Lucene.NET committer and PMC member
>> >> >> >>
>> >> >> >> On Sat, May 30, 2015 at 10:46 PM, Laimonas Simutis <
>> >> [email protected]>
>> >> >> >> wrote:
>> >> >> >>
>> >> >> >> > Itamar,
>> >> >> >> >
>> >> >> >> > These float comparison are killing me :) I am pretty sure all
>> the
>> >> >> >> remaining
>> >> >> >> > failures in core tests are related to float issues.
>> >> >> >> >
>> >> >> >> > I am trying to use epsilon here by changing
>> >> >> >> >
>> >> >> >> > AreEqual(skipToScore, scorer_.Score(), MaxDiff) to
>> >> >> >> > IsTrue(Math.Abs(skipToScore - scorer_.Score()) < MaxDiff).
>> >> >> >> >
>> >> >> >> > It is similar to the link you provided except I am not
>> >> >> >> > handling infinite and values close to 0, which are not expected
>> >> and
>> >> >> do
>> >> >> >> not
>> >> >> >> > occur in this test.
>> >> >> >> >
>> >> >> >> > I can get this test to pass by taking out scorer_.Score()
>> >> calculation
>> >> >> >> and
>> >> >> >> > calculating it separately and then comparing, like this:
>> >> >> >> >
>> >> >> >> > var secondScore = scorer_.Score();
>> >> >> >> > IsTrue(Math.Abs(skipToScore - secondScore) < MaxDiff).
>> >> >> >> >
>> >> >> >> > In this case, the scorer_.Score() is doing a bunch of float
>> adds
>> >> /
>> >> >> >> > multiplies (
>> >> >> >> >
>> >> >> >> >
>> >> >> >>
>> >> >>
>> >>
>> https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Core/Search/DisjunctionMaxScorer.cs#L58
>> >> >> >> > )
>> >> >> >> > so I can see where rounding error could come in but still
>> cannot
>> >> >> explain
>> >> >> >> > how it consistently fails on some env and not the others. Also
>> >> have
>> >> >> no
>> >> >> >> idea
>> >> >> >> > how to proceed with this issue besides changing the order of
>> >> >> >> calculations,
>> >> >> >> > like I did with the above to get it to pass. Just don't feel
>> >> >> confident
>> >> >> >> that
>> >> >> >> > there is no bigger issue somewhere else.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Laimis
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Sat, May 30, 2015 at 2:56 PM, Itamar Syn-Hershko <
>> >> >> [email protected]
>> >> >> >> >
>> >> >> >> > wrote:
>> >> >> >> >
>> >> >> >> > > Float comparison is not as trivial - you should probably use
>> >> >> epsilon
>> >> >> >> --
>> >> >> >> > see
>> >> >> >> > > http://stackoverflow.com/a/3875619/135701 for example
>> >> >> >> > >
>> >> >> >> > > --
>> >> >> >> > >
>> >> >> >> > > Itamar Syn-Hershko
>> >> >> >> > > http://code972.com | @synhershko <
>> >> https://twitter.com/synhershko>
>> >> >> >> > > Freelance Developer & Consultant
>> >> >> >> > > Lucene.NET committer and PMC member
>> >> >> >> > >
>> >> >> >> > > On Sat, May 30, 2015 at 9:50 PM, <[email protected]> wrote:
>> >> >> >> > >
>> >> >> >> > > > Repository: lucenenet
>> >> >> >> > > > Updated Branches:
>> >> >> >> > > >   refs/heads/failingtests bdf2899a0 -> 6a81f8606
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > > > use proper float comparison
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > > > Project:
>> >> http://git-wip-us.apache.org/repos/asf/lucenenet/repo
>> >> >> >> > > > Commit:
>> >> >> >> >
>> http://git-wip-us.apache.org/repos/asf/lucenenet/commit/6a81f860
>> >> >> >> > > > Tree:
>> >> >> >> http://git-wip-us.apache.org/repos/asf/lucenenet/tree/6a81f860
>> >> >> >> > > > Diff:
>> >> >> >> http://git-wip-us.apache.org/repos/asf/lucenenet/diff/6a81f860
>> >> >> >> > > >
>> >> >> >> > > > Branch: refs/heads/failingtests
>> >> >> >> > > > Commit: 6a81f860671ab98fb7cd595af317b3d8521acc21
>> >> >> >> > > > Parents: bdf2899
>> >> >> >> > > > Author: Laimonas Simutis <[email protected]>
>> >> >> >> > > > Authored: Sat May 30 14:49:35 2015 -0400
>> >> >> >> > > > Committer: Laimonas Simutis <[email protected]>
>> >> >> >> > > > Committed: Sat May 30 14:49:35 2015 -0400
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >>
>> >> ----------------------------------------------------------------------
>> >> >> >> > > >  src/Lucene.Net.TestFramework/Search/QueryUtils.cs | 4 ++--
>> >> >> >> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >> >> >> > > >
>> >> >> >>
>> >> ----------------------------------------------------------------------
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > >
>> >> >> >> >
>> >> >> >>
>> >> >>
>> >>
>> http://git-wip-us.apache.org/repos/asf/lucenenet/blob/6a81f860/src/Lucene.Net.TestFramework/Search/QueryUtils.cs
>> >> >> >> > > >
>> >> >> >>
>> >> ----------------------------------------------------------------------
>> >> >> >> > > > diff --git
>> a/src/Lucene.Net.TestFramework/Search/QueryUtils.cs
>> >> >> >> > > > b/src/Lucene.Net.TestFramework/Search/QueryUtils.cs
>> >> >> >> > > > index 1156eee..6615d4c 100644
>> >> >> >> > > > --- a/src/Lucene.Net.TestFramework/Search/QueryUtils.cs
>> >> >> >> > > > +++ b/src/Lucene.Net.TestFramework/Search/QueryUtils.cs
>> >> >> >> > > > @@ -478,8 +478,8 @@ namespace Lucene.Net.Search
>> >> >> >> > > >                          Assert.IsTrue(scorer_.Advance(i)
>> !=
>> >> >> >> > > > DocIdSetIterator.NO_MORE_DOCS, "query collected " + doc + "
>> >> but
>> >> >> >> > skipTo("
>> >> >> >> > > +
>> >> >> >> > > > i + ") says no more docs!");
>> >> >> >> > > >                          Assert.AreEqual(doc,
>> scorer_.DocID(),
>> >> >> >> "query
>> >> >> >> > > > collected " + doc + " but skipTo(" + i + ") got to " +
>> >> >> >> > scorer_.DocID());
>> >> >> >> > > >                          float skipToScore =
>> scorer_.Score();
>> >> >> >> > > > -                        Assert.AreEqual(skipToScore,
>> >> >> >> scorer_.Score(),
>> >> >> >> > > > MaxDiff, "unstable skipTo(" + i + ") score!");
>> >> >> >> > > > -                        Assert.AreEqual(score,
>> skipToScore,
>> >> >> >> MaxDiff,
>> >> >> >> > > > "query assigned doc " + doc + " a score of <" + score + ">
>> but
>> >> >> >> skipTo("
>> >> >> >> > > + i
>> >> >> >> > > > + ") has <" + skipToScore + ">!");
>> >> >> >> > > > +
>> Assert.IsTrue(Math.Abs(skipToScore -
>> >> >> >> > > > scorer_.Score()) < MaxDiff, "unstable skipTo(" + i + ")
>> >> score!");
>> >> >> >> > > > +                        Assert.AreEqual(Math.Abs(score -
>> >> >> >> skipToScore)
>> >> >> >> > <
>> >> >> >> > > > MaxDiff, "query assigned doc " + doc + " a score of <" +
>> >> score +
>> >> >> ">
>> >> >> >> but
>> >> >> >> > > > skipTo(" + i + ") has <" + skipToScore + ">!");
>> >> >> >> > > >
>> >> >> >> > > >                          // Hurry things along if they are
>> >> going
>> >> >> >> slow
>> >> >> >> > (eg
>> >> >> >> > > >                          // if you got SimpleText codec
>> this
>> >> will
>> >> >> >> kick
>> >> >> >> > > in):
>> >> >> >> > > >
>> >> >> >> > > >
>> >> >> >> > >
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
>

Re: lucenenet git commit: use proper float comparison

Reply via email to