Re: [I] Use ULP float comparison instead of epsilon-based comparison [lucene]

via GitHub Mon, 01 Sep 2025 20:38:14 -0700


sstults commented on issue #13789:
URL: https://github.com/apache/lucene/issues/13789#issuecomment-3243676744


   In the learning-to-rank plugins for OpenSearch and Elasticsearch I ran an 
evaluation across each of Lucene's similarity classes and the expected score 
computed from the LTR model. It probably won't surprise you to hear that the 
ULP tolerances I needed varied wildly from class to class, given that the 
number and kind of floating-point operations in each calculation differ so 
much.
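
For readers who want to measure how many ULPs separate an expected and an actual score, here is a minimal sketch. The class and method names (`UlpDistance`, `between`) are hypothetical and not part of Lucene or JUnit; it assumes the usual trick of mapping IEEE 754 bit patterns onto a monotonic integer line, and treats NaN as infinitely far from everything.

```java
final class UlpDistance {
  private UlpDistance() {}

  // Map a float's bit pattern to a long that increases with numeric value.
  // Negative floats are mirrored so that -0.0f and 0.0f both map to 0.
  private static long ordered(float f) {
    int bits = Float.floatToIntBits(f);
    return bits < 0 ? (long) Integer.MIN_VALUE - bits : bits;
  }

  /** Number of representable float steps between a and b (0 if equal). */
  static long between(float a, float b) {
    if (Float.isNaN(a) || Float.isNaN(b)) {
      return Long.MAX_VALUE; // assumption: NaN never counts as "close"
    }
    return Math.abs(ordered(a) - ordered(b));
  }
}
```

For example, `UlpDistance.between(1.0f, Math.nextUp(1.0f))` is 1, and `UlpDistance.between(1.0f, 2.0f)` is 8388608, one full binade of float values.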
   
   The [C++ Boost math library](https://www.boost.org/doc/libs/boost_1_75_0/libs/math/doc/html/math_toolkit/float_comparison.html) 
has some good references on this. What I got out of it is that a hybrid 
approach, combining absolute, relative, and ULP-based tolerances, is more 
reliable across more use cases.
   
   Something like:
   
   ```java
   import static org.junit.jupiter.api.Assertions.assertEquals;
   
   public final class FloatApprox {
     // Tune these three knobs per project/suite:
     private static final double ABS_FLOOR = 1e-7;   // protects near zero
     private static final double REL_TOL   = 1e-4;   // ~0.01% relative
     private static final int    ULP_MULT  = 16;     // a few ULPs headroom
   
     public static void assertApproxEquals(float expected, float actual) {
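       double mag = Math.max(Math.abs((double) expected), Math.abs((double) actual));
       // Use the float overload of Math.ulp: casting to double first would return
       // the far smaller double-precision spacing and make ULP_MULT ineffective.
       double ulp = Math.ulp(expected); // local float spacing around expected
       double delta = Math.max(ABS_FLOOR, Math.max(REL_TOL * mag, ULP_MULT * ulp));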
       assertEquals(expected, actual, delta);
     }
   }
   ```
   
   `ABS_FLOOR`
   - An absolute tolerance floor for very small numbers.
   - Without this, the relative tolerance shrinks toward zero along with the values, so even tiny absolute differences near zero would fail.
   
   `REL_TOL`
   - A relative tolerance factor (0.01% here).
   - Scales with magnitude, useful for mid-to-large values.
   
   `ULP_MULT`
   - A small multiplier of ULP size.
   - Gives some slack in terms of floating-point representation granularity.
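
To see which of the three knobs actually sets the tolerance for a given pair of values, a small sketch like the following can help. `ToleranceTerms` and `dominantTerm` are hypothetical names introduced only for illustration, and the sketch assumes the ULP is taken at float precision via `Math.ulp(float)`.

```java
final class ToleranceTerms {
  private ToleranceTerms() {}

  /** Returns the name of the knob that determines the tolerance for this pair. */
  static String dominantTerm(
      float expected, float actual, double absFloor, double relTol, int ulpMult) {
    double mag = Math.max(Math.abs((double) expected), Math.abs((double) actual));
    double relTerm = relTol * mag;
    // Float-precision spacing around expected, widened to double for the comparison.
    double ulpTerm = ulpMult * (double) Math.ulp(expected);
    if (absFloor >= relTerm && absFloor >= ulpTerm) {
      return "ABS_FLOOR";
    }
    return relTerm >= ulpTerm ? "REL_TOL" : "ULP_MULT";
  }
}
```

With the values from the snippet above (`1e-7`, `1e-4`, `16`), `dominantTerm(1e-5f, 1e-5f, 1e-7, 1e-4, 16)` returns `"ABS_FLOOR"` and `dominantTerm(12.5f, 12.5f, 1e-7, 1e-4, 16)` returns `"REL_TOL"`. The floor governs below a magnitude of roughly `1e-3` (`ABS_FLOOR / REL_TOL`). Since 16 float ULPs is at most about `1.9e-6` of a normal value, the ULP term should only take over once `REL_TOL` is tightened below that, for example `dominantTerm(12.5f, 12.5f, 1e-7, 1e-7, 16)` returns `"ULP_MULT"`.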
   
   You might even add these values as defaults to `assertApproxEquals()` and 
allow overriding as needed. 
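
A minimal sketch of that defaults-plus-override idea might look like the following. The class name `ConfigurableFloatApprox` and the constant names are hypothetical, and the sketch throws `AssertionError` directly instead of delegating to JUnit's `assertEquals` so that it runs standalone; a real version would presumably call into the test framework.

```java
final class ConfigurableFloatApprox {
  static final double DEFAULT_ABS_FLOOR = 1e-7;
  static final double DEFAULT_REL_TOL = 1e-4;
  static final int DEFAULT_ULP_MULT = 16;

  private ConfigurableFloatApprox() {}

  /** Compares using the default knobs. */
  static void assertApproxEquals(float expected, float actual) {
    assertApproxEquals(
        expected, actual, DEFAULT_ABS_FLOOR, DEFAULT_REL_TOL, DEFAULT_ULP_MULT);
  }

  /** Compares using caller-supplied knobs, e.g. for a stricter suite. */
  static void assertApproxEquals(
      float expected, float actual, double absFloor, double relTol, int ulpMult) {
    if (Float.compare(expected, actual) == 0) {
      return; // identical values, including matching infinities and NaN
    }
    double mag = Math.max(Math.abs((double) expected), Math.abs((double) actual));
    double ulp = Math.ulp(expected); // float-precision spacing around expected
    double delta = Math.max(absFloor, Math.max(relTol * mag, ulpMult * ulp));
    double diff = Math.abs((double) expected - (double) actual);
    if (!(diff <= delta)) { // written this way so a NaN difference also fails
      throw new AssertionError(
          "expected " + expected + " but was " + actual + " (tolerance " + delta + ")");
    }
  }
}
```

A suite that needs a pure ULP check could then call `assertApproxEquals(expected, actual, 0.0, 0.0, 4)`, while most callers keep the two-argument form.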
   
   I'm about to do this to the LTR plugin and I'll report back if I run into 
anything unexpected. I'd love to *not* do this myself though and instead call 
this method in either Lucene or JUnit (cc @sbrannen).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

