[jira] [Commented] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
[ https://issues.apache.org/jira/browse/LUCENE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819305#comment-16819305 ] Adrien Grand commented on LUCENE-8759: -- Maybe the test could explicitly test both normal and denormal floats all the time? Otherwise +1. I'm curious whether this makes any difference when running luceneutil? > BlockMaxConjunctionScorer's simplified way of computing max scores hurts > performance > > > Key: LUCENE-8759 > URL: https://issues.apache.org/jira/browse/LUCENE-8759 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: LUCENE-8759.patch > > > BlockMaxConjunctionScorer computes the minimum value that the score should > have after each scorer in order to be able to interrupt scorer as soon as > possible. For instance say scorers A, B and C produce maximum scores that are > equal to 4, 2 and 1. If the minimum competitive score is X, then the score > after scoring A, B and C must be at least X, the score after scoring A and B > must be at least X-1 and the score after scoring A must be at least X-1-2. > However this is made a bit more complex than that due to floating-point > numbers and the fact that intermediate score values are doubles which only > get casted to a float after all values have been summed up. In order to keep > things simple, BlockMaxConjunctionScore has the following comment and code > {code} > // Also compute the minimum required scores for a hit to be > competitive > // A double that is less than 'score' might still be converted to > 'score' > // when casted to a float, so we go to the previous float to avoid > this issue > minScores[minScores.length - 1] = minScore > 0 ? > Math.nextDown(minScore) : 0; > {code} > It simplifies the problem by calling Math.nextDown(minScore). However this is > problematic because it defeats the fact that TopScoreDocCollector calls > setMinCompetitiveScore on the float value that is immediately greater than > the k-th greatest hit so far. > nextDown(minScore) is not the value that we need. The value that we need is > the smallest double that converts to minScore when casted to a float, which > would be half-way between nextDown(minScore) and minScore. In some cases this > would help get better performance out of conjunctions, especially if some > clauses produce constant scores. > MaxScoreSumPropagator#setMinCompetitiveScore has the same issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
[ https://issues.apache.org/jira/browse/LUCENE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819179#comment-16819179 ] Jim Ferenczi commented on LUCENE-8759: -- Thanks that looks cleaner, I think I got confused because I was not removing 1 to the floatBits. I tested your solution on all positive floats and it works so I'll commit soon unless you have objections. > BlockMaxConjunctionScorer's simplified way of computing max scores hurts > performance > > > Key: LUCENE-8759 > URL: https://issues.apache.org/jira/browse/LUCENE-8759 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: LUCENE-8759.patch > > > BlockMaxConjunctionScorer computes the minimum value that the score should > have after each scorer in order to be able to interrupt scorer as soon as > possible. For instance say scorers A, B and C produce maximum scores that are > equal to 4, 2 and 1. If the minimum competitive score is X, then the score > after scoring A, B and C must be at least X, the score after scoring A and B > must be at least X-1 and the score after scoring A must be at least X-1-2. > However this is made a bit more complex than that due to floating-point > numbers and the fact that intermediate score values are doubles which only > get casted to a float after all values have been summed up. In order to keep > things simple, BlockMaxConjunctionScore has the following comment and code > {code} > // Also compute the minimum required scores for a hit to be > competitive > // A double that is less than 'score' might still be converted to > 'score' > // when casted to a float, so we go to the previous float to avoid > this issue > minScores[minScores.length - 1] = minScore > 0 ? > Math.nextDown(minScore) : 0; > {code} > It simplifies the problem by calling Math.nextDown(minScore). However this is > problematic because it defeats the fact that TopScoreDocCollector calls > setMinCompetitiveScore on the float value that is immediately greater than > the k-th greatest hit so far. > nextDown(minScore) is not the value that we need. The value that we need is > the smallest double that converts to minScore when casted to a float, which > would be half-way between nextDown(minScore) and minScore. In some cases this > would help get better performance out of conjunctions, especially if some > clauses produce constant scores. > MaxScoreSumPropagator#setMinCompetitiveScore has the same issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
[ https://issues.apache.org/jira/browse/LUCENE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819154#comment-16819154 ] Adrien Grand commented on LUCENE-8759: -- I gave a try at simplifying this a bit and have the following version which passes your test: {code} if (value <= 0) { throw new IllegalArgumentException("Value must be > 0, got " + value); } else if (Float.isFinite(value) == false) { // preserve infinity or NaN return value; } final int floatBits = Float.floatToIntBits(value); final int prevFloatBits = floatBits - 1; final int prevFloatExp = prevFloatBits >>> 23; // delta between the mantissa of the double representation of `value` and // the previous float value is 2^shift int shift = 52 - 23; if (prevFloatExp == 0x0) { // we need to tune `shift` for denormal floats whose mantissa doesn't have // an implicit leading bit shift += Integer.numberOfLeadingZeros(prevFloatBits) - (31-23); } long doubleBits = Double.doubleToLongBits(value); doubleBits -= (1L << (shift - 1)); // half way between the current float and the previous one doubleBits += (floatBits & 0x1); // add one if necessary to compensate for the fact that Java rounds to even in case of tie return Double.longBitsToDouble(doubleBits); {code} > BlockMaxConjunctionScorer's simplified way of computing max scores hurts > performance > > > Key: LUCENE-8759 > URL: https://issues.apache.org/jira/browse/LUCENE-8759 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: LUCENE-8759.patch > > > BlockMaxConjunctionScorer computes the minimum value that the score should > have after each scorer in order to be able to interrupt scorer as soon as > possible. For instance say scorers A, B and C produce maximum scores that are > equal to 4, 2 and 1. If the minimum competitive score is X, then the score > after scoring A, B and C must be at least X, the score after scoring A and B > must be at least X-1 and the score after scoring A must be at least X-1-2. > However this is made a bit more complex than that due to floating-point > numbers and the fact that intermediate score values are doubles which only > get casted to a float after all values have been summed up. In order to keep > things simple, BlockMaxConjunctionScore has the following comment and code > {code} > // Also compute the minimum required scores for a hit to be > competitive > // A double that is less than 'score' might still be converted to > 'score' > // when casted to a float, so we go to the previous float to avoid > this issue > minScores[minScores.length - 1] = minScore > 0 ? > Math.nextDown(minScore) : 0; > {code} > It simplifies the problem by calling Math.nextDown(minScore). However this is > problematic because it defeats the fact that TopScoreDocCollector calls > setMinCompetitiveScore on the float value that is immediately greater than > the k-th greatest hit so far. > nextDown(minScore) is not the value that we need. The value that we need is > the smallest double that converts to minScore when casted to a float, which > would be half-way between nextDown(minScore) and minScore. In some cases this > would help get better performance out of conjunctions, especially if some > clauses produce constant scores. > MaxScoreSumPropagator#setMinCompetitiveScore has the same issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
[ https://issues.apache.org/jira/browse/LUCENE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819049#comment-16819049 ] Jim Ferenczi commented on LUCENE-8759: -- I tried this approach but the shift for denormalized numbers is computed from the float bits so I started from the same bits for normalized numbers too. > BlockMaxConjunctionScorer's simplified way of computing max scores hurts > performance > > > Key: LUCENE-8759 > URL: https://issues.apache.org/jira/browse/LUCENE-8759 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: LUCENE-8759.patch > > > BlockMaxConjunctionScorer computes the minimum value that the score should > have after each scorer in order to be able to interrupt scorer as soon as > possible. For instance say scorers A, B and C produce maximum scores that are > equal to 4, 2 and 1. If the minimum competitive score is X, then the score > after scoring A, B and C must be at least X, the score after scoring A and B > must be at least X-1 and the score after scoring A must be at least X-1-2. > However this is made a bit more complex than that due to floating-point > numbers and the fact that intermediate score values are doubles which only > get casted to a float after all values have been summed up. In order to keep > things simple, BlockMaxConjunctionScore has the following comment and code > {code} > // Also compute the minimum required scores for a hit to be > competitive > // A double that is less than 'score' might still be converted to > 'score' > // when casted to a float, so we go to the previous float to avoid > this issue > minScores[minScores.length - 1] = minScore > 0 ? > Math.nextDown(minScore) : 0; > {code} > It simplifies the problem by calling Math.nextDown(minScore). However this is > problematic because it defeats the fact that TopScoreDocCollector calls > setMinCompetitiveScore on the float value that is immediately greater than > the k-th greatest hit so far. > nextDown(minScore) is not the value that we need. The value that we need is > the smallest double that converts to minScore when casted to a float, which > would be half-way between nextDown(minScore) and minScore. In some cases this > would help get better performance out of conjunctions, especially if some > clauses produce constant scores. > MaxScoreSumPropagator#setMinCompetitiveScore has the same issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
[ https://issues.apache.org/jira/browse/LUCENE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818980#comment-16818980 ] Adrien Grand commented on LUCENE-8759: -- I'm wondering whether the logic could be simplified: the current code takes care of both casting to a double and rounding to the smallest double, maybe it could cast the float to a double and then twiddle bits of the double value (maybe it doesn't help, just wondering). > BlockMaxConjunctionScorer's simplified way of computing max scores hurts > performance > > > Key: LUCENE-8759 > URL: https://issues.apache.org/jira/browse/LUCENE-8759 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: LUCENE-8759.patch > > > BlockMaxConjunctionScorer computes the minimum value that the score should > have after each scorer in order to be able to interrupt scorer as soon as > possible. For instance say scorers A, B and C produce maximum scores that are > equal to 4, 2 and 1. If the minimum competitive score is X, then the score > after scoring A, B and C must be at least X, the score after scoring A and B > must be at least X-1 and the score after scoring A must be at least X-1-2. > However this is made a bit more complex than that due to floating-point > numbers and the fact that intermediate score values are doubles which only > get casted to a float after all values have been summed up. In order to keep > things simple, BlockMaxConjunctionScore has the following comment and code > {code} > // Also compute the minimum required scores for a hit to be > competitive > // A double that is less than 'score' might still be converted to > 'score' > // when casted to a float, so we go to the previous float to avoid > this issue > minScores[minScores.length - 1] = minScore > 0 ? > Math.nextDown(minScore) : 0; > {code} > It simplifies the problem by calling Math.nextDown(minScore). However this is > problematic because it defeats the fact that TopScoreDocCollector calls > setMinCompetitiveScore on the float value that is immediately greater than > the k-th greatest hit so far. > nextDown(minScore) is not the value that we need. The value that we need is > the smallest double that converts to minScore when casted to a float, which > would be half-way between nextDown(minScore) and minScore. In some cases this > would help get better performance out of conjunctions, especially if some > clauses produce constant scores. > MaxScoreSumPropagator#setMinCompetitiveScore has the same issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org