Does anyone have insight into the right way to stop boosts from being consolidated on multivalue fields?
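To make the question concrete, here is a rough sketch of the direction I have in mind. It is a standalone stand-in, not the real FieldInvertState: the class name PerValueBoostState and the methods addValueBoost(), getValueBoosts() and reset() are all made up for illustration, and it assumes that consolidation means multiplying the boosts of same-named fields together, which is my reading of the Field.setBoost() javadoc. The point is only that getBoost() could keep returning the consolidated value, so existing callers would not need to change, while the individual boosts are kept alongside it. It uses List<Float> because Java generics cannot hold a primitive float.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the boost-tracking part of FieldInvertState.
// This is NOT the real Lucene class; every name except getBoost() is invented.
class PerValueBoostState {
    // Consolidated boost: product of all value boosts recorded so far
    // (assumed to match how same-named fields are consolidated today).
    private float boost = 1.0f;

    // Individual boosts, in the order the values were added.
    private final List<Float> valueBoosts = new ArrayList<Float>();

    // Record the boost of one value of a multivalue field.
    void addValueBoost(float valueBoost) {
        valueBoosts.add(valueBoost);
        boost *= valueBoost;
    }

    // Keeps its current meaning, so code that calls getBoost() is unaffected.
    float getBoost() {
        return boost;
    }

    // Read-only view of the per-value boosts that are currently lost.
    List<Float> getValueBoosts() {
        return Collections.unmodifiableList(valueBoosts);
    }

    // Clear state before the next field is processed.
    void reset() {
        boost = 1.0f;
        valueBoosts.clear();
    }

    public static void main(String[] args) {
        PerValueBoostState state = new PerValueBoostState();
        state.addValueBoost(2.0f);
        state.addValueBoost(0.5f);
        System.out.println("consolidated: " + state.getBoost());
        System.out.println("per value:    " + state.getValueBoosts());
    }
}
```

This only covers holding the values in memory during inversion. Where the individual boosts would be stored in the index, and how scoring would read them back, is the part I do not know how to approach.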

On Fri, May 6, 2011 at 10:51 AM, Neil Hooey <[email protected]> wrote:
> After looking for places in the code where boosts are consolidated for
> multivalue fields, I found this class:
> lucene/src/java/org/apache/lucene/index/FieldInvertState.java
>
> Which has a "float boost" member variable. FieldInvertState seems to
> keep track of the index positions of several Fields with the same
> name.
>
> I'm considering changing that float to a "Vector<float>" to keep track
> of each individual field's boost, but there is a lot of code that
> calls FieldInvertState.getBoost(). The excerpts are listed at the end
> of this email.
>
> Does anyone have a good idea of how to get FieldInvertState to store
> boosts for each field, if that's even the right direction to go?
>
> Calls to FieldInvertState.getBoost():
> ----------------------------------------------------------------------
> $ ack --java -i 'state\.getBoost'
> lucene/contrib/misc/src/java/org/apache/lucene/misc/SweetSpotSimilarity.java
> 103: * Implemented as <code> state.getBoost() *
> 117: return state.getBoost() * computeLengthNorm(numTokens);
>
> lucene/contrib/misc/src/test/org/apache/lucene/index/TestFieldNormModifier.java
> 53: return state.getBoost() * (discountOverlaps ?
> state.getLength() - state.getNumOverlap() : state.getLength());
>
> lucene/contrib/misc/src/test/org/apache/lucene/misc/TestLengthNormModifier.java
> 58: return state.getBoost() * (discountOverlaps ?
> state.getLength() - state.getNumOverlap() : state.getLength());
> 179: return state.getBoost() * (discountOverlaps ?
> state.getLength() - state.getNumOverlap() : state.getLength());
>
> lucene/src/java/org/apache/lucene/search/DefaultSimilarity.java
> 26: * <code>state.getBoost()*lengthNorm(numTerms)</code>, where
> 40: return state.getBoost() * ((float) (1.0 / Math.sqrt(numTerms)));
>
> lucene/src/test/org/apache/lucene/index/TestIndexReaderCloneNorms.java
> 52: return state.getBoost();
>
> lucene/src/test/org/apache/lucene/index/TestNorms.java
> 51: return state.getBoost();
>
> lucene/src/test/org/apache/lucene/index/TestOmitTf.java
> 44: @Override public float computeNorm(FieldInvertState state)
> { return state.getBoost(); }
>
> lucene/src/test/org/apache/lucene/search/TestDisjunctionMaxQuery.java
> 67: return state.getBoost();
>
> lucene/src/test/org/apache/lucene/search/TestSimilarity.java
> 47: @Override public float computeNorm(FieldInvertState state)
> { return state.getBoost(); }
>
> lucene/src/test/org/apache/lucene/search/payloads/TestPayloadNearQuery.java
> 326: return state.getBoost();
>
> lucene/src/test/org/apache/lucene/search/payloads/TestPayloadTermQuery.java
> 319: return state.getBoost();
>
>
> On Thu, May 5, 2011 at 11:45 AM, Neil Hooey <[email protected]> wrote:
>> Currently when you assign boosts to multivalue fields during
>> index-time, they are consolidated, and the individual boosts are lost.
>>
>> There are some relevant cases where the individual boost values are
>> important, so I'd like to fix this behaviour.
>>
>> I've created an issue here, which gives some examples:
>> https://issues.apache.org/jira/browse/SOLR-2499
>>
>> Do you have any ideas of where to get started with this fix, or have
>> an idea of how difficult the fix might be?
>>
>> Thanks,
>>
>> - Neil
>>
