Olli Kuonanoja created LUCENE-8501:
--------------------------------------
Summary: An ability to define the sum method for custom term
frequencies
Key: LUCENE-8501
URL: https://issues.apache.org/jira/browse/LUCENE-8501
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Reporter: Olli Kuonanoja
Custom term frequencies allow expert users to index and score in custom ways.
However, _DefaultIndexingChain_ limits this, because the sum of a field's term
frequencies is accumulated in an int and must not overflow:
{code:java}
try {
  invertState.length = Math.addExact(invertState.length,
      invertState.termFreqAttribute.getTermFrequency());
} catch (ArithmeticException ae) {
  throw new IllegalArgumentException("too many tokens for field \"" +
      field.name() + "\"");
}
{code}
This can become an issue when the frequency data is encoded in a different
way, for example when a custom scorer works with float frequencies.
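To illustrate how quickly this can happen, here is a minimal standalone sketch. It assumes the float frequency is stored as its raw bit pattern via Float.floatToIntBits, which is only one possible encoding and is not prescribed anywhere in Lucene. The class and method names are hypothetical; only the Math.addExact accumulation mirrors the snippet above.

```java
// Hypothetical demo class, not part of Lucene.
class FloatFreqOverflowDemo {

    // Accumulates the same encoded frequency with Math.addExact, as the
    // indexing chain does, and returns how many tokens fit before overflow.
    static int tokensBeforeOverflow(int encodedFreq, int maxTokens) {
        int length = 0;
        for (int i = 0; i < maxTokens; i++) {
            try {
                length = Math.addExact(length, encodedFreq);
            } catch (ArithmeticException ae) {
                return i; // the (i + 1)-th token would overflow the int sum
            }
        }
        return maxTokens;
    }

    public static void main(String[] args) {
        // Assumed encoding: the raw IEEE 754 bits of the float frequency.
        int encoded = Float.floatToIntBits(1.0f);
        System.out.println(encoded);                            // prints 1065353216
        System.out.println(tokensBeforeOverflow(encoded, 10));  // prints 2
    }
}
```

With this encoding, a field holding just three tokens of frequency 1.0f would already trigger the "too many tokens" exception.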
A sum method could be added to _TermFrequencyAttribute_, giving something like
{code:java}
invertState.length =
    invertState.termFreqAttribute.addFrequency(invertState.length);
{code}
so that users can define the summing method and avoid the overflow exceptions.
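A rough sketch of how such an extension point might look, outside of Lucene. The interface SummableTermFrequency and the class FloatBitsTermFrequency are hypothetical stand-ins, not existing Lucene types; the default method keeps the current Math.addExact behaviour, and the float variant assumes both the frequency and the running length hold raw float bits.

```java
// Hypothetical stand-in for the proposed addition to TermFrequencyAttribute.
interface SummableTermFrequency {
    int getTermFrequency();

    // Default: exact int addition, which throws ArithmeticException on overflow.
    default int addFrequency(int currentLength) {
        return Math.addExact(currentLength, getTermFrequency());
    }
}

// Assumed custom encoding: frequency and running length are float bit patterns.
final class FloatBitsTermFrequency implements SummableTermFrequency {
    private final int encoded;

    FloatBitsTermFrequency(float value) {
        this.encoded = Float.floatToIntBits(value);
    }

    @Override
    public int getTermFrequency() {
        return encoded;
    }

    @Override
    public int addFrequency(int currentLength) {
        // Decode both operands, add as floats, and re-encode the result.
        float sum = Float.intBitsToFloat(currentLength) + Float.intBitsToFloat(encoded);
        return Float.floatToIntBits(sum);
    }
}

class AddFrequencyDemo {
    // Sums float frequencies through the hypothetical addFrequency hook.
    static float sumAsFloat(float... values) {
        int length = 0; // 0 is also the bit pattern of 0.0f
        for (float v : values) {
            length = new FloatBitsTermFrequency(v).addFrequency(length);
        }
        return Float.intBitsToFloat(length);
    }

    public static void main(String[] args) {
        System.out.println(sumAsFloat(1.0f, 1.0f, 1.0f)); // prints 3.0
    }
}
```

Keeping the overflow check as the default would leave existing users unaffected, while a custom attribute could opt into its own summing rule.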