[
https://issues.apache.org/jira/browse/LUCENE-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597123#comment-13597123
]
Simon Willnauer commented on LUCENE-4813:
-----------------------------------------
bq. Can we do without the FieldStatistics/DocFreqStatistics/etc and just change
'freq' to long?
I really appreciate that this is an object I can pass in, for several
reasons. First, you can plug in your own stats if you want to, and it pulls a
Terms object only once, which I can provide. In my use case I call the same
DirectSpellChecker instance multiple times within one request to generate
candidates, so I can keep reusing my Terms / TermsEnum instance. That is a
small but, IMO, important cost saving that helps in my expert case. For users
who have used this class before, nothing really changes unless they want to
switch to totalTermFreq as their stats, and we can make that simple. We can
also make these classes package private; I am totally OK with that, to hide
this small complexity from the average user while still enabling the expert
user. The API stays the same, and if sumTotalTermFreq is available you also
get it in the SuggestWord. I would not want to fork this entire code just for
the sake of being able to reuse these statistics etc. If hiding this from the
user is the problem, then let's move to package private. If it's just you
"feeling" that this is too big of a change, then I am not moving, sorry.
> Allow DirectSpellchecker to use totalTermFrequency rather than docFrequency
> ---------------------------------------------------------------------------
>
> Key: LUCENE-4813
> URL: https://issues.apache.org/jira/browse/LUCENE-4813
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/spellchecker
> Affects Versions: 4.1
> Reporter: Simon Willnauer
> Fix For: 4.2, 5.0
>
> Attachments: LUCENE-4813.patch, LUCENE-4813.patch
>
>
> We have a bunch of new statistics in our term dictionaries that we should
> make use of where it makes sense. For DirectSpellChecker, totalTermFreq and
> sumTotalTermFreq might be better suited for spell correction on top of a
> fulltext index than docFreq and maxDoc.
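
For readers unfamiliar with the two kinds of statistics named in the issue
description, here is a minimal, Lucene-free sketch of the difference, along
with the "pass in a stats object" idea from the comment. It is only an
illustration under stated assumptions: the TermStats interface, the factory
methods and the toy in-memory "index" are hypothetical names invented here,
and are not the classes from the attached patch or any Lucene API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the LUCENE-4813 patch: a pluggable statistics
// object that a spell checker could be handed instead of a hard-wired docFreq.
final class TermStatsSketch {

    interface TermStats {
        long freq(String term); // per-term frequency used to rank candidates
        long total();           // normalizer for that frequency
    }

    // docFreq / maxDoc: number of documents containing the term, out of all documents.
    static TermStats docFreqStats(List<List<String>> docs) {
        final Map<String, Long> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String t : new HashSet<>(doc)) { // count each term once per document
                df.merge(t, 1L, Long::sum);
            }
        }
        final long maxDoc = docs.size();
        return new TermStats() {
            public long freq(String term) { return df.getOrDefault(term, 0L); }
            public long total() { return maxDoc; }
        };
    }

    // totalTermFreq / sumTotalTermFreq: occurrences of the term, out of all occurrences.
    static TermStats totalTermFreqStats(List<List<String>> docs) {
        final Map<String, Long> ttf = new HashMap<>();
        long sum = 0;
        for (List<String> doc : docs) {
            for (String t : doc) { // count every occurrence
                ttf.merge(t, 1L, Long::sum);
                sum++;
            }
        }
        final long sumTotalTermFreq = sum;
        return new TermStats() {
            public long freq(String term) { return ttf.getOrDefault(term, 0L); }
            public long total() { return sumTotalTermFreq; }
        };
    }

    public static void main(String[] args) {
        // Toy "index": two documents, already tokenized.
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("lucene", "lucene", "lucene", "search"),
            Arrays.asList("search", "index"));

        TermStats df = docFreqStats(docs);
        TermStats ttf = totalTermFreqStats(docs);

        System.out.println(df.freq("lucene") + "/" + df.total());   // prints 1/2
        System.out.println(ttf.freq("lucene") + "/" + ttf.total()); // prints 3/6
    }
}
```

In this toy data "lucene" occurs in only one document but three times overall,
so the two statistics rank it differently against "search", which is the kind
of effect the issue is concerned with for fulltext indexes. Building the stats
object once and handing it to repeated suggestion calls mirrors the reuse
argument made in the comment above.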