[
https://issues.apache.org/jira/browse/LUCENENET-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085274#comment-16085274
]
Shad Storhaug commented on LUCENENET-590:
-----------------------------------------
I took a look at the source for this method and it is exactly the same as in
Java, and it is still the same implementation in the master branch of Lucene.
{code:title=SpellChecker.cs|borderStyle=solid}
public virtual bool Exist(string word)
{
    // obtainSearcher calls ensureOpen
    IndexSearcher indexSearcher = ObtainSearcher();
    try
    {
        // TODO: we should use ReaderUtil+seekExact, we dont care about the docFreq
        // this is just an existence check
        return indexSearcher.IndexReader.DocFreq(new Term(F_WORD, word)) > 0;
    }
    finally
    {
        ReleaseSearcher(indexSearcher);
    }
}
{code}
The exact way it works depends on the implementation of the {{DocFreq()}}
method, which in turn depends on the {{Directory}} implementation used
(specifically, what type of {{AtomicReader}} is opened). I suspect all of the
built-in {{Directory}} implementations work similarly, but it is possible to
provide your own that has an alternate implementation.
The {{ReaderUtil.SeekExact()}} method mentioned doesn't exist in Lucene 4.8.0,
but the {{Exist()}} method is virtual so you can provide your own
implementation if it doesn't work exactly the way you like.
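As a rough illustration of overriding {{Exist()}}, here is a minimal sketch. The class name, the extra word set, and the policy of treating words shorter than 3 characters separately are all hypothetical and are not part of Lucene.NET. It only assumes that {{SpellChecker}} has a constructor taking a {{Directory}} and that {{Exist()}} is virtual, as shown above.

```csharp
using System.Collections.Generic;
using Lucene.Net.Search.Spell;
using Directory = Lucene.Net.Store.Directory;

// Hypothetical subclass: the name and the short-word policy are illustrative only.
public class ShortWordAwareSpellChecker : SpellChecker
{
    // Assumption: the caller supplies the short words it wants treated as known.
    private readonly ISet<string> knownShortWords;

    public ShortWordAwareSpellChecker(Directory spellIndex, IEnumerable<string> shortWords)
        : base(spellIndex)
    {
        knownShortWords = new HashSet<string>(shortWords);
    }

    public override bool Exist(string word)
    {
        // Answer short words from the in-memory set instead of the spell index.
        if (word != null && word.Length < 3)
        {
            return knownShortWords.Contains(word);
        }

        // Everything else falls through to the default index lookup.
        return base.Exist(word);
    }
}
```

Whether a separate set is the right design depends on what you expect {{Exist()}} to return for short words; the point is only that the default lookup can be wrapped without changing Lucene.NET itself.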
I suspect this is the correct default behavior. After all, words shorter than
3 characters are not often misspelled, and there would likely be a
performance penalty for checking them.
But there is no way to tell if this is the correct behavior without a sample of
the code, including the type of directory implementation you are using. Do note
that if you are using one of the {{FSDirectory.Open()}} overloads, the
implementation you get depends on your OS and whether you are on 32-bit or 64-bit.
The quickest way to check would be to provide a test in the TestSpellChecker
class
(https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Tests.Suggest/Spell/TestSpellChecker.cs)
that demonstrates a working and a failing case (either here or as a pull
request on GitHub), which could be ported back to Java to see if it behaves the
same way.
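A test along those lines might look roughly like the sketch below. It is a starting point under stated assumptions, not a verified reproduction: it uses plain NUnit rather than the Lucene.NET test framework base class, the fixture and method names are hypothetical, and the dictionary words are made up. The second assertion encodes the reporter's expectation, so it is the one expected to fail if the report is accurate.

```csharp
using System.IO;
using Lucene.Net.Index;
using Lucene.Net.Search.Spell;
using Lucene.Net.Store;
using Lucene.Net.Util;
using NUnit.Framework;

// Hypothetical fixture; in the real test suite this would be a method on TestSpellChecker.
[TestFixture]
public class ExistShortWordSketch
{
    [Test]
    public void TestExistWithTwoCharacterWord()
    {
        using (var dir = new RAMDirectory())
        using (var spellChecker = new SpellChecker(dir))
        {
            // Build the spell index from a small in-memory word list (one word per line).
            spellChecker.IndexDictionary(
                new PlainTextDictionary(new StringReader("ab\nabc\nabcd")),
                new IndexWriterConfig(LuceneVersion.LUCENE_48, null),
                false);

            // Working case: a 3-character word from the dictionary.
            Assert.IsTrue(spellChecker.Exist("abc"));

            // Reported failing case: a 2-character word from the same dictionary.
            // The outcome has not been verified here; this states the reporter's expectation.
            Assert.IsTrue(spellChecker.Exist("ab"));
        }
    }
}
```

Swapping {{RAMDirectory}} for the directory implementation actually in use would show whether the result depends on the {{Directory}} type.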
> SpellChecker.Exist() minimum word length
> -----------------------------------------
>
> Key: LUCENENET-590
> URL: https://issues.apache.org/jira/browse/LUCENENET-590
> Project: Lucene.Net
> Issue Type: Bug
> Components: Lucene.Net.Suggest
> Affects Versions: Lucene.Net 4.8.0
> Environment: .NET 4.6
> Reporter: Meta
>
> Hi,
> I'm not exactly sure if this is a bug or by design, but I've noticed that when
> using the Exist method of the SpellChecker,
> Lucene.Net.Search.Spell.SpellChecker.Exist(string), it does not check whether
> the word exists if the word is 2 characters long.
> Let me know if you have questions.
> Thanks