[
https://issues.apache.org/jira/browse/LUCENENET-523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663764#comment-13663764
]
Simon Svensson commented on LUCENENET-523:
------------------------------------------
Your code example does nothing to verify how stop words are handled. It sounds
like you're using the default stop words when indexing. Here is a quick test
proving that the words 'of' and 'the' are kept when using CharArraySet.EMPTY_SET:
{code}
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Tokenattributes;
using NUnit.Framework;
using Version = Lucene.Net.Util.Version;

[Test(Description = "Verify that StandardAnalyzer with empty stopwords keeps 'of' and 'the'.")]
public void StandardAnalyzerWithEmptyStopWords() {
    // An empty stop-word set means no term should be filtered out.
    var analyzer = new StandardAnalyzer(Version.LUCENE_30, CharArraySet.EMPTY_SET);
    var terms = ExtractTerms(analyzer, "test of the shazaam");
    CollectionAssert.AreEquivalent(new[] { "test", "of", "the", "shazaam" }, terms);
}

// Runs the analyzer over the text and collects every term it emits.
public static String[] ExtractTerms(Analyzer analyzer, String text) {
    using (var stringReader = new StringReader(text))
    using (var stream = analyzer.TokenStream("f", stringReader)) {
        var termAttr = stream.GetAttribute<ITermAttribute>();
        var terms = new List<String>();
        while (stream.IncrementToken()) {
            terms.Add(termAttr.Term);
        }
        return terms.ToArray();
    }
}
{code}
> StandardAnalyzer StopWords cannot be used
> -----------------------------------------
>
> Key: LUCENENET-523
> URL: https://issues.apache.org/jira/browse/LUCENENET-523
> Project: Lucene.Net
> Issue Type: Bug
> Reporter: Phinehas
>
> When the stop-words list is set to an empty set, the analyzer still removes
> English stop words such as "of" and "the". But I want to search for these
> common words in a phrase query.
> StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_30, CharArraySet.EMPTY_SET);
> IndexSearcher searcher = new IndexSearcher(FSDirectory.Open(indexDirectory));
> Lucene.Net.Index.IndexReader indexReader = Lucene.Net.Index.IndexReader.Open(FSDirectory.Open(indexDirectory), true);