chenhh021 opened a new issue, #780:
URL: https://github.com/apache/lucenenet/issues/780

   See [LUCENE-10008](https://issues.apache.org/jira/browse/LUCENE-10008). The 
same bug also applies to Lucene.NET.
   
   CommonGramsFilterFactory handles the "words" and "ignoreCase" config 
options inconsistently with StopFilterFactory: "ignoreCase=true" is not 
respected unless "words" is also specified.
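   
   The effect can be seen directly on the default word set. The following is a 
minimal sketch, assuming the Lucene.Net.Analysis.Common 4.8 package is 
referenced. `IgnoreCaseSketch` and `DefaultWords` are hypothetical names used 
only for illustration, and copying the defaults into a new `CharArraySet` is 
one possible way to honor "ignoreCase", not necessarily what the final fix does.
   
   ``` c#
   using System;
   using Lucene.Net.Analysis.Core;   // StopAnalyzer
   using Lucene.Net.Analysis.Util;   // CharArraySet
   using Lucene.Net.Util;            // LuceneVersion

   // Hypothetical helper for illustration; not part of Lucene.NET.
   public static class IgnoreCaseSketch
   {
       // Copies the default English stop words into a new set that uses
       // the requested case handling.
       public static CharArraySet DefaultWords(bool ignoreCase)
       {
           return new CharArraySet(
               LuceneVersion.LUCENE_48, StopAnalyzer.ENGLISH_STOP_WORDS_SET, ignoreCase);
       }

       public static void Main()
       {
           // Lookup against the shared default set, which is case-sensitive.
           Console.WriteLine(StopAnalyzer.ENGLISH_STOP_WORDS_SET.Contains("The"));
           // Lookup against a copy built with ignoreCase = true.
           Console.WriteLine(DefaultWords(true).Contains("The"));
       }
   }
   ```
   
   If the factory used the shared default set unchanged when "words" is absent, 
the first lookup is the behavior the test would observe.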
   
   Reproduce:
   ``` c#
   [Test]
   public void TestIgnoreCase()
   {
       IResourceLoader loader = new ClasspathResourceLoader(typeof(TestAnalyzers));
       CommonGramsFilterFactory factory =
           (CommonGramsFilterFactory)TokenFilterFactory(
               "CommonGrams", TEST_VERSION_CURRENT, loader, "ignoreCase", "true");
       CharArraySet words = factory.CommonWords;
       assertTrue("words is null and it shouldn't be", words != null);
       assertTrue(words.Contains("the")); // passes
       assertTrue(words.Contains("The")); // fails
       // Mixed-case input, so that the expected "The" tokens can be produced.
       Tokenizer tokenizer = new MockTokenizer(
           new StringReader("testing The factory"), MockTokenizer.WHITESPACE, false);
       TokenStream stream = factory.Create(tokenizer);
       AssertTokenStreamContents(
           stream, new string[] { "testing", "testing_The", "The", "The_factory", "factory" });
   }
   ```
   
   Working on a PR now.

