Thanks, Patrick. The "t" entry was left over and should have been removed, as it was removed from the Java version too. I committed a new version of StopAnalyzer.cs to fix this defect.
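
For anyone following along, here is a minimal sketch of why the leftover "t" broke TestStandard. It assumes the tokenizer has already split "t-com" into "t" and "com". StopWordSketch and Filter are hypothetical stand-ins written for illustration, not the actual StopFilter API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the stop-word check; not the real Lucene.Net StopFilter.
public static class StopWordSketch
{
    // The corrected list from this thread, i.e. without "t".
    public static readonly string[] EnglishStopWords =
    {
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for",
        "if", "in", "into", "is", "it", "no", "not", "of", "on", "or",
        "such", "that", "the", "their", "then", "there", "these", "they",
        "this", "to", "was", "will", "with"
    };

    // Keeps only the terms that are not stop words, mirroring the
    // "if (!stopWords.Contains(termText)) return token;" check.
    public static List<string> Filter(IEnumerable<string> terms, IEnumerable<string> stopWords)
    {
        var stopSet = new HashSet<string>(stopWords);
        var kept = new List<string>();
        foreach (string term in terms)
        {
            if (!stopSet.Contains(term))
            {
                kept.Add(term);
            }
        }
        return kept;
    }
}
```

With the corrected list, Filter(new[] { "t", "com" }, StopWordSketch.EnglishStopWords) keeps both terms, which is what the test expects. If "t" is added back to the stop words, only "com" survives and the assertion fails.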
-- George

> -----Original Message-----
> From: Patrick Burrows [mailto:[EMAIL PROTECTED]
> Sent: Saturday, August 11, 2007 10:21 PM
> To: [email protected]
> Subject: Re: Apache Lucene.Net 2.1 build 002 "Beta" released
>
> By the way, if that is the correct fix, then changing this
> line in Analysis.StopAnalyzer.cs
>
>     public static readonly System.String[] ENGLISH_STOP_WORDS =
>         new System.String[]{"a", "an", "and", "are", "as", "at", "be", "but",
>         "by", "for", "if", "in", "into", "is", "it", "no", "not",
>         "of", "on", "or", "such", "t", "that", "the", "their",
>         "then", "there", "these", "they", "this", "to", "was",
>         "will", "with"};
>
> to
>
>     public static readonly System.String[] ENGLISH_STOP_WORDS =
>         new System.String[]{"a", "an", "and", "are", "as", "at", "be", "but",
>         "by", "for", "if", "in", "into", "is", "it", "no", "not",
>         "of", "on", "or", "such", "that", "the", "their", "then",
>         "there", "these", "they", "this", "to", "was", "will", "with"};
>
> causes that test to pass.
>
> On 8/11/07, Patrick Burrows <[EMAIL PROTECTED]> wrote:
> >
> > Hey George:
> >
> > Since this is my first attempt at a bug fix, I figured I would just
> > write up everything about it and see what the correct course is to
> > correct it:
> >
> > The first error that NUnit reports is that the TestStandard test is
> > failing. It is failing on this line:
> >
> >     AssertAnalyzesTo(a, "t-com", new System.String []{"t", "com"});
> >
> > And the reason this line is failing, ultimately, is because "t" is a
> > stop word and the Next() method in StopFilter.cs has this line:
> >
> >     if (!stopWords.Contains(termText))
> >         return token;
> >
> > The comments in TestStandard() regarding this line say this:
> >
> >     // t and s had been stopwords in Lucene <= 2.0, which made it impossible
> >     // to correctly search for these terms:
> >
> > It seems simple enough to remove "t" from the list of stop words. But
> > is this the correct way to fix the issue? Was there a deeper reason
> > that made "t" have to be in the list of stop words that should also be
> > checked? Am I thinking too much about it? :-)
> >
> > On 8/11/07, George Aroush <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi Joe,
> > >
> > > It is a merge, so it makes sense (and makes life easier) to fix the
> > > existing NUnit issues before we move on. Sorry for not making this
> > > clear.
> > >
> > > Regards,
> > >
> > > -- George
> > >
> > > > -----Original Message-----
> > > > From: Joe Shaw [mailto:[EMAIL PROTECTED]
> > > > Sent: Saturday, August 11, 2007 7:35 PM
> > > > To: [EMAIL PROTECTED]
> > > > Cc: [email protected]
> > > > Subject: Re: Apache Lucene.Net 2.1 build 002 "Beta" released
> > > >
> > > > Hi,
> > > >
> > > > On 8/11/07, George Aroush <[EMAIL PROTECTED]> wrote:
> > > > > I agree, and I see little value in having a full release of 2.1.
> > > > > However, before we start working on 2.2, we should fix the
> > > > > existing known issues with 2.1 that the NUnit tests have exposed;
> > > > > doing so will make the transition to 2.2 much easier. If we take
> > > > > this path, then we can leave 2.1 in a "non-supported" mode and
> > > > > move on to 2.2. Does everyone agree?
> > > >
> > > > When new versions of Lucene.Net are made, are they merges of the
> > > > changes from the previous Java version (converted somehow), or are
> > > > they totally new conversions with some of the .Net-isms merged in?
> > > > If the former, this definitely makes sense. If the latter, I
> > > > would think it makes more sense to skip 2.1 entirely.
> > > >
> > > > In any case, moving forward on either front is positive news.
> > > > Keep up the good work. :)
> > > >
> > > > Joe
> >
> > --
> > -
> > P
>
> --
> -
> P
