You can find the list in StopAnalyzer.java:

public static final String[] ENGLISH_STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by",
    "for", "if", "in", "into", "is", "it",
    "no", "not", "of", "on", "or", "such",
    "that", "the", "their", "then", "there", "these",
    "they", "this", "to", "was", "will", "with"
};

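To use additional words, pass your own list to one of the four constructors that accept a stop word list. The list you supply replaces the default, so include the words from ENGLISH_STOP_WORDS as well if you still want them removed: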

StopAnalyzer(File stopwordsFile)
          Builds an analyzer with the stop words from the given file.
StopAnalyzer(Reader stopwords)
          Builds an analyzer with the stop words from the given reader.
StopAnalyzer(Set stopWords)
          Builds an analyzer with the stop words from the given set.
StopAnalyzer(String[] stopWords)
          Builds an analyzer which removes words in the provided array.
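
As a rough sketch of how the default list could be extended rather than replaced: the class below merges the default words with a few extras into one array, which could then be handed to the StopAnalyzer(String[] stopWords) constructor. The class name StopWordLists, the merge helper, and the extra words "however" and "thus" are made up for illustration. The array is copied from the list above so the example runs without Lucene on the classpath; with Lucene available you would presumably use StopAnalyzer.ENGLISH_STOP_WORDS directly.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class, not part of Lucene.
class StopWordLists {

    // Copy of the default list quoted above, so this sketch compiles on its own.
    static final String[] ENGLISH_STOP_WORDS = {
        "a", "an", "and", "are", "as", "at", "be", "but", "by",
        "for", "if", "in", "into", "is", "it",
        "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these",
        "they", "this", "to", "was", "will", "with"
    };

    // Returns the defaults followed by any extras not already present.
    // A LinkedHashSet drops duplicates while keeping insertion order.
    static String[] merge(String[] defaults, String... extras) {
        Set<String> merged = new LinkedHashSet<>(Arrays.asList(defaults));
        merged.addAll(Arrays.asList(extras));
        return merged.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "however" and "thus" are example additions only.
        String[] stopWords = merge(ENGLISH_STOP_WORDS, "however", "thus");
        System.out.println(stopWords.length + " stop words");

        // With Lucene on the classpath, the merged array would be passed to
        // the constructor listed above:
        //   Analyzer analyzer = new StopAnalyzer(stopWords);
    }
}
```

The same merged words could also be written one per line to a file and loaded with the File or Reader constructors, if you would rather keep the list outside the code.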


Good luck,
Ryan

On Apr 10, 2007, at 3:28 PM, sai hariharan wrote:

Hi,

Where can I find the list of words that is used for removal of
common English words by StopAnalyzer?
Can I add additional words to the stop list?

Regards,
--
சாய் Hari
