[ http://issues.apache.org/jira/browse/LUCENE-494?page=all ]
Mark Harwood updated LUCENE-494:
--------------------------------

    Attachment: QueryAutoStopWordAnalyzerTest.java

> Analyzer for preventing overload of search service by queries with common terms in large indexes
> ------------------------------------------------------------------------------------------------
>
>          Key: LUCENE-494
>          URL: http://issues.apache.org/jira/browse/LUCENE-494
>      Project: Lucene - Java
>         Type: New Feature
>   Components: Analysis
>     Reporter: Mark Harwood
>     Priority: Minor
>  Attachments: QueryAutoStopWordAnalyzer.java, QueryAutoStopWordAnalyzerTest.java
>
> An analyzer used primarily at query time to wrap another analyzer and provide a layer of protection
> which prevents very common words from being passed into queries. For very large indexes the cost
> of reading TermDocs for a very common word can be high. This analyzer was created after experience with
> a 38 million doc index which had a term in around 50% of docs and was causing TermQueries for
> this term to take 2 seconds.
> Use the various "addStopWords" methods in this class to automate the identification and addition of
> stop words found in an already existing index.
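
The idea in the description (find terms that occur in too large a share of documents, then keep them out of queries) can be sketched in plain Java without any Lucene dependency. This is only an illustration under stated assumptions, not the code in the attached QueryAutoStopWordAnalyzer.java: the class name DocFreqStopWords, its methods, and the maxPercentDocs parameter are all hypothetical, and documents are modelled as lists of already-tokenized terms rather than as an index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical, not the attached analyzer's API. */
final class DocFreqStopWords {

    /** Counts, for each term, how many documents contain it at least once. */
    static Map<String, Integer> docFreqs(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // De-duplicate within a document so it contributes at most 1 per term.
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    /** Returns terms found in more than maxPercentDocs (0.0 to 1.0) of all documents. */
    static Set<String> findStopWords(List<List<String>> docs, double maxPercentDocs) {
        double limit = maxPercentDocs * docs.size();
        Set<String> stopWords = new HashSet<>();
        for (Map.Entry<String, Integer> e : docFreqs(docs).entrySet()) {
            if (e.getValue() > limit) {
                stopWords.add(e.getKey());
            }
        }
        return stopWords;
    }

    /** Drops stop words from already-analyzed query tokens, preserving order. */
    static List<String> filterQuery(List<String> queryTokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : queryTokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

In this sketch, a caller would compute the stop set once from the existing documents with findStopWords(docs, 0.5) and then pass each query's tokens through filterQuery before building the query, so a term present in more than half of the documents never reaches the search. A real analyzer would read document frequencies from the index instead of rescanning documents.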