Re: Analysers for newspaper pages...

2011-11-28 Thread Ian Lea
You can easily use just the CommonGrams stuff from Solr in your pure Lucene project. There are a couple of useful docs on stop words, common grams, et al. at http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1 and http://www.hathitrust.org/blogs/large-scale-search
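
For anyone unfamiliar with the idea, here is a rough plain-Java sketch of what common grams do to a token stream. It is not the Solr/Lucene CommonGrams code and uses no Lucene classes; the class name, the method name, and the tiny common-word list are all made up for illustration. The assumption is the usual behaviour: every single term is kept, and a bigram joined with "_" is added wherever either word of an adjacent pair is a common word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene or Solr.
public class CommonGramsSketch {

    /**
     * Returns every input token, plus a "left_right" bigram for each adjacent
     * pair in which at least one of the two words is in commonWords.
     */
    public static List<String> commonGrams(List<String> tokens, Set<String> commonWords) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current); // single terms are still kept
            if (i + 1 < tokens.size()) {
                String next = tokens.get(i + 1);
                if (commonWords.contains(current) || commonWords.contains(next)) {
                    out.add(current + "_" + next); // bigram overlaid on the unigrams
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed common-word list; a real one would come from corpus statistics.
        Set<String> common = new HashSet<>(Arrays.asList("the", "of"));
        List<String> tokens = Arrays.asList("the", "times", "of", "london");
        System.out.println(commonGrams(tokens, common));
        // prints [the, the_times, times, times_of, of, of_london, london]
    }
}
```

The point of the extra bigrams is that a phrase query containing a very common word can match the much rarer combined term instead of walking the long postings list of the common word on its own.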

Re: Analysers for newspaper pages...

2011-11-28 Thread Dawn Zoë Raison
Hi Steve,

On 28/11/2011 19:43, Steven A Rowe wrote:
> I assume that when you refer to "the impact of stop words," you're concerned about query-time performance? You should consider the possibility that performance without removing stop words is good enough that you won't have to take any steps

RE: Analysers for newspaper pages...

2011-11-28 Thread Steven A Rowe
Hi Dawn, I assume that when you refer to "the impact of stop words," you're concerned about query-time performance? You should consider the possibility that performance without removing stop words is good enough that you won't have to take any steps to address the issue. That said, there are