On Oct 6, 2005, at 8:28 AM, Ahmed El-dawy wrote:
Thanks for your help.
I used PhraseQuery to boost close terms. I have thought of an idea for stop
words, but I don't know whether it has any drawbacks. I can index a dummy
Token in place of every stop word. This token will never be searched,
but it will be counted as a Token and will leave a gap between words.
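
A minimal sketch of the placeholder idea, in plain Python rather than Lucene code. The names STOP_WORDS, PLACEHOLDER, index_positions, and min_gap are hypothetical, and the stop list is an illustrative subset. It compares dropping stop words outright with substituting a dummy token that still occupies a position.

```python
# Illustrative stop list; a real analyzer would use a much larger one.
STOP_WORDS = {"all", "the", "a", "of"}
# Hypothetical dummy token; assumed never to appear in a query.
PLACEHOLDER = "_stop_"


def index_positions(text, keep_holes):
    """Map each indexed term to the list of positions where it occurs."""
    positions = {}
    pos = 0
    for word in text.lower().split():
        if word in STOP_WORDS:
            if not keep_holes:
                continue  # stop word vanishes; the next term takes its position
            word = PLACEHOLDER  # stop word still occupies a position
        positions.setdefault(word, []).append(pos)
        pos += 1
    return positions


def min_gap(positions, first, second):
    """Smallest forward distance from `first` to `second`, or None if absent."""
    gaps = [b - a
            for a in positions.get(first, [])
            for b in positions.get(second, [])
            if b > a]
    return min(gaps) if gaps else None


dropped = index_positions("welcome all there", keep_holes=False)
padded = index_positions("welcome all there", keep_holes=True)

print(min_gap(dropped, "welcome", "there"))  # → 1 (terms look adjacent)
print(min_gap(padded, "welcome", "there"))   # → 2 (the gap is preserved)
```

One cost that may be worth measuring: the placeholder gets a posting for every stop word occurrence, so the postings grow even though only a single extra term is added.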
Chris, you may consider using a modified version of the Nutch analysis
(http://lucene.apache.org/nutch/apidocs/org/apache/nutch/analysis/package-summary.html)
which has a very slick treatment of stopwords. Please refer to chapter
4, page 145 of Lucene in Action, written by Erik and Otis, for s
> > "welcome there"~9
> >
>
> The issue is that "all" is a stop word, though. The StopFilter does
> not leave a hole when stop words are removed, so indexing "welcome
> all there" is exactly the same as indexing "welcome there" as far as
> the index is concerned. I started to address this sit
On Oct 3, 2005, at 4:56 AM, Chris Lamprecht wrote:
1- Words in Document that are more close to original search terms have
a larger Score. For example, if I was searching for "wellcome",
Document("wellcome") must be better than Document("welcome")
I'm just "thinking out loud" here, but some i
Hi Ahmed,
> 2- Change some letters in the words with common spelling mistakes. For
> example, wellcome will be changed to welcome.
Sounds pretty cool
> 1- Words in Document that are more close to original search terms have
> a larger Score. For example, if I was searching for "wellcome",
> Docum
Hello,
I have made a new Analyzer that does the following:
1- Remove common prefixes and suffixes. For example, uncommon will be
converted to common.
2- Change some letters in the words with common spelling mistakes. For
example, wellcome will be changed to welcome.
3- Stop words are removed.
I
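
The three steps listed above can be sketched as follows, in plain Python rather than as a Lucene Analyzer. The prefix, suffix, spelling, and stop word tables are tiny hypothetical examples, not the rules of the analyzer described; only the "uncommon" and "wellcome" cases come from the message.

```python
PREFIXES = ("un",)            # hypothetical prefix list
SUFFIXES = ("ing", "s")       # hypothetical suffix list
SPELLING_FIXES = {"ll": "l"}  # hypothetical letter substitutions
STOP_WORDS = {"all", "the", "a", "is"}
MIN_STEM = 3                  # assumed guard so short words are not stripped away


def strip_affixes(word):
    """Remove at most one known prefix and one known suffix."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            word = word[:-len(suffix)]
            break
    return word


def fix_spelling(word):
    """Apply each letter substitution everywhere in the word."""
    for wrong, right in SPELLING_FIXES.items():
        word = word.replace(wrong, right)
    return word


def analyze(text):
    """Lowercase, drop stop words, then normalize the remaining tokens."""
    tokens = []
    for word in text.lower().split():
        if word in STOP_WORDS:
            continue
        tokens.append(fix_spelling(strip_affixes(word)))
    return tokens


print(analyze("Wellcome all uncommon"))  # → ['welcome', 'common']
```

The sketch tests for stop words before normalizing, so that the "ll" rule does not turn "all" into "al" and slip it past the stop list. The message does not say which order the real analyzer uses, and blanket substitutions like this one will also mangle correctly spelled words.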