Steve,
This is great stuff. IMHO this should be the default behaviour, especially
from the point of view of someone who is new to Lucene. I cannot tell you how
surprised I was when someone searched for a phrase and a document was returned
in which the phrase did not exist except as "access, the manager". Your
Greg,
I include a patch below to StopFilter.java which should inhibit exact phrase
matching across a removed stopword.
[Would it be useful for this to be the default behavior?]
See the API docs for Token.setPositionIncrement(int):
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/analy
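
To illustrate the idea behind position increments, here is a minimal, self-contained sketch. It does not use Lucene and it is not the patch itself: the class PositionGapDemo, the Tok type, and both method names are hypothetical, made up for this example. It only assumes that a filter which drops a stop word can add the skipped position to the increment of the next token it keeps, so that an exact phrase match no longer treats the two surrounding terms as adjacent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an analysis chain; none of these names are Lucene APIs.
class PositionGapDemo {

    /** A term plus its distance from the previously emitted term (1 = adjacent). */
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /**
     * Drops stop words. When preserveGaps is true, each dropped word adds one
     * to the increment of the next kept token; otherwise every kept token
     * gets an increment of 1, as if the stop word had never been there.
     */
    static List<Tok> filterStopWords(List<String> terms, Set<String> stopWords,
                                     boolean preserveGaps) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0;
        for (String term : terms) {
            if (stopWords.contains(term)) {
                if (preserveGaps) {
                    skipped++;
                }
                continue;
            }
            out.add(new Tok(term, 1 + skipped));
            skipped = 0;
        }
        return out;
    }

    /** True if the phrase terms occur, in order, at consecutive absolute positions. */
    static boolean matchesExactPhrase(List<Tok> tokens, List<String> phrase) {
        // Turn relative increments into absolute positions.
        int[] pos = new int[tokens.size()];
        int current = -1;
        for (int i = 0; i < tokens.size(); i++) {
            current += tokens.get(i).positionIncrement;
            pos[i] = current;
        }
        for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
            boolean ok = true;
            for (int k = 0; k < phrase.size(); k++) {
                boolean sameTerm = tokens.get(start + k).term.equals(phrase.get(k));
                boolean adjacent = pos[start + k] == pos[start] + k;
                if (!sameTerm || !adjacent) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "access, the manager" after tokenization (punctuation already removed).
        List<String> terms = Arrays.asList("access", "the", "manager");
        Set<String> stop = new HashSet<>(Arrays.asList("the"));
        List<String> phrase = Arrays.asList("access", "manager");

        // prints "without gaps: true"
        System.out.println("without gaps: "
                + matchesExactPhrase(filterStopWords(terms, stop, false), phrase));
        // prints "with gaps: false"
        System.out.println("with gaps: "
                + matchesExactPhrase(filterStopWords(terms, stop, true), phrase));
    }
}
```

With the gap preserved, "manager" sits two positions after "access", so the exact phrase "access manager" no longer matches, while a query that allows some slop presumably still could.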
> One of these documents has the line "access, the
> manager". When searching for the phrase "access manager", this document is
> being returned. I understand why (at least I think I do): "the" is a stop
> word and the "," is being removed by the tokenizer. My question is,
> is there any w
On Thursday 17 July 2003 07:20, greg wrote:
> I have several document sections that are being indexed via the
> StandardAnalyzer. One of these documents has the line "access, the
> manager". When searching for the phrase "access manager", this document is
> being returned. I understand why (at l
I have several document sections that are being indexed via the
StandardAnalyzer. One of these documents has the line "access, the
manager". When searching for the phrase "access manager", this document is
being returned. I understand why (at least I think I do), because a stop
word is "the"