Re: Ignoring “de la” at index or search time

2019-03-01 Thread baris.kazar
This did not work; any suggestions, please? QueryParser parser = new QueryParser(columns[0], analyzer); Query query5 = parser.parse(q + "~"); I can't set the slop value with parser.setPhraseSlop(slopValue); I still see the query printed with the value 2: Query5:: :~2 Best regards On
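
A note on the snippet above, offered as a guess about the poster's setup rather than a diagnosis: in the classic query parser syntax, a trailing ~ on an unquoted term is normally read as a fuzzy match, and the default maximum edit distance for fuzzy matches is 2, which may be why the printed query shows ~2. Phrase slop is written as a quoted phrase followed by ~N, and setPhraseSlop only supplies a default for phrase queries, so it would have no visible effect when no phrase is involved. The sketch below only builds the query string; the class and method names are hypothetical and no Lucene call is made.

```java
public class PhraseSlopSyntax {
    // Hypothetical helper: wraps the user's text in double quotes and appends ~N,
    // the classic query syntax for a phrase with slop N.
    static String phraseWithSlop(String text, int slop) {
        // Escape backslashes and embedded quotes so the phrase stays one quoted unit.
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"~" + slop;
    }

    public static void main(String[] args) {
        // Compare with q + "~", which would produce: c b~
        System.out.println(phraseWithSlop("c b", 3)); // prints "c b"~3
    }
}
```

The resulting string would be passed to parser.parse(...) in place of q + "~".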

Re: Ignoring “de la” at index or search time

2019-02-25 Thread baris.kazar
OK, found the answer to this question: parser.setPhraseSlop(slopValue); Thanks. On 2/25/19 11:43 AM, baris.ka...@oracle.com wrote: Thanks Erick, that was very helpful. Now I see what you meant at the beginning of this thread: stopwords are less of a concern. Now, may I ask the following

Re: Ignoring “de la” at index or search time

2019-02-25 Thread baris.kazar
Thanks Erick, that was very helpful. Now I see what you meant at the beginning of this thread: stopwords are less of a concern. Now, may I ask the following related question? QueryParser parser = new QueryParser(columns[0], analyzer); Query query5 = parser.parse(q + "~"); I see the query

Re: Ignoring “de la” at index or search time

2019-02-24 Thread Erick Erickson
Case 1. Stopwords are irrelevant. If you search field:(a AND b) you're asking if both appear in the field, and that's the only question. It doesn't matter what other words are in the field. It doesn't matter whether they're close to each other. Case 2. Yep. On Sun, Feb 24, 2019, 17:02
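
The point of Case 1 can be illustrated with a toy model in plain Java. This is not how Lucene evaluates a Boolean query internally; it only assumes a whitespace-tokenized, lowercased field, and the class and method names are hypothetical. It shows that an AND over two terms asks only whether both are present, regardless of what lies between them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class BooleanAndDemo {
    // Toy model of field:(a AND b): true when every term occurs somewhere in the field.
    static boolean containsAll(String fieldText, String... terms) {
        Set<String> tokens = new HashSet<>(Arrays.asList(fieldText.toLowerCase().split("\\s+")));
        return tokens.containsAll(Arrays.asList(terms));
    }

    public static void main(String[] args) {
        // Intervening words and word order play no role in the result.
        System.out.println(containsAll("a de la b", "a", "b")); // prints true
        System.out.println(containsAll("a de la", "a", "b"));   // prints false
    }
}
```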

Re: Ignoring “de la” at index or search time

2019-02-24 Thread baris.kazar
There is PhraseQuery, too, but let's consider two cases. Case 1: PhraseQuery is not being used. Should I then add the French stopwords to the standard filter's stopwords at both index and search time? Can I just add them at search time and keep the old index as it is? [gotta disable this editing

Re: Ignoring “de la” at index or search time

2019-02-24 Thread baris.kazar
There is PhraseQuery, too, but let's consider two cases. Case 1: PhraseQuery is not being used. Should I then add the French stopwords to the standard filter's stopwords at both index and search time? Can I just add them at search time and keep the old index as it is? Case 2: that

Re: Ignoring “de la” at index or search time

2019-02-24 Thread baris.kazar
That is not what I am looking for. Thanks. The search string c b finds a b but somehow can't find a de la b, so I will try French stopwords. For that, I am using 8 queries like the ones I mentioned. Best > On Feb 24, 2019, at 1:19 PM, Erick Erickson wrote: > > Phrase search is looking for words next

Re: Ignoring “de la” at index or search time

2019-02-24 Thread Erick Erickson
Phrase search is looking for words next to each other. A phrase search on the text “my dog has fleas” would succeed for “my dog” or “has fleas” but not “my fleas” since the words are not right next to each other. “my fleas”~3 would succeed because the “~3” indicates that the words can have
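
The "my dog has fleas" example can be sketched in plain Java. This is a deliberately simplified model, not Lucene's sloppy-phrase algorithm: it assumes a two-word phrase whose words appear in order, and treats the number of intervening words as the slop needed. The class and method names are hypothetical.

```java
public class SlopDemo {
    // Toy model: the phrase "first second" matches when some occurrence of `second`
    // follows some occurrence of `first` with at most `slop` words in between.
    static boolean matchesInOrder(String text, String first, String second, int slop) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(first)) continue;
            for (int j = i + 1; j < tokens.length; j++) {
                // j - i - 1 is the number of words between the two terms
                if (tokens[j].equals(second) && (j - i - 1) <= slop) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "my dog has fleas";
        System.out.println(matchesInOrder(text, "my", "dog", 0));   // prints true
        System.out.println(matchesInOrder(text, "my", "fleas", 0)); // prints false
        System.out.println(matchesInOrder(text, "my", "fleas", 3)); // prints true
    }
}
```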

Re: Ignoring “de la” at index or search time

2019-02-24 Thread baris.kazar
I guess so. What is phrase search? If c b is searched, do you expect a de la b? Thanks > On Feb 24, 2019, at 10:49 AM, Erick Erickson wrote: > > Not sure we’re talking about the same thing. I was talking specifically about > _phrase_ searches. If all you want is the clause you just said, phrases

Re: Ignoring “de la” at index or search time

2019-02-24 Thread Erick Erickson
Not sure we’re talking about the same thing. I was talking specifically about _phrase_ searches. If all you want is the clause you just said, phrases are not involved at all and the presence or absence of intervening words is totally irrelevant. This assumes your field type tokenizes the input

Re: Ignoring “de la” at index or search time

2019-02-23 Thread baris.kazar
In this case the search string is c b, and the search query has 8 combinations, including two cases with c b ~, meaning find everything containing c AND b, and c OR b (two separate queries using ~). With those I can find a b but not a de la b without French stopwords. Thanks > On Feb 23, 2019, at 6:52 PM, Erick

Re: Ignoring “de la” at index or search time

2019-02-23 Thread Erick Erickson
Lucene won’t ignore these unless you tell it to via stopwords. This is a problem no matter how you look at it. If you do put in stopwords, the word _positions_ are retained. In your example:

word  position
a     1
de    2
la    3
b     4

If you remove “de” and “la” via
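
The position retention described above can be illustrated with a small plain-Java sketch. It is a toy model under stated assumptions (whitespace tokenization, 1-based positions, a hypothetical two-word stopword set), not Lucene's analysis chain. It shows that after dropping "de" and "la", "a" stays at position 1 and "b" stays at position 4, so a gap remains between them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class StopwordPositions {
    // Toy model: number the tokens from 1, then drop stopwords while keeping
    // the original position of every surviving token.
    static Map<String, Integer> positionsWithoutStopwords(String text, Set<String> stopwords) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        int position = 0;
        for (String token : text.toLowerCase().split("\\s+")) {
            position++; // the position advances even when the token is dropped
            if (!stopwords.contains(token)) {
                kept.put(token, position);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> french = Set.of("de", "la"); // hypothetical stopword set
        System.out.println(positionsWithoutStopwords("a de la b", french)); // prints {a=1, b=4}
    }
}
```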

Re: Ignoring “de la” at index or search time

2019-02-23 Thread baris.kazar
Thanks Erick. There is a pattern I can't catch in my results, such as: a de la b. I do catch “a b”, though. I thought Lucene might ignore those automatically while creating the index. > On Feb 23, 2019, at 12:29 PM, Erick Erickson wrote: > > Use stopwords, although it's becoming less of a concern, why do

Re: Ignoring “de la” at index or search time

2019-02-23 Thread Erick Erickson
Use stopwords, although it's becoming less of a concern. Why do you think you need to? On Sat, Feb 23, 2019, 08:42 baris.kazar wrote: > Hi,- > What is the (most efficient) way to > ignore “de la” kinda connectors > in a string at index or search time? > Thanks > >

Ignoring “de la” at index or search time

2019-02-23 Thread baris.kazar
Hi,- What is the (most efficient) way to ignore “de la” kinda connectors in a string at index or search time? Thanks