That is not what I am looking for, but thanks. The search string “c b” finds “a b” but somehow cannot find “a de la b”, so I will try French stopwords. In doing so, I am using 8 queries like the ones I mentioned.
Best
> On Feb 24, 2019, at 1:19 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
> Phrase search is looking for words next to each other. A phrase search on the
> text “my dog has fleas” would succeed for “my dog” or “has fleas” but not “my
> fleas” since the words are not right next to each other. “my fleas”~3 would
> succeed because the “~3” indicates that the words can have intervening terms.
>
> Searching (dog AND fleas) would match no matter how many words were between
> the two.
>
> If you’re unclear about what phrase search vs. non-phrase search means, some
> background research/self-education is strongly recommended, since such a basic
> understanding of search is pretty much assumed.
>
> Best,
> Erick
>
>> On Feb 24, 2019, at 9:25 AM, baris.kazar <baris.ka...@oracle.com> wrote:
>>
>> I guess so.
>> What is phrase search?
>> If “c b” is searched, do you expect “a de la b”?
>> Thanks
>>
>>> On Feb 24, 2019, at 10:49 AM, Erick Erickson <erickerick...@gmail.com>
>>> wrote:
>>>
>>> Not sure we’re talking about the same thing. I was talking specifically
>>> about _phrase_ searches. If all you want is the clause you just said,
>>> phrases are not involved at all and the presence or absence of intervening
>>> words is totally irrelevant. This assumes your field type tokenizes the
>>> input similar to the text_general field in the examples. Specifically _not_
>>> “string” fields or fields that use KeywordTokenizer.
>>>
>>> q=name:(a AND b) OR name:b
>>>
>>> for instance. With a query like that it doesn’t matter in the least whether
>>> there are, or are not, any words between “a” and “b”.
>>>
>>> All that may be obvious to you, but when I read your latest e-mail it
>>> occurred to me that we might not be talking about the same thing.
>>>
>>> Best,
>>> Erick
>>>
>>>> On Feb 23, 2019, at 7:33 PM, baris.kazar <baris.ka...@oracle.com> wrote:
>>>>
>>>> In this case the search string is “c b”,
>>>> and then the search query has 8 combos,
>>>> including two cases with “c b”~ which means find all containing c AND b and
>>>> c OR b (two separate queries having ~),
>>>> and then I can find “a b” but not “a de la b” without French stopwords.
>>>> Thanks
>>>>
>>>>> On Feb 23, 2019, at 6:52 PM, Erick Erickson <erickerick...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Lucene won’t ignore these unless you tell it to via stopwords.
>>>>>
>>>>> This is a problem no matter how you look at it. If you do put in
>>>>> stopwords, the word _positions_ are retained. In your example,
>>>>>
>>>>> word  position
>>>>> a     1
>>>>> de    2
>>>>> la    3
>>>>> b     4
>>>>>
>>>>> If you remove “de” and “la” via stopwords, the positions are still:
>>>>>
>>>>> word  position
>>>>> a     1
>>>>> b     4
>>>>>
>>>>> So searching for “a b” would fail in the second case unless you included
>>>>> “slop” as
>>>>> “a b”~2
>>>>>
>>>>> But let’s say you _do not_ have input with these stopwords, just “a b”.
>>>>> The positions will be 1 and 2 respectively. Here the user would expect
>>>>> “a b” to match this doc, but not a doc with “a de la b” (unless they
>>>>> knew a lot about search!).
>>>>>
>>>>> So maybe the right thing to do is let phrases have slop as a matter of
>>>>> course.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>>
>>>>>> On Feb 23, 2019, at 11:07 AM, baris.kazar <baris.ka...@oracle.com> wrote:
>>>>>>
>>>>>> Thanks Erick, there is a pattern I can’t catch in my results, such as:
>>>>>> a de la b
>>>>>> I catch “a b” though.
>>>>>> I thought Lucene might ignore those automatically while creating the index.
>>>>>>
>>>>>>
>>>>>>> On Feb 23, 2019, at 12:29 PM, Erick Erickson <erickerick...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Use stopwords, although it's becoming less of a concern. Why do you
>>>>>>> think you need to?
>>>>>>>
>>>>>>>> On Sat, Feb 23, 2019, 08:42 baris.kazar <baris.ka...@oracle.com> wrote:
>>>>>>>>
>>>>>>>> Hi,-
>>>>>>>> What is the (most efficient) way to
>>>>>>>> ignore “de la” kinda connectors
>>>>>>>> in a string at index or search time?
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
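
The position arithmetic discussed in this thread can be sketched with a small toy model. This is plain Python, not Lucene code, and every name in it (analyze, phrase_matches, all_terms_match, FRENCH_STOPWORDS) is hypothetical and chosen only for illustration. It assumes whitespace tokenization, that removed stopwords leave a gap in the positions (as described in the thread), and a simplified slop rule: shift each term's position by its offset in the phrase and compare the spread of the shifted positions with the slop. Real analyzers and scorers do considerably more than this.

```python
from itertools import product

# Illustrative subset only; a real French stopword list is much longer.
FRENCH_STOPWORDS = frozenset({"de", "la"})


def analyze(text, stopwords=frozenset()):
    """Split on whitespace and lowercase. Stopwords are dropped, but the
    surviving terms keep the positions they had in the original text."""
    positions = {}
    for pos, token in enumerate(text.lower().split(), start=1):
        if token not in stopwords:
            positions.setdefault(token, []).append(pos)
    return positions


def phrase_matches(positions, phrase, slop=0):
    """Toy sloppy-phrase check: with slop=0 the terms must be adjacent
    and in order; a larger slop tolerates gaps between them."""
    terms = phrase.lower().split()
    if not terms or any(t not in positions for t in terms):
        return False
    for combo in product(*(positions[t] for t in terms)):
        # Shift the i-th term back by i; an exact phrase collapses to one value.
        shifted = [p - i for i, p in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


def all_terms_match(positions, terms):
    """Boolean AND: every term is present, positions are ignored."""
    return all(t.lower() in positions for t in terms)


if __name__ == "__main__":
    doc = analyze("a de la b", FRENCH_STOPWORDS)
    print(doc)                                # {'a': [1], 'b': [4]}
    print(phrase_matches(doc, "a b"))         # False: gap left by "de la"
    print(phrase_matches(doc, "a b", 2))      # True: like "a b"~2
    print(all_terms_match(doc, ["a", "b"]))   # True: like (a AND b)
```

In this model, removing “de” and “la” does not by itself make the exact phrase “a b” match, because the gap in positions remains. Either slop on the phrase or a non-phrase query such as (a AND b) is needed, which is the distinction drawn in the replies above.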