In this case the search string is "c b",
and the search query then has 8 combinations,
including two cases with "c b ~", which means find all documents containing
c AND b, and c OR b (two separate queries, each using ~).
With those I can find "a b", but not "a de la b" without French stopwords.
Thanks
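
For reference, here is a minimal sketch in plain Java of the position behaviour described in the quoted reply below. It is not Lucene code and does not use Lucene's API. The class name StopwordPositions and both helper methods are made up for illustration, and the phrase check is a simplified in-order, two-term version of what a sloppy phrase query does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: mimics how stopword removal keeps
// the original token positions, and why "a b" then needs slop.
public class StopwordPositions {

    // Split on whitespace, drop stopwords, and record the original
    // 1-based position of each surviving token (first occurrence only).
    public static Map<String, Integer> positions(String text, Set<String> stopwords) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int position = 0;
        for (String token : text.trim().toLowerCase().split("\\s+")) {
            position++; // the position advances even when the token is dropped
            if (!stopwords.contains(token)) {
                result.putIfAbsent(token, position);
            }
        }
        return result;
    }

    // Simplified assumption: the two terms must appear in order, and the
    // number of skipped positions between them must not exceed the slop.
    public static boolean matchesPhrase(Map<String, Integer> doc,
                                        String first, String second, int slop) {
        Integer p1 = doc.get(first);
        Integer p2 = doc.get(second);
        if (p1 == null || p2 == null || p2 <= p1) {
            return false;
        }
        return (p2 - p1 - 1) <= slop;
    }

    public static void main(String[] args) {
        Set<String> french = Set.of("de", "la");

        Map<String, Integer> withGaps = positions("a de la b", french);
        System.out.println(withGaps);                             // {a=1, b=4}
        System.out.println(matchesPhrase(withGaps, "a", "b", 0)); // false
        System.out.println(matchesPhrase(withGaps, "a", "b", 2)); // true

        Map<String, Integer> plain = positions("a b", french);
        System.out.println(matchesPhrase(plain, "a", "b", 0));    // true
    }
}
```

With "de" and "la" removed, "a" and "b" still sit at positions 1 and 4, so the exact phrase fails and a slop of 2 is the smallest value that matches in this sketch.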

> On Feb 23, 2019, at 6:52 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> 
> Lucene won’t ignore these unless you tell it to via stopwords.
> 
> This is a problem no matter how you look at it. If you do put in stopwords, 
> the word _positions_ are retained. In your example,
> word     position
> a           1
> de         2
> la         3
> b           4
> 
> If you remove “de” and “la” via stopwords, the positions are still:
> 
> word     position
> a           1
> b           4
> 
> So searching for “a b” would fail in the second case unless you included “slop”, as in
> “a b”~2
> 
> But let’s say you _do not_ have input with these stopwords, just “a b”. The positions
> will be 1 and 2 respectively. Here the user would expect “a b” to match this doc, but
> not a doc with “a de la b” (unless they knew a lot about search!).
> 
> So maybe the right thing to do is let phrases have slop as a matter of course.
> 
> Best,
> Erick
> 
> 
>> On Feb 23, 2019, at 11:07 AM, baris.kazar <baris.ka...@oracle.com> wrote:
>> 
>> Thanks Erick. There is a pattern I can't catch in my results, such as:
>> a de la b
>> I do catch “a b”, though.
>> I thought Lucene might ignore those automatically while creating the index.
>> 
>> 
>>> On Feb 23, 2019, at 12:29 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>> Use stopwords, although it's becoming less of a concern. Why do you think
>>> you need to?
>>> 
>>>> On Sat, Feb 23, 2019, 08:42 baris.kazar <baris.ka...@oracle.com> wrote:
>>>> 
>>>> Hi,-
>>>> What is the (most efficient) way to
>>>> ignore “de la” kinda connectors
>>>> in a string at index or search time?
>>>> Thanks
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org

