I guess so.
What is a phrase search?
If “c b” is searched, do you expect “a de la b” to match?
Thanks

> On Feb 24, 2019, at 10:49 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> 
> Not sure we’re talking about the same thing. I was talking specifically about 
> _phrase_ searches. If all you want is the clause you just described, phrases are 
> not involved at all and the presence or absence of intervening words is 
> totally irrelevant. This assumes your field type tokenizes the input similarly 
> to the text_general field in the examples, specifically _not_ “string” fields 
> or fields that use KeywordTokenizer. 
> 
> q=name:(a AND b) OR name:b
> 
> for instance. With a query like that it doesn’t matter in the least whether 
> or not there are any words between “a” and “b”.
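
To make the point above concrete, here is a minimal Python sketch. It is not Solr's or Lucene's actual code: it assumes plain whitespace tokenization as a stand-in for a text_general-style analyzer, and the helper names (tokenize, matches_all, matches_any) are made up for illustration. It only shows that AND/OR clauses ask whether terms are present, so intervening words play no role.

```python
def tokenize(text):
    """Split on whitespace and lowercase; a rough stand-in for a tokenized field."""
    return set(text.lower().split())


def matches_all(text, *terms):
    """AND semantics: every term must occur somewhere in the text."""
    return set(terms) <= tokenize(text)


def matches_any(text, *terms):
    """OR semantics: at least one term must occur somewhere in the text."""
    return bool(set(terms) & tokenize(text))


# Word order and intervening words are never consulted.
print(matches_all("a de la b", "a", "b"))  # True
print(matches_all("a b", "a", "b"))        # True
print(matches_all("de la b", "a", "b"))    # False
print(matches_any("de la b", "a", "b"))    # True
```

A field indexed as a single token (a “string” field or KeywordTokenizer) would behave differently, because the whole value would be one term.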
> 
> All that may be obvious to you, but when I read your latest e-mail it 
> occurred to me that we might not be talking about the same thing.
> 
> Best,
> Erick
> 
>> On Feb 23, 2019, at 7:33 PM, baris.kazar <baris.ka...@oracle.com> wrote:
>> 
>> In this case the search string is “c b”,
>> and the search query then has 8 combinations,
>> including two cases with “c b”~, which means find everything containing c AND b, 
>> and c OR b (two separate queries, each using ~).
>> With that I can find “a b” but not “a de la b” without French stopwords.
>> Thanks
>> 
>>> On Feb 23, 2019, at 6:52 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>> Lucene won’t ignore these unless you tell it to via stopwords.
>>> 
>>> This is a problem no matter how you look at it. If you do put in stopwords, 
>>> the word _positions_ are retained. In your example,
>>> word     position
>>> a        1
>>> de       2
>>> la       3
>>> b        4
>>> 
>>> If you remove “de” and “la” via stopwords, the positions are still:
>>> 
>>> word     position
>>> a        1
>>> b        4
>>> 
>>> So searching for “a b” would fail in the second case unless you included 
>>> “slop”, as in “a b”~2
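
The position bookkeeping described above can be modeled in a few lines of Python. This is a toy model under stated assumptions, not Lucene's implementation: positions start at 1 as in the tables, tokens are split on whitespace, and slop is simplified to the number of skipped positions between two in-order terms. The function names (analyze, phrase_matches) are hypothetical.

```python
def analyze(text, stopwords=frozenset()):
    """Return (token, position) pairs; a removed stopword still uses up its position."""
    pairs = []
    for position, token in enumerate(text.lower().split(), start=1):
        if token not in stopwords:
            pairs.append((token, position))
    return pairs


def phrase_matches(pairs, first, second, slop=0):
    """True if `second` follows `first` with at most `slop` positions skipped in between."""
    first_positions = [p for t, p in pairs if t == first]
    second_positions = [p for t, p in pairs if t == second]
    return any(
        0 <= s - f - 1 <= slop
        for f in first_positions
        for s in second_positions
    )


doc = analyze("a de la b", stopwords={"de", "la"})
print(doc)                                    # [('a', 1), ('b', 4)]
print(phrase_matches(doc, "a", "b"))          # False: the gap left by "de la" remains
print(phrase_matches(doc, "a", "b", slop=2))  # True: two skipped positions allowed
```

The gap between positions 1 and 4 is why a slop of 2 is the smallest value that matches in this model.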
>>> 
>>> But let’s say you _do not_ have input with these stopwords, just “a b”. The 
>>> positions will be 1 and 2 respectively. Here the user would expect “a b” to 
>>> match this doc, but not a doc with “a de la b” (unless they knew a lot about 
>>> search!).
>>> 
>>> So maybe the right thing to do is let phrases have slop as a matter of course.
>>> 
>>> Best,
>>> Erick
>>> 
>>> 
>>>> On Feb 23, 2019, at 11:07 AM, baris.kazar <baris.ka...@oracle.com> wrote:
>>>> 
>>>> Thanks Erick. There is a pattern I can’t catch in my results, such as:
>>>> a de la b
>>>> I do catch “a b” though.
>>>> I thought Lucene might ignore those words automatically while creating the index.
>>>> 
>>>> 
>>>>> On Feb 23, 2019, at 12:29 PM, Erick Erickson <erickerick...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> Use stopwords, although that's becoming less of a concern. Why do you think
>>>>> you need to?
>>>>> 
>>>>>> On Sat, Feb 23, 2019, 08:42 baris.kazar <baris.ka...@oracle.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> What is the (most efficient) way to
>>>>>> ignore connectors like “de la”
>>>>>> in a string at index or search time?
>>>>>> Thanks
>>>>>> 
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org