There is PhraseQuery, too, but let's consider two cases:

Case 1: PhraseQuery is not being used.
Should I then add the French stopwords to the standard filter’s stopwords at 
both index and search time? Or can I just add them at search time and keep the 
old index as it is? [gotta disable this editing feature]

Case 2: PhraseQuery is being used.
I guess I need to play with the “slop”, and stopwords will not help in this 
case, right?

Thanks


> On Feb 24, 2019, at 8:02 PM, baris.kazar <baris.ka...@oracle.com> wrote:
> 
> There is PhraseQuery, too, but let's consider two cases:
> 
> Case 1: PhraseQuery is not being used.
> Should I then add the French stopwords to the standard filter’s stopwords at 
> both index and search time? Or can I just add them at search time and keep 
> the old index as it is?
> 
> Case 2: PhraseQuery is being used.
> I guess I need to play with the “slop”, and stopwords will not help in this 
> case, right?
> 
> Thanks
> 
>> On Feb 24, 2019, at 2:25 PM, baris.kazar <baris.ka...@oracle.com> wrote:
>> 
>> That is not what I am looking for. Thanks.
>> 
>> The search string c b finds
>> a b 
>> but somehow cannot find 
>> a de la b
>> so I will try French stopwords.
>> Doing that, I am using 8 queries like the ones I mentioned.
>> Best
>> 
>>> On Feb 24, 2019, at 1:19 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>> Phrase search is looking for words next to each other. A phrase search on 
>>> the text “my dog has fleas” would succeed for “my dog” or “has fleas” but 
>>> not “my fleas” since the words are not right next to each other. “my 
>>> fleas”~3 would succeed because the “~3” indicates that the words can have 
>>> intervening terms.
>>> 
>>> Searching (dog AND fleas) would match no matter how many words were between 
>>> the two.
>>> 
>>> If you’re unclear about what phrase search vs. non-phrase search means, 
>>> some background research/self-education is strongly recommended; such a 
>>> basic understanding of search is pretty much assumed.
>>> 
>>> Best,
>>> Erick
>>> 
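
The phrase, slop, and AND behaviour described in the message above can be 
sketched in plain Java. This is a simplified model for illustration only, not 
Lucene's implementation: the class and method names are hypothetical, the 
tokenizer is assumed to split on whitespace, and only two-term phrases whose 
terms appear in query order are handled.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of phrase matching with slop; NOT Lucene's implementation. */
final class PhraseSlopSketch {

    /** Assumed whitespace tokenizer: token i sits at position i. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    /**
     * True if `first` is followed by `second` with at most `slop`
     * intervening positions. slop = 0 means the terms must be adjacent.
     */
    static boolean matchesPhrase(String text, String first, String second, int slop) {
        List<String> tokens = tokenize(text);
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            for (int j = i + 1; j < tokens.size(); j++) {
                int gap = j - i - 1;          // number of positions between the two terms
                if (gap > slop) {
                    break;                    // too far apart for this occurrence of `first`
                }
                if (tokens.get(j).equals(second)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Models (a AND b): every term must occur, positions are ignored. */
    static boolean matchesAll(String text, String... terms) {
        return tokenize(text).containsAll(Arrays.asList(terms));
    }
}
```

With the text "my dog has fleas", matchesPhrase(text, "my", "dog", 0) and 
matchesPhrase(text, "has", "fleas", 0) succeed, matchesPhrase(text, "my", 
"fleas", 0) fails, and matchesPhrase(text, "my", "fleas", 3) succeeds, while 
matchesAll(text, "dog", "fleas") succeeds regardless of distance.
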
>>>> On Feb 24, 2019, at 9:25 AM, baris.kazar <baris.ka...@oracle.com> wrote:
>>>> 
>>>> I guess so.
>>>> What is phrase search?
>>>> If c b is searched, do you expect a de la b to be found?
>>>> Thanks
>>>> 
>>>>> On Feb 24, 2019, at 10:49 AM, Erick Erickson <erickerick...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> Not sure we’re talking about the same thing. I was talking specifically 
>>>>> about _phrase_ searches. If all you want is the clause you just said, 
>>>>> phrases are not involved at all and the presence or absence of 
>>>>> intervening words is totally irrelevant. This assumes your field type 
>>>>> tokenizes the input similarly to the text_general field in the examples. 
>>>>> Specifically _not_ “string” fields or fields that use KeywordTokenizer. 
>>>>> 
>>>>> q=name:(a AND b) OR name:b
>>>>> 
>>>>> for instance. With a query like that it doesn’t matter in the least 
>>>>> whether or not there are any words between “a” and “b”.
>>>>> 
>>>>> All that may be obvious to you, but when I read your latest e-mail it 
>>>>> occurred to me that we might not be talking about the same thing.
>>>>> 
>>>>> Best,
>>>>> Erick
>>>>> 
>>>>>> On Feb 23, 2019, at 7:33 PM, baris.kazar <baris.ka...@oracle.com> wrote:
>>>>>> 
>>>>>> In this case the search string is c b,
>>>>>> and the search query has 8 combinations,
>>>>>> including two with c b ~, which means find everything containing c AND b 
>>>>>> and c OR b (two separate queries using ~).
>>>>>> With that I can find a b, but not a de la b without French stopwords.
>>>>>> Thanks
>>>>>> 
>>>>>>> On Feb 23, 2019, at 6:52 PM, Erick Erickson <erickerick...@gmail.com> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Lucene won’t ignore these unless you tell it to via stopwords.
>>>>>>> 
>>>>>>> This is a problem no matter how you look at it. If you do put in 
>>>>>>> stopwords, the word _positions_ are retained. In your example,
>>>>>>> word     position
>>>>>>> a        1
>>>>>>> de       2
>>>>>>> la       3
>>>>>>> b        4
>>>>>>> 
>>>>>>> If you remove “de” and “la” via stopwords, the positions are still:
>>>>>>> 
>>>>>>> word     position
>>>>>>> a        1
>>>>>>> b        4
>>>>>>> 
>>>>>>> So searching for “a b” would fail in the second case unless you 
>>>>>>> included “slop”, as in
>>>>>>> “a b”~2
>>>>>>> 
>>>>>>> But let’s say you _do not_ have input with these stopwords, just “a b”. 
>>>>>>> The positions will be 1 and 2 respectively. Here the user would expect 
>>>>>>> “a b” to match this doc, but not a doc with “a de la b” (unless they 
>>>>>>> knew a lot about search!).
>>>>>>> 
>>>>>>> So maybe the right thing to do is let phrases have slop as a matter of 
>>>>>>> course.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Erick
>>>>>>> 
>>>>>>> 
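
The position bookkeeping in the message above can be sketched in plain Java. 
This is a toy model, not Lucene code: the class, the two-word stopword set, and 
the method names are hypothetical, positions are 1-based as in the message, and 
each word is assumed to occur at most once per document.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of stopword removal that keeps original positions; NOT Lucene code. */
final class StopwordPositionSketch {

    /** Hypothetical stopword set covering only the thread's example. */
    static final Set<String> FRENCH_STOPWORDS = new HashSet<>(Arrays.asList("de", "la"));

    /** Maps each kept word to its 1-based position in the original text. */
    static Map<String, Integer> index(String text, Set<String> stopwords) {
        Map<String, Integer> positions = new LinkedHashMap<>();
        String[] words = text.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (!stopwords.contains(words[i])) {
                positions.put(words[i], i + 1);   // removed words still consume a position
            }
        }
        return positions;
    }

    /** True if `second` follows `first` with at most `slop` positions in between. */
    static boolean phraseMatches(Map<String, Integer> positions,
                                 String first, String second, int slop) {
        Integer p1 = positions.get(first);
        Integer p2 = positions.get(second);
        if (p1 == null || p2 == null) {
            return false;
        }
        int gap = p2 - p1 - 1;
        return gap >= 0 && gap <= slop;
    }
}
```

Indexing "a de la b" with the stopwords removed leaves a at position 1 and b at 
position 4, so phraseMatches(..., "a", "b", 0) fails and phraseMatches(..., 
"a", "b", 2) succeeds. Indexing plain "a b" gives positions 1 and 2, which 
match with no slop at all.
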
>>>>>>>> On Feb 23, 2019, at 11:07 AM, baris.kazar <baris.ka...@oracle.com> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Thanks, Erick. There is a pattern I can't catch in my results, such as:
>>>>>>>> a de la b
>>>>>>>> I do catch “a b”, though.
>>>>>>>> I thought Lucene might ignore those automatically while creating the index.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Feb 23, 2019, at 12:29 PM, Erick Erickson 
>>>>>>>>> <erickerick...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> Use stopwords, although that's becoming less of a concern. Why do you 
>>>>>>>>> think you need to?
>>>>>>>>> 
>>>>>>>>>> On Sat, Feb 23, 2019, 08:42 baris.kazar <baris.ka...@oracle.com> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> What is the (most efficient) way to
>>>>>>>>>> ignore connectors such as “de la”
>>>>>>>>>> in a string at index or search time?
>>>>>>>>>> Thanks
>>>>>>>>>> 
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>>>>>>>> 
