Sorry for the late reply.

Here is what I was talking about:

big bang => b, bi, big (first word only, skipping the other words)
big bang => b, ba, ban, bang (skipping the first word)
big bang => a, an, ang (skipping the first word and the first letter of
each subsequent word)
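
The three cases above can be sketched in plain Python as follows. This is only an illustration of the desired output, not Solr code: the function names are made up for this example, words are assumed to be whitespace-delimited, and the minimum gram size is assumed to be 1.

```python
def edge_ngrams(token, min_gram=1, max_gram=None):
    """Return front-edge n-grams of a token, shortest first."""
    if max_gram is None:
        max_gram = len(token)
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def grams_first_word_only(text):
    """Grams for the first word, ignoring every other word."""
    words = text.split()
    return edge_ngrams(words[0]) if words else []


def grams_skip_first_word(text):
    """Grams for every word except the first."""
    grams = []
    for word in text.split()[1:]:
        grams.extend(edge_ngrams(word))
    return grams


def grams_skip_first_word_and_letter(text, skip_chars=1):
    """Grams for every word except the first, after dropping leading characters."""
    grams = []
    for word in text.split()[1:]:
        grams.extend(edge_ngrams(word[skip_chars:]))
    return grams


if __name__ == "__main__":
    print(grams_first_word_only("big bang"))
    print(grams_skip_first_word("big bang"))
    print(grams_skip_first_word_and_letter("big bang"))
```

Each function would feed a separate field, so that a match in one field can be boosted differently from a match in another.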

The goal is to apply custom boosting to matches based on where the search
term matches (start, middle, end, etc.).

Maybe we should apply a regex before EdgeNGramFilterFactory? But I was
thinking of having EdgeNGramFilterFactory take a parameter to skip "n"
characters from the start or end before generating the grams.
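
A rough Python sketch of the two options, to make the comparison concrete. Both are assumptions for illustration: `skip_start` and `skip_end` are hypothetical parameter names for the proposed behaviour and do not exist in EdgeNGramFilterFactory, and `drop_first_word` stands in for whatever regex step would run before the grams are generated.

```python
import re


def edge_ngrams_with_skip(token, min_gram=1, max_gram=None,
                          skip_start=0, skip_end=0):
    """Front-edge n-grams after trimming characters from either end.

    skip_start / skip_end are hypothetical parameters modelling the
    proposal; they are not part of any existing Solr filter.
    """
    end = len(token) - skip_end
    trimmed = token[skip_start:end] if end > skip_start else ""
    if max_gram is None:
        max_gram = len(trimmed)
    upper = min(max_gram, len(trimmed))
    return [trimmed[:n] for n in range(min_gram, upper + 1)]


def drop_first_word(text):
    """Regex alternative: strip the first whitespace-delimited word."""
    return re.sub(r"^\s*\S+\s*", "", text, count=1)


if __name__ == "__main__":
    # Option 1: regex first, then ordinary edge n-grams per remaining word.
    for word in drop_first_word("big bang theory").split():
        print(word, edge_ngrams_with_skip(word))

    # Option 2: the filter itself skips the first character.
    print(edge_ngrams_with_skip("bang", skip_start=1))
```

The regex route keeps the n-gram filter unchanged but needs one pattern per field, while a skip parameter would keep the whole rule inside the filter configuration.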

Your thoughts?




Thanks,
Kranti K. Parisa
http://www.linkedin.com/in/krantiparisa



On Wed, Nov 13, 2013 at 6:38 AM, Furkan KAMACI <[email protected]> wrote:

> EdgeNGramFilterFactory creates n-grams from the beginning edge of an input
> token by default. You can change its side, and you can define the minimum
> and maximum gram size. Here is what EdgeNGramFilterFactory *can* do
> with a minimum gram size of 2 and a maximum gram size of 4:
>
> apache => ap, apa, apac
>
> As for your situation: what counts as a word for you? Strings that are
> delimited by whitespace, underscores, etc.?
>
>
>
>
>
> 2013/11/12 Kranti Parisa <[email protected]>
>
>> Can EdgeNGramFilterFactory handle the cases where we need to
>> skip/consider the "n" words from the start or end?
>>
>> For example:
>>
>> Title: big bang theory
>>
>> field1: populate full ngrams
>> field2: populate ngrams for "bang theory" = skipping the first word "big"
>> field3: populate ngrams for "big" = considering only the first word "big"
>> field4: populate ngrams for "theory" = considering only the last word
>> "theory"
>>
>> At query time, I would like to apply field-level boosting to rank the
>> results.
>>
>>
>>
>> Thanks,
>> Kranti K. Parisa
>> http://www.linkedin.com/in/krantiparisa
>>
>>
>>
>> On Sun, Nov 10, 2013 at 5:51 PM, Furkan KAMACI <[email protected]> wrote:
>>
>>> Hi;
>>>
>>> There were two issues about adding a preserveOriginal capability to
>>> EdgeNGramFilterFactory, and I've made a patch for it. You can check and
>>> test it here: https://issues.apache.org/jira/browse/SOLR-5152
>>> This is the related issue that can be marked as a duplicate:
>>> https://issues.apache.org/jira/browse/SOLR-5332
>>>
>>> Thanks;
>>> Furkan KAMACI
>>>
>>
>>
>
