Re: Adding preserveOriginal Capability to EdgeNGramFilterFactory

Furkan KAMACI Wed, 13 Nov 2013 03:39:05 -0800

EdgeNGramFilterFactory creates n-grams from the beginning edge of an input
token by default. You can change which side it starts from, and you can
define the minimum and maximum gram size. Here is what
EdgeNGramFilterFactory does with a minimum gram size of 2 and a maximum
gram size of 4:


apache => ap, apa, apac
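
To make the behaviour concrete, here is a minimal sketch in plain Python. It
is not Lucene/Solr code; the function name edge_ngrams and its parameters are
only illustrative. It assumes grams are taken from one edge of a single token,
and that a maximum gram size of 4 means no gram is longer than four
characters.

```python
def edge_ngrams(token, min_gram=2, max_gram=4, side="front"):
    """Return the edge n-grams of `token`, shortest first.

    Illustrative only: the name and signature are hypothetical and
    do not mirror the Lucene/Solr API.
    """
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("need 1 <= min_gram <= max_gram")
    if side not in ("front", "back"):
        raise ValueError("side must be 'front' or 'back'")

    grams = []
    # A gram cannot be longer than the token itself.
    longest = min(max_gram, len(token))
    for size in range(min_gram, longest + 1):
        if side == "front":
            grams.append(token[:size])   # grow from the beginning edge
        else:
            grams.append(token[-size:])  # grow from the ending edge
    return grams


print(edge_ngrams("apache"))               # ['ap', 'apa', 'apac']
print(edge_ngrams("apache", side="back"))  # ['he', 'che', 'ache']
```

A token shorter than the minimum gram size yields no grams at all in this
sketch.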

As for your situation: what counts as a word for you? Strings that are
delimited by whitespace, underscores, etc.?
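
If a word simply means a whitespace-delimited string, the four fields from
the quoted message below could be sketched roughly like this in plain Python.
This only illustrates which text each field would receive before any
n-gramming; it is not Solr configuration, and field_inputs is a hypothetical
helper name.

```python
def field_inputs(title):
    """Split a title into the text each example field would index.

    Assumption: words are separated by whitespace only.
    """
    words = title.split()
    return {
        "field1": " ".join(words),       # full title
        "field2": " ".join(words[1:]),   # skip the first word
        "field3": " ".join(words[:1]),   # only the first word
        "field4": " ".join(words[-1:]),  # only the last word
    }


for name, text in field_inputs("big bang theory").items():
    print(name, "=>", repr(text))
# field1 => 'big bang theory'
# field2 => 'bang theory'
# field3 => 'big'
# field4 => 'theory'
```

If underscores or other characters also separate words, the split step is the
part that would have to change.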





2013/11/12 Kranti Parisa <[email protected]>

> Can EdgeNGramFilterFactory handle the cases where we need to skip/consider
> the "n" words from the start or end?
>
> For example:
>
> Title: big bang theory
>
> field1: populate full ngrams
> field2: populate ngrams for "bang theory" = skipping the first word "big"
> field3: populate ngrams for "big" = considering only the first word "big"
> field4: populate ngrams for "theory" = considering only the last word
> "theory"
>
> and at query time, I would like to apply field level boosting to rank the
> results.
>
>
>
> Thanks,
> Kranti K. Parisa
> http://www.linkedin.com/in/krantiparisa
>
>
>
> On Sun, Nov 10, 2013 at 5:51 PM, Furkan KAMACI <[email protected]>wrote:
>
>> Hi;
>>
>> There were two issues about adding preserveOriginal capability to
>> EdgeNGramFilterFactory and I've made a patch about it. You can check and
>> test it from here: https://issues.apache.org/jira/browse/SOLR-5152 This
>> is the related issue that can be marked as duplicated:
>> https://issues.apache.org/jira/browse/SOLR-5332
>>
>> Thanks;
>> Furkan KAMACI
>>
>
>
