[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander S. updated SOLR-5332:
-------------------------------

    Description: 
Hi, as described here: 
http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html
the problem is this: if you index these 2 strings:
1. facebook.com/someuser.1
2. facebook.com/someveryandverylongusername
using the edge ngram filter factory with min and max gram sizes of 2 and 25, 
search requests for these URLs will fail.

But search requests for:
1. facebook.com/someuser
2. facebook.com/someveryandverylonguserna
will work properly.

This is because the first URL has "1" at the end, which is shorter than the 
allowed min gram size. In the second URL the user name (27 characters) is 
longer than the max gram size.
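
To make the failure concrete, here is a minimal Python sketch of leading-edge n-gram generation. It is not Lucene's actual implementation; the function name `edge_ngrams` is hypothetical, and it assumes the tokenizer has already split the URL into tokens such as "1" and "someveryandverylongusername".

```python
def edge_ngrams(token, min_gram, max_gram):
    """Return the prefixes of `token` whose length lies in [min_gram, max_gram].

    Illustrative only: a simplified model of an edge n-gram filter,
    not the Lucene/Solr code.
    """
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


# A one-character token is shorter than min_gram, so no grams are produced.
short_grams = edge_ngrams("1", 2, 25)

# A 27-character token is cut off at max_gram, so the full token is never produced.
long_grams = edge_ngrams("someveryandverylongusername", 2, 25)

print(short_grams)     # []
print(long_grams[-1])  # someveryandverylonguserna
```

Under these assumptions, neither "1" nor the full 27-character user name ends up among the generated grams, which is consistent with the failing searches described above.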

It would be good to have a "preserve original" option that adds the original 
token to the index when it does not fit the allowed gram sizes, so that the "1" 
and "someveryandverylongusername" tokens are also added to the index.
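
The requested behavior could look roughly like the sketch below. This is only an illustration of the idea under stated assumptions: the `preserve_original` parameter and the `edge_ngrams` function are hypothetical names, not an existing Solr setting or API, and the input is assumed to be already tokenized.

```python
def edge_ngrams(token, min_gram, max_gram, preserve_original=False):
    """Return leading-edge n-grams of `token`, optionally keeping the original.

    Illustrative only. When `preserve_original` is set and the token length
    falls outside [min_gram, max_gram], the unmodified token is appended.
    """
    longest = min(len(token), max_gram)
    grams = [token[:n] for n in range(min_gram, longest + 1)]
    fits = min_gram <= len(token) <= max_gram
    if preserve_original and not fits:
        grams.append(token)
    return grams


# Tokens outside the allowed sizes are now kept as-is.
print(edge_ngrams("1", 2, 25, preserve_original=True))  # ['1']
kept = edge_ngrams("someveryandverylongusername", 2, 25, preserve_original=True)
print(kept[-1])  # someveryandverylongusername
```

Tokens whose length already fits the configured range are unaffected in this sketch, because their full form is already emitted as the longest gram.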

Best,
Alex

> Add "preserve original" setting to the EdgeNGramFilterFactory
> -------------------------------------------------------------
>
>                 Key: SOLR-5332
>                 URL: https://issues.apache.org/jira/browse/SOLR-5332
>             Project: Solr
>          Issue Type: Wish
>            Reporter: Alexander S.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
