Kuntal Ganguly created SOLR-6032:
------------------------------------
Summary: NGramFilter doesn't keep tokens shorter than minGramSize or
longer than maxGramSize
Key: SOLR-6032
URL: https://issues.apache.org/jira/browse/SOLR-6032
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 4.6.1, 4.2.1
Environment: Ubuntu 12.04, 4 GB RAM, quad-core processor
Reporter: Kuntal Ganguly
I have a requirement for both partial and exact matching. Partial search works
fine with NGramFilter when the value's length is within the minGramSize to
maxGramSize range. However, when I try to index a value shorter than
minGramSize, no tokens are generated. The same problem occurs when the value
is longer than maxGramSize.
I have created a field type as shown below:
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3"
            maxGramSize="6" preserveOriginal="true"/>
  </analyzer>
</fieldType>
When I try to index a value such as AB, it is not indexed and is not
searchable. Similarly, if the value is GangulyKuntal (which is longer than
maxGramSize), the search does not work.
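
To illustrate the two cases above, here is a minimal Python sketch of n-gram generation. It is a simplified model and not the Lucene implementation; the function name `ngrams` is hypothetical, and the inputs are assumed to be already lowercased, as the LowerCaseFilterFactory would do.

```python
def ngrams(token, min_gram, max_gram):
    """Return every substring of token whose length is in [min_gram, max_gram].

    Simplified model for illustration only; not Lucene's NGramTokenFilter code.
    """
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break  # longer grams from this start would run past the token
            grams.append(token[start:end])
    return grams


# Value shorter than minGramSize: no grams at all.
short_grams = ngrams("ab", 3, 6)

# Value longer than maxGramSize: grams exist, but none equals the full value.
long_grams = ngrams("gangulykuntal", 3, 6)

print(short_grams)
print("gangulykuntal" in long_grams)
```

Because the query analyzer uses KeywordTokenizerFactory, the whole query string is looked up as a single term, so a query for the full value finds nothing when that exact term was never emitted at index time.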
Note: increasing maxGramSize beyond the longest anticipated value is not a
good design.
NGramFilter should keep the original token if it is shorter than minGramSize
or longer than maxGramSize. This would make it a true partial as well as exact
search solution. It would be very helpful if this change were made in an
upcoming release. Any suggestions would be of great help.
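
The requested behavior could be sketched roughly as follows in Python. This is only an illustration of the proposal, not an existing Solr or Lucene API; the function name `ngrams_keep_original` is hypothetical.

```python
def ngrams_keep_original(token, min_gram, max_gram):
    """Emit the usual n-grams, plus the original token when its length
    falls outside [min_gram, max_gram].

    Hypothetical sketch of the behavior requested in this issue.
    """
    grams = [
        token[start:start + size]
        for start in range(len(token))
        for size in range(min_gram, max_gram + 1)
        if start + size <= len(token)
    ]
    # The original is only added when no gram can equal it already,
    # which avoids emitting a duplicate term.
    if len(token) < min_gram or len(token) > max_gram:
        grams.append(token)
    return grams


print(ngrams_keep_original("ab", 3, 6))
print("gangulykuntal" in ngrams_keep_original("gangulykuntal", 3, 6))
```

With this behavior, an exact query for AB or GangulyKuntal (lowercased by the analyzer) would match the preserved original token, while partial queries would still match the regular grams.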