Re: search with wildcard

Jack Krupansky Thu, 21 Nov 2013 10:22:11 -0800

You might be able to make use of the dictionary compound word filter, but you will have to build up a dictionary of words to use:


http://lucene.apache.org/core/4_5_1/analyzers-common/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilterFactory.html


My e-book has some examples and a better description.
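
For readers following along, here is a minimal sketch of how that filter might be wired into a field type. The field type name and the dictionary file name are hypothetical, and the size limits are only illustrative. The dictionary is assumed to be a plain word list, one lowercase word per line, containing entries such as "super", "test" and "plan".

```xml
<!-- Sketch only: "text_de_decompound" and "lang/compound_words_de.txt" are
     hypothetical names, not taken from the original schema. -->
<fieldType name="text_de_decompound" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase first so tokens can match lowercase dictionary entries -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Keeps the original token and adds any dictionary words found inside it -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="lang/compound_words_de.txt"
            minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
            onlyLongestMatch="false"/>
  </analyzer>
</fieldType>
```

With such a setup, a plain query for "test" should be able to match a document containing "Supertestplan", provided "test" is in the dictionary. Words missing from the dictionary are not split, which is why the dictionary has to be built up by hand.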

-- Jack Krupansky

-----Original Message-----
From: Ahmet Arslan
Sent: Thursday, November 21, 2013 11:40 AM
To: solr-user@lucene.apache.org
Subject: Re: search with wildcard

Hi Andreas,

If you don't want to use wildcards at query time, an alternative is to use NGrams at indexing time. This will produce a lot of tokens. For example, the 4-grams of your example: Supertestplan => supe uper pert erte rtes *test* estp stpl tpla plan



Is that what you want? By the way, why do you want to search inside of words?

<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="4"/>
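
A minimal sketch of a field type built around this filter, assuming the grams are produced only at index time and the query is left as whole lowercased terms. The field type name "text_ngram" is hypothetical.

```xml
<!-- Sketch only: "text_ngram" is a hypothetical field type name. -->
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: break every token into 3- and 4-character grams -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="4"/>
  </analyzer>
  <!-- Query time: no grams, so "test" is looked up as the single term "test" -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One consequence of this design: a query term longer than maxGramSize (for example "testplan") has no matching gram in the index, so maxGramSize would need to be raised to cover the longest substring you expect to search for, at the cost of a larger index.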




On Thursday, November 21, 2013 5:23 PM, Andreas Owen <a...@conx.ch> wrote:

I suppose I have to create another field with different tokenizers and set
the boost very low so it doesn't really mess with my ranking, because
the word is now in 2 fields. What kind of tokenizer can do the job?
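
The two-field idea could be sketched roughly as follows. All field names, the "text_ngram" field type name, and the boost values are assumptions for illustration; the first fragment would go in schema.xml and the second in solrconfig.xml.

```xml
<!-- schema.xml (sketch): "text", "text_substring" and the type "text_ngram"
     are hypothetical names; "text_ngram" stands for whatever substring-friendly
     field type ends up being used. -->
<field name="text" type="text_de" indexed="true" stored="true"/>
<field name="text_substring" type="text_ngram" indexed="true" stored="false"/>
<copyField source="text" dest="text_substring"/>

<!-- solrconfig.xml (sketch): search both fields, weighting the normal field
     far above the substring field. The boost values are illustrative. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">text^10 text_substring^0.1</str>
  </lst>
</requestHandler>
```

With a large gap between the boosts, whole-word matches in the main field should dominate the score, while the substring field mostly serves to pull in documents that would otherwise be missed.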



From: Andreas Owen [mailto:a...@conx.ch]
Sent: Thursday, 21 November 2013 16:13
To: solr-user@lucene.apache.org
Subject: search with wildcard



I am querying "test" in Solr 4.3.1 over the field below and it's not finding
all occurrences. It seems that if it is a substring of a word like
"Supertestplan" it isn't found unless I use wildcards: "*test*". This is
right because of my tokenizer, but does someone know a way around this? I
don't want to add wildcards because that messes up queries with multiple
words.



<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- remove common words -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_de.txt" format="snowball"
            enablePositionIncrements="true"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <!-- remove noun/adjective inflections like plural endings -->
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
  </analyzer>
</fieldType>
