Re: Auto Completion search

Daniel Rosher Mon, 23 Jun 2008 09:18:54 -0700

Hello,

Basically I think you need to use NGramFilter. This will be a lot faster
than the searches you list, but will make your index much larger too.


In Solr this can be achieved with something like:

        <fieldType name="acomplete" class="solr.TextField">
                <!-- index time: whole value as one token, lowercased, stripped
                     to a-z0-9, then expanded into n-grams of length 1..20 -->
                <analyzer type="index">
                        <tokenizer class="solr.KeywordTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                        <filter class="solr.PatternReplaceFilterFactory"
                                pattern="([^a-z0-9])" replacement="" replace="all"/>
                        <filter class="solr.NGramFilterFactory"
                                minGramSize="1" maxGramSize="20"/>
                </analyzer>
                <!-- query time: same normalisation, but no n-gram expansion -->
                <analyzer type="query">
                        <tokenizer class="solr.KeywordTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                        <filter class="solr.PatternReplaceFilterFactory"
                                pattern="([^a-z0-9])" replacement="" replace="all"/>
                </analyzer>
        </fieldType>
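
To show what that analyzer chain does to a title and to a query, here is a
rough sketch in plain Python. It is not Solr or Lucene code; the function
names (normalize, ngrams, index_terms, matches) are made up for illustration,
and it only mimics the lowercase / strip / n-gram steps under the assumption
that the whole field value is treated as a single token.

```python
import re


def normalize(text):
    """Lowercase, then drop every character outside a-z0-9.

    Mimics the LowerCaseFilter + PatternReplaceFilter steps.
    """
    return re.sub(r"[^a-z0-9]", "", text.lower())


def ngrams(token, min_size=1, max_size=20):
    """Return all substrings of token whose length is in [min_size, max_size]."""
    grams = set()
    for size in range(min_size, min(max_size, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def index_terms(title):
    """Terms the index-time analyzer would store for one title (illustrative)."""
    return ngrams(normalize(title))


def matches(query, title):
    """A query matches when its normalized form equals one indexed n-gram."""
    return normalize(query) in index_terms(title)


if __name__ == "__main__":
    for title in ["If I Don't Sing I'll Cry", "Sing It", "Sing It Again", "I Sing"]:
        print(title, "->", matches("Sing I", title))
```

In this sketch the query becomes a single exact term lookup, which is where
the speed comes from. Two things to keep in mind: a query longer than 20
characters after normalisation has no matching n-gram, and because spaces are
stripped and all n-grams are indexed, text in the middle of a word matches as
well (for example "ng It" against "Sing It").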

Cheers,
Dan


On Mon, 2008-06-23 at 08:24 -0700, Lukas Öesterreicher wrote:
> Hello.
> 
> I am trying to implement a search over an index that contains Track Title,
> Album Name or Artist Name information. Given a search text, it should
> deliver a list of results suited for "auto completion" to make searching
> easier for the user. This search is very performance critical. The real
> search itself is done on a separate index and has already been implemented.
> 
> Results should match any Title or Name that contains the specified text,
> but only when the match begins at the start of a word. The PrefixQuery is
> well suited for this, I think.
> I created the index using a WhitespaceAnalyzer.
> 
> So for instance if I search for "Sing I"
> it should match the texts
> "If I Don't Sing I'll Cry", "Sing It" and "Sing It Again",
> but not "I Sing".
> 
> The solution I've come up with so far is to use a WhitespaceTokenizer,
> create a TermQuery for all but the last token and a PrefixQuery for the
> last token, and combine these with AND (Occur.MUST).
> So for instance "Sing I" would result in "item.name:Sing AND item.name:I*"
> (I think; I've never seen the string representation of the query).
> 
> When retrieving the results I check whether the original query text is
> contained within the returned item.name (ignoring case), and only return
> the item if it is.
> 
> I am not sure, however, whether this is a good solution performance-wise.
> Can you provide a better solution and/or comment on my current one?
> 
> The current one is quite quick if the search term contains no whitespace,
> but requests are many times slower if it does (usually normal spaces).
> 
> Thanks,
> Lukas