RE: Autocomplete: match words anywhere in the token

Jonathan Rochkind Fri, 24 Sep 2010 15:30:38 -0700

I'm pretty sure under the algorithm that Chantal describes, if you use a 
multi-valued field for matching, you're going to get results in your 
auto-suggest that are in the same document with things that matched your entry, 
but don't actually match your entry themselves.  Chantal seemed to confirm that 
when I asked. 
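
To make the concern concrete, here is a minimal sketch in plain Python (no
Solr involved) of what seems to happen when the suggestions are read as
facets from a multi-valued string field. The documents, the field name
"suggest", and the helper functions are all hypothetical and only simulate
the behaviour discussed in this thread. The last step is one possible form
of the "extra work" Chantal mentions: keeping only the values that contain
a token starting with the prefix.

```python
from collections import Counter

# Hypothetical documents; "suggest" stands in for a multi-valued string field.
docs = [
    {"id": 1, "suggest": ["canon pixma mp500", "inkjet printer"]},
    {"id": 2, "suggest": ["canon powershot sd500", "digital camera"]},
    {"id": 3, "suggest": ["belkin power cord"]},
]


def tokens(value):
    """Very rough stand-in for analysis: lowercase and split on whitespace."""
    return value.lower().split()


def matching_docs(documents, prefix):
    """Documents in which any token of any value starts with the prefix."""
    return [
        d for d in documents
        if any(t.startswith(prefix) for v in d["suggest"] for t in tokens(v))
    ]


def facet_values(documents):
    """Count every value of the field across the given documents,
    the way faceting on the string field would."""
    return Counter(v for d in documents for v in d["suggest"])


def filter_suggestions(counts, prefix):
    """Keep only the values that themselves contain a matching token."""
    return {
        v: n for v, n in counts.items()
        if any(t.startswith(prefix) for t in tokens(v))
    }


hits = matching_docs(docs, "pix")
raw = facet_values(hits)                 # also contains "inkjet printer"
clean = filter_suggestions(raw, "pix")   # only "canon pixma mp500" remains
```

In this simulation "inkjet printer" shows up in the raw facet values only
because it shares a document with "canon pixma mp500", which is the effect
described above.
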
________________________________________
From: Peter Karich [peat...@yahoo.de]
Sent: Friday, September 24, 2010 8:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Autocomplete: match words anywhere in the token


Jonathan,

The field that Chantal describes here:

> 2.) create an additional field that uses the
> String type with the same content (use copy field to fill either)

can be multivalued. Or what did you mean?

BTW: The nice thing about facet.prefix is that you can add an arbitrary
(filter) query...

Regards,
Peter.
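
As an illustration of combining facet.prefix with a filter query, the
following sketch only builds the request parameters. It assumes a string
field called "name_str" and a filter on "inStock"; both names, and the
helper suggest_params, are hypothetical and not taken from this thread.
The parameters themselves (facet, facet.field, facet.prefix, facet.limit,
fq, rows) are standard Solr request parameters.

```python
from urllib.parse import parse_qs, urlencode


def suggest_params(prefix, field="name_str", filter_query=None, limit=10):
    """Build a query string for a facet-based suggestion request.

    field and filter_query are assumptions for illustration only.
    """
    params = [
        ("q", "*:*"),             # match everything; fq narrows the set
        ("rows", "0"),            # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
        ("facet.limit", str(limit)),
    ]
    if filter_query:
        params.append(("fq", filter_query))
    return urlencode(params)


query_string = suggest_params("canon", filter_query="inStock:true")
decoded = parse_qs(query_string)  # round-trip to inspect the parameters
```

The resulting string would be appended to the select handler URL of
whatever Solr instance is in use. Note that facet.prefix matches against
the start of the indexed value of the facet field.
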

> Hi Jonathan,
>
> yes, without great effort it works only for single-valued fields. For
> multivalued fields you'd have to do some extra work to get only the
> values which contain tokens that start with the given prefix.
>
> But maybe you also mean whether it works for several fields in one query.
> I guess not, but you can create a new field that contains the values of
> the fields that you wish to query for autosuggestions (multivalued or
> not, depending on whether you use faceting or the terms component).
>
> I just checked and actually I have such a field, but I use it in
> combination with the terms component, while I use the autosuggest based
> on faceting in combination with a different single-valued (and
> required) field. (I have two different autosuggest sources.)
>
> However, the suggestions based on the terms component are always single
> tokens (because of the way my fields are analyzed) - I haven't put any
> effort into changing that because I'm not completely convinced that this
> source of suggestions is good in my case here. There are far too many
> tokens to suggest from and it all seems very arbitrary. The use case of
> autosuggest I have in mind, though, is that of a really long dropdown
> box (although of course all entries never show up at once) that offers
> complex suggestions (phrases) that really denote some product or person
> or other defined objects. And I achieved that with the other
> autocomplete based on facets pretty well.
>
> I definitely need to have a look at how to use faceting in combination
> with multivalued fields for autocomplete.
>
> Cheers,
> Chantal
>
> On Thu, 2010-09-23 at 22:20 +0200, Jonathan Rochkind wrote:
>
>> This works with _one_ entry per document, right?   If you've actually
>> found a clever trick to use this technique when you have more than one
>> entry for auto-suggest per document, do let me know.  Cause I haven't
>> been able to come up with one.
>>
>> Jonathan
>>
>> Chantal Ackermann wrote:
>>
>>> What works very well for me:
>>>
>>> 1.) Keep the tokenized field (KeywordTokenizerFilter,
>>> WordDelimiterFilter) (like you described you had)
>>> 2.) create an additional field that uses the String type with the
>>> same content (use copy field to fill either)
>>> 3.) use facet.prefix instead of terms.prefix for searching the
>>> suggestions
>>> 4.) to your query add also the String field as a facet, and return the
>>> results from that field as suggestion list. They will include the
>>> complete String "canon pixma mp500" for example. The other field can
>>> only return facets based on tokens. You probably never want that as
>>> facets.
>>>
>>> So your query was alright and the "canon" (2) facet count probably is
>>> the two occurrences that you listed, but as the field was tokenized,
>>> only tokens would be returned as facets. You need to have an additional
>>> field of pure String type to get the complete value as a facet back.
>>>
>>> In general, it worked out fine for me to create String fields as return
>>> values for facets while using the tokenized fields for searching and the
>>> actual facet queries.
>>>
>>> Cheers,
>>> Chantal
>>>
>>>
>>> On Wed, 2010-09-22 at 16:39 +0200, Jason Rutherglen wrote:
>>>
>>>
>>>> This may be what you're looking for.
>>>> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>>>>
>>>> On Wed, Sep 22, 2010 at 4:41 AM, Arunkumar Ayyavu
>>>> <arunkumar.ayy...@gmail.com> wrote:
>>>>
>>>>
>>>>> It's been over a week since I started learning Solr. Now, I'm using the
>>>>> electronics store example to explore the autocomplete feature in Solr.
>>>>>
>>>>> When I send the query terms.fl=name&terms.prefix=canon to the terms
>>>>> request handler, I get the following response:
>>>>> <lst name="terms">
>>>>>  <lst name="name">
>>>>>   <int name="canon">2</int>
>>>>>  </lst>
>>>>> </lst>
>>>>>
>>>>> But I expect the following results in the response.
>>>>> canon pixma mp500 all-in-one photo printer
>>>>> canon powershot sd500
>>>>>
>>>>> So, I changed the schema for textgen fieldType to use
>>>>> KeywordTokenizerFactory and also removed WordDelimiterFilterFactory. That
>>>>> gives me the expected result.
>>>>>
>>>>> Now, I also want Solr to return "canon pixma mp500 all-in-one photo
>>>>> printer" when I send the query terms.fl=name&terms.prefix=pixma. Could
>>>>> you gurus help me get the expected result?
>>>>>
>>>>> BTW, I couldn't quite understand the behavior of terms.lower and
>>>>> terms.upper (I tried these with the electronics store example). Could
>>>>> you also help me understand these 2 query parameters?
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> Arun
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>
>
>


--
http://jetwick.com twitter search prototype
