Re: Solr Auto-Complete

Salman Ansari Tue, 08 Dec 2015 04:47:27 -0800

Thanks Alexandre. I think it is clear.

On Sun, Dec 6, 2015 at 5:21 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:


> For suffix matches, you copy the text to another field whose type adds
> string reversal for both the index and query portions. So you are running
> the prefix matching algorithm, but on reversed strings.
>
> I can dig up an example if it is not clear.
> On 6 Dec 2015 8:06 am, "Salman Ansari" <salman.rah...@gmail.com> wrote:
>
> > That is right. I am actually looking for phrase prefixes not each term
> > prefix within the phrase. That satisfies my requirements. However, my
> > additional question was how do I manipulate the fieldType to later allow
> > for suffix matches as well? Or will that be a completely different
> > fieldType definition?
> >
> > Regards,
> > Salman
> >
> >
> > On Sun, Dec 6, 2015 at 2:12 PM, Andrea Gazzarini <a.gazzar...@gmail.com>
> > wrote:
> >
> > > Sorry, my damned mobile: "Is that close to what you were looking for?"
> > >
> > > 2015-12-06 12:07 GMT+01:00 Andrea Gazzarini <a.gazzar...@gmail.com>:
> > >
> > > > > Do you mean "phrase" or "term" prefixes? If you try to put a field
> > > > > value (two or more terms) in the analysis page, you will see what the
> > > > > index analyzer chain (of my example field type) is doing. The whole
> > > > > value is managed as a single n-grammed token, so you will get only a
> > > > > phrase prefix search, as in your request.
> > > >
> > > > > If you also want to handle term prefixes, I would index another
> > > > > field (similar to the example you posted); then the search handler
> > > > > with e(dismax) would have something like this:
> > > > >
> > > > > <str name="qf">
> > > > >     text_suggestion_phrase_prefix_search^b1
> > > > >     text_suggestion_terms_prefix_search^b2
> > > > > </str>
> > > >
> > > >
> > > > b1 and b2 values strictly depend on your search logic.
> > > >
> > > > Is that close that what you were looking for?
> > > >
> > > > Best,
> > > > Andrea
> > > >
> > > >
> > > >
> > > > 2015-12-06 11:53 GMT+01:00 Salman Ansari <salman.rah...@gmail.com>:
> > > >
> > > >> Thanks a lot Andrea. It did work.
> > > >>
> > > > >> However, just for my understanding, can you please explain more how
> > > > >> you made it work for prefixes? I know you mentioned using another
> > > > >> Tokenizer, but for example, if I want to tweak it later on to work on
> > > > >> suffixes or within phrases, how should I go about that?
> > > >>
> > > >> Thanks again for your help.
> > > >>
> > > >> Regards,
> > > >> Salman
> > > >>
> > > >>
> > > > >> On Sun, Dec 6, 2015 at 1:24 PM, Andrea Gazzarini <a.gazzar...@gmail.com>
> > > > >> wrote:
> > > >>
> > > >> > Hi Salman,
> > > > >> > that's because you're using a StandardTokenizer. Try something like
> > > > >> > this (copied, pasted and changed using my phone, so probably with a
> > > > >> > lot of mistakes ;) but you should be able to get what I mean). BTW I
> > > > >> > don't know if that's the case, but I would also add a
> > > > >> > MappingCharFilterFactory.
> > > > >> >
> > > > >> >         <fieldType name="text_suggestion" class="solr.TextField" positionIncrementGap="100">
> > > > >> >             <analyzer type="index">
> > > > >> >                 <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
> > > > >> >                 <tokenizer class="solr.KeywordTokenizerFactory"/>
> > > > >> >                 <filter class="solr.LowerCaseFilterFactory"/>
> > > > >> >                 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateAll="1" splitOnCaseChange="0"/>
> > > > >> >                 <filter class="solr.EdgeNGramFilterFactory" maxGramSize="20"/>
> > > > >> >             </analyzer>
> > > > >> >             <analyzer type="query">
> > > > >> >                 <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
> > > > >> >                 <tokenizer class="solr.KeywordTokenizerFactory"/>
> > > > >> >                 <filter class="solr.LowerCaseFilterFactory"/>
> > > > >> >                 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateAll="1" splitOnCaseChange="0"/>
> > > > >> >             </analyzer>
> > > > >> >         </fieldType>
> > > >> >
> > > >> >
> > > >> > 2015-12-06 9:36 GMT+01:00 Salman Ansari <salman.rah...@gmail.com
> >:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > >
> > > >> > >
> > > > >> > > I have updated my schema.xml as mentioned in the previous posts, using:
> > > >> > >
> > > >> > >
> > > >> > >
> > > > >> > > <fieldType name="text_suggestion" class="solr.TextField" positionIncrementGap="100">
> > > > >> > >         <analyzer type="index">
> > > > >> > >             <tokenizer class="solr.StandardTokenizerFactory"/>
> > > > >> > >             <filter class="solr.LowerCaseFilterFactory"/>
> > > > >> > >             <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
> > > > >> > >         </analyzer>
> > > > >> > >         <analyzer type="query">
> > > > >> > >             <tokenizer class="solr.StandardTokenizerFactory"/>
> > > > >> > >             <filter class="solr.LowerCaseFilterFactory"/>
> > > > >> > >         </analyzer>
> > > > >> > >     </fieldType>
> > > >> > >
> > > >> > >
> > > >> > >
> > > > >> > > This does the auto-complete, but it matches at the start of every
> > > > >> > > term in the text, not just at the beginning of the field (prefix). So
> > > > >> > > searching for "And" in my field for locations returns both of the
> > > > >> > > following documents.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > <doc>
> > > >> > >
> > > >> > > <int name="id">1</int>
> > > >> > >
> > > >> > > <str name="country_code">AD</str>
> > > >> > >
> > > >> > > <str name="name_eng">*And*orra</str>
> > > >> > >
> > > >> > > <str name="name_ar">أندورا</str>
> > > >> > >
> > > >> > > <long name="_version_">1519794717684924416</long>
> > > >> > >
> > > >> > > </doc>
> > > >> > >
> > > >> > > <doc>
> > > >> > >
> > > >> > > <int name="id">5</int>
> > > >> > >
> > > >> > > <str name="country_code">AG</str>
> > > >> > >
> > > >> > > <str name="name_eng">Antigua *and* Barbuda</str>
> > > >> > >
> > > >> > > <str name="name_ar">أنتيجوا وبربودا</str>
> > > >> > >
> > > >> > > <long name="_version_">1519794717701701633</long>
> > > >> > >
> > > >> > > </doc>
> > > >> > >
> > > >> > >
> > > >> > >
> > > > >> > > I have read about this and at first I thought I needed to add
> > > > >> > > side="front", but after adding that, Solr returned an error (when
> > > > >> > > creating a collection) indicating "Unknown parameters <side="front>".
> > > > >> > > I read again and it looks like side="front" is the default behavior,
> > > > >> > > as described here:
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> http://stackoverflow.com/questions/28807427/edgengramfilterfactory-change-in-solr5
> > > >> > >
> > > >> > >
> > > > >> > > My question is: how do I enable prefix-only auto-complete?
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > Comments and feedback are appreciated.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > Regards,
> > > >> > >
> > > >> > > Salman
> > > >> > >
> > > >> > >
> > > > >> > > On Fri, Dec 4, 2015 at 6:21 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> > > > >> > > wrote:
> > > >> > >
> > > > >> > > > You can see an example of similar use at:
> > > > >> > > > http://www.solr-start.com/javadoc/solr-lucene/index.html (search box).
> > > > >> > > >
> > > > >> > > > The corresponding schema is here:
> > > > >> > > > https://github.com/arafalov/Solr-Javadoc/blob/master/JavadocIndex/JavadocCollection/conf/schema.xml#L24
> > > > >> > > >
> > > > >> > > > It does have some extra special-case stuff to allow searching by
> > > > >> > > > fragments, but the general use case is the same.
> > > >> > > >
> > > >> > > > Regards,
> > > >> > > >    Alex.
> > > >> > > > ----
> > > >> > > > Newsletter and resources for Solr beginners and intermediates:
> > > >> > > > http://www.solr-start.com/
> > > >> > > >
> > > >> > > >
> > > > >> > > > On 4 December 2015 at 10:11, Salman Ansari <salman.rah...@gmail.com>
> > > > >> > > > wrote:
> > > > >> > > > > Thanks Alan, Alessandro and Andrea for your great explanations. I
> > > > >> > > > > will follow the path of adding edge ngrams to the field type for my
> > > > >> > > > > use case.
> > > >> > > > >
> > > >> > > > > Regards,
> > > >> > > > > Salman
> > > >> > > > >
> > > > >> > > > > On Thu, Dec 3, 2015 at 12:23 PM, Alessandro Benedetti <abenede...@apache.org>
> > > > >> > > > > wrote:
> > > > >> > > > >
> > > > >> > > > >> "Sounds good but I heard "/suggest" component is the recommended way
> > > > >> > > > >> of doing auto-complete"
> > > > >> > > > >>
> > > > >> > > > >> This sounds fantastic :)
> > > > >> > > > >> We "heard" that as well; we know what the suggest component does.
> > > > >> > > > >> The point is that you would like to retrieve the suggestions plus
> > > > >> > > > >> some consistent payload in different fields.
> > > > >> > > > >> The current suggest component makes some effort to provide a payload,
> > > > >> > > > >> but almost all the suggester implementations are based on an FST
> > > > >> > > > >> approach, which aims to be as fast and memory efficient as possible.
> > > > >> > > > >> Honestly, you could experiment and even contribute a customisation if
> > > > >> > > > >> you want to add a new feature to the suggest component able to return
> > > > >> > > > >> complex payloads together with the suggestions.
> > > > >> > > > >> Apart from that, it strictly depends on how you want to provide the
> > > > >> > > > >> autocompletion; there are plenty of different lookup implementations
> > > > >> > > > >> and plenty of tokenizers/token filters to combine.
> > > > >> > > > >> So I would confirm what we already said and what Andrea confirmed.
> > > > >> > > > >>
> > > > >> > > > >> If anyone has played with the suggester's suggestion payload, their
> > > > >> > > > >> feedback is welcome!
> > > >> > > > >>
> > > >> > > > >> Cheers
> > > >> > > > >>
> > > >> > > > >>
> > > > >> > > > >> On 3 December 2015 at 06:21, Andrea Gazzarini <a.gazzar...@gmail.com>
> > > > >> > > > >> wrote:
> > > >> > > > >>
> > > >> > > > >> > Hi Salman,
> > > > >> > > > >> > a few months ago I was involved in a project similar to
> > > > >> > > > >> > map.geoadmin.ch and there I had the same need as you (I also sent an
> > > > >> > > > >> > email to this list).
> > > > >> > > > >> >
> > > > >> > > > >> > From my side I can further confirm what Alan and Alessandro already
> > > > >> > > > >> > explained; I followed that approach.
> > > > >> > > > >> >
> > > > >> > > > >> > IMHO, that is the "recommended way" if the component's features meet
> > > > >> > > > >> > your needs (i.e. do not reinvent the wheel), but it seems you're
> > > > >> > > > >> > outside those bounds.
> > > >> > > > >> >
> > > >> > > > >> > Best,
> > > >> > > > >> > Andrea
> > > > >> > > > >> > On 2 Dec 2015 21:51, "Salman Ansari" <salman.rah...@gmail.com> wrote:
> > > >> > > > >> >
> > > > >> > > > >> > > Sounds good but I heard "/suggest" component is the recommended way
> > > > >> > > > >> > > of doing auto-complete in the new versions of Solr. Something along
> > > > >> > > > >> > > the lines of this article:
> > > > >> > > > >> > > https://cwiki.apache.org/confluence/display/solr/Suggester
> > > > >> > > > >> > >
> > > > >> > > > >> > > <searchComponent name="suggest" class="solr.SuggestComponent">
> > > > >> > > > >> > >   <lst name="suggester">
> > > > >> > > > >> > >     <str name="name">mySuggester</str>
> > > > >> > > > >> > >     <str name="lookupImpl">FuzzyLookupFactory</str>
> > > > >> > > > >> > >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> > > > >> > > > >> > >     <str name="field">cat</str>
> > > > >> > > > >> > >     <str name="weightField">price</str>
> > > > >> > > > >> > >     <str name="suggestAnalyzerFieldType">string</str>
> > > > >> > > > >> > >     <str name="buildOnStartup">false</str>
> > > > >> > > > >> > >   </lst>
> > > > >> > > > >> > > </searchComponent>
> > > >> > > > >> > >
> > > >> > > > >> > > Can someone confirm this?
> > > >> > > > >> > >
> > > >> > > > >> > > Regards,
> > > >> > > > >> > > Salman
> > > >> > > > >> > >
> > > >> > > > >> > >
> > > >> > > > >> > > On Wed, Dec 2, 2015 at 1:14 PM, Alessandro Benedetti <
> > > >> > > > >> > > abenede...@apache.org>
> > > >> > > > >> > > wrote:
> > > >> > > > >> > >
> > > >> > > > >> > > > Hi Salman,
> > > >> > > > >> > > > I agree with Alan.
> > > > >> > > > >> > > > Just configure your schema with the proper analysers.
> > > > >> > > > >> > > > For the field you want to use for suggestions, you are likely to
> > > > >> > > > >> > > > need simply this fieldType:
> > > > >> > > > >> > > >
> > > > >> > > > >> > > > <fieldType name="text_suggestion" class="solr.TextField" positionIncrementGap="100">
> > > > >> > > > >> > > >         <analyzer type="index">
> > > > >> > > > >> > > >             <tokenizer class="solr.StandardTokenizerFactory"/>
> > > > >> > > > >> > > >             <filter class="solr.LowerCaseFilterFactory"/>
> > > > >> > > > >> > > >             <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
> > > > >> > > > >> > > >         </analyzer>
> > > > >> > > > >> > > >         <analyzer type="query">
> > > > >> > > > >> > > >             <tokenizer class="solr.StandardTokenizerFactory"/>
> > > > >> > > > >> > > >             <filter class="solr.LowerCaseFilterFactory"/>
> > > > >> > > > >> > > >         </analyzer>
> > > > >> > > > >> > > >     </fieldType>
> > > > >> > > > >> > > >
> > > > >> > > > >> > > > This is a very simple example; please adapt it to your use case.
> > > >> > > > >> > > >
> > > >> > > > >> > > > Cheers
> > > >> > > > >> > > >
> > > > >> > > > >> > > > On 2 December 2015 at 09:41, Alan Woodward <a...@flax.co.uk> wrote:
> > > >> > > > >> > > >
> > > >> > > > >> > > > > Hi Salman,
> > > >> > > > >> > > > >
> > > > >> > > > >> > > > > It sounds as though you want to do a normal search against a
> > > > >> > > > >> > > > > special 'suggest' field that's been indexed with edge ngrams.
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > Alan Woodward
> > > >> > > > >> > > > > www.flax.co.uk
> > > >> > > > >> > > > >
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > On 2 Dec 2015, at 09:31, Salman Ansari wrote:
> > > >> > > > >> > > > >
> > > >> > > > >> > > > > > Hi,
> > > >> > > > >> > > > > >
> > > > >> > > > >> > > > > > I am looking for auto-complete in Solr, but on top of auto-complete
> > > > >> > > > >> > > > > > I also want to return the complete data (not just suggestions), so
> > > > >> > > > >> > > > > > I want to get back the ids and other fields of the whole document.
> > > > >> > > > >> > > > > > I tried the following 2 approaches, but each had issues:
> > > > >> > > > >> > > > > >
> > > > >> > > > >> > > > > > 1) Used the /suggest component, but that returns a very specific
> > > > >> > > > >> > > > > > format which it looks like I cannot customize. I want to return the
> > > > >> > > > >> > > > > > whole document that has a matching field and not only the suggestion
> > > > >> > > > >> > > > > > list. So, for example, if I write "hard" it returns the results in a
> > > > >> > > > >> > > > > > specific format as follows:
> > > > >> > > > >> > > > > >
> > > > >> > > > >> > > > > > <arr name="suggestion">
> > > > >> > > > >> > > > > >   <str>hard drive</str>
> > > > >> > > > >> > > > > >   <str>hard disk</str>
> > > > >> > > > >> > > > > > </arr>
> > > > >> > > > >> > > > > >
> > > > >> > > > >> > > > > > Is there a way to get back additional fields with suggestions?
> > > > >> > > > >> > > > > >
> > > > >> > > > >> > > > > > 2) Tried the normal /select component, but that does not do
> > > > >> > > > >> > > > > > auto-complete on a portion of the word. So, for example, if I write
> > > > >> > > > >> > > > > > the query as "bara" it DOES NOT return "barack obama". Any
> > > > >> > > > >> > > > > > suggestions on how to solve this?
> > > >> > > > >> > > > > >
> > > >> > > > >> > > > > >
> > > >> > > > >> > > > > > Regards,
> > > >> > > > >> > > > > > Salman
> > > >> > > > >> > > > >
> > > >> > > > >> > > > >
> > > >> > > > >> > > >
> > > >> > > > >> > > >
> > > >> > > > >> > > > --
> > > >> > > > >> > > > --------------------------
> > > >> > > > >> > > >
> > > >> > > > >> > > > Benedetti Alessandro
> > > >> > > > >> > > > Visiting card : http://about.me/alessandro_benedetti
> > > >> > > > >> > > >
> > > >> > > > >> > > > "Tyger, tyger burning bright
> > > >> > > > >> > > > In the forests of the night,
> > > >> > > > >> > > > What immortal hand or eye
> > > >> > > > >> > > > Could frame thy fearful symmetry?"
> > > >> > > > >> > > >
> > > >> > > > >> > > > William Blake - Songs of Experience -1794 England
> > > >> > > > >> > > >
> > > >> > > > >> > >
> > > >> > > > >> >
> > > >> > > > >>
> > > >> > > > >>
> > > >> > > > >>
> > > >> > > > >> --
> > > >> > > > >> --------------------------
> > > >> > > > >>
> > > >> > > > >> Benedetti Alessandro
> > > >> > > > >> Visiting card : http://about.me/alessandro_benedetti
> > > >> > > > >>
> > > >> > > > >> "Tyger, tyger burning bright
> > > >> > > > >> In the forests of the night,
> > > >> > > > >> What immortal hand or eye
> > > >> > > > >> Could frame thy fearful symmetry?"
> > > >> > > > >>
> > > >> > > > >> William Blake - Songs of Experience -1794 England
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
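
The suffix-matching approach Alexandre describes above (copy the text to a second field whose type reverses the string at both index and query time) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a configuration posted in the thread: the names text_suggestion_suffix and name_eng_suffix are hypothetical, name_eng is taken from the sample documents quoted above, and the attribute values simply mirror the prefix field type.

```xml
<!-- Hypothetical field type: phrase-suffix matching via reversed edge n-grams. -->
<fieldType name="text_suggestion_suffix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep the whole value as one token, as in the phrase-prefix type. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reverse the token, so "andorra" becomes "arrodna" before n-gramming. -->
    <filter class="solr.ReverseStringFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reverse the query too, so "orra" is looked up as "arro". -->
    <filter class="solr.ReverseStringFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical destination field, filled from the existing name_eng field. -->
<field name="name_eng_suffix" type="text_suggestion_suffix" indexed="true" stored="false"/>
<copyField source="name_eng" dest="name_eng_suffix"/>
```

With a setup along these lines, a query against name_eng_suffix for "orra" would be reversed to "arro", which is one of the edge n-grams generated from the reversed value "arrodna", so the Andorra document would be expected to match.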
