I'm using the FreeTextLookupFactory in my implementation now.

Yes, now it can suggest part of the field from the middle of the content.

I read that this implementation is able to consider the previous tokens
when making the suggestions. However, when I try to enter a search phrase,
it seems that it is only considering the last token and not any of the
previous tokens.

For example, when I query
http://localhost:8983/edm/collection1/suggest?suggest.q=trouble free
it gives me suggestions based on the word 'free' only, and not on 'trouble free'.
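
As a side note on the request itself: the URL above carries a raw space in the phrase. Below is a minimal sketch, assuming Python's standard library, of building the same request with the phrase URL-encoded, so that both tokens reach the suggester as a single suggest.q value. The helper name build_suggest_url is hypothetical; the host, core and handler path are the ones from my setup above. I am not claiming this is the cause of the behaviour, it only rules out the encoding as a variable.

```python
from urllib.parse import urlencode, quote


def build_suggest_url(base_url, phrase):
    """Return the suggest request URL with the phrase percent-encoded.

    build_suggest_url is a hypothetical helper, not part of Solr.
    """
    # quote_via=quote encodes the space as %20 instead of '+'.
    query = urlencode({"suggest.q": phrase}, quote_via=quote)
    return f"{base_url}?{query}"


url = build_suggest_url(
    "http://localhost:8983/edm/collection1/suggest", "trouble free"
)
print(url)
```

The resulting URL can be pasted into a browser or passed to any HTTP client.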

This is my configuration:

In solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="lookupImpl">FreeTextLookupFactory</str>
    <str name="indexPath">suggester_freetext_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">Suggestion</str>
    <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
    <str name="ngrams">5</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="indent">true</str>
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

In schema.xml:

<fieldType name="suggestType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[^a-zA-Z0-9]" replacement=" " />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
            outputUnigrams="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" />
  </analyzer>
</fieldType>

Is there anything I configured wrongly? I've set ngrams to 5; does that
mean it is supposed to consider up to the previous 5 tokens entered?


Regards,
Edwin


On 17 June 2015 at 22:12, Alessandro Benedetti <benedetti.ale...@gmail.com>
wrote:

> Edwin,
> Spellcheck is one thing, the Suggester is another.
>
> If you need to provide auto suggestion to your users, the suggester is the
> right thing to use.
> But I really doubt it is useful to select the entire content as the
> suggester field.
> It is going to be quite expensive.
>
> In that case I would again really suggest you take a look at the article
> I quoted and the general Solr documentation.
>
> It is possible to suggest part of the field.
> You can use the FreeText suggester with a proper analysis selected.
>
> Cheers
>
> 2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>
> > Yes I've looked at that before, but I was told that the newer version of
> > Solr has its own suggester, and does not need to use spellchecker anymore?
> >
> > So it's not necessary to use the spellchecker inside suggester anymore?
> >
> > Regards,
> > Edwin
> >
> >
> > On 17 June 2015 at 11:56, Erick Erickson <erickerick...@gmail.com>
> wrote:
> >
> > > Have you looked at spellchecker? Because that sounds much more like
> > > what you're asking about than suggester.
> > >
> > > Spell checking is more what you're asking for, have you even looked at
> > > that after it was suggested?
> > >
> > > bq: Also, when I do a search, it shouldn't be returning whole fields,
> > > but just to return a portion of the sentence
> > >
> > > This is what highlighting is built for.
> > >
> > > Really, I recommend you take the time to do some familiarization with
> > > the whole search space and Solr. The excellent book here:
> > >
> > >
> > >
> >
> http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E
> > >
> > > will give you the grounding you need to get the most out of Solr.
> > >
> > > Best,
> > > Erick
> > >
> > > On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo
> > > <edwinye...@gmail.com> wrote:
> > > > The long content is from when I tried to index PDF files. As some PDF
> > > > files have a lot of words in the content, it leads to the *UTF8 encoding
> > > > is longer than the max length 32766* error.
> > > >
> > > > I think the problem is that the content size of the PDF file exceeds
> > > > 32766 characters?
> > > >
> > > > What I'm trying to accomplish is to be able to index documents of any
> > > > size (even those with very large contents), and build the suggester
> > > > from there. Also, when I do a search, it shouldn't be returning whole
> > > > fields, but just to return a portion of the sentence.
> > > >
> > > >
> > > >
> > > > Regards,
> > > > Edwin
> > > >
> > > >
> > > > On 16 June 2015 at 23:02, Erick Erickson <erickerick...@gmail.com>
> > > wrote:
> > > >
> > > >> The suggesters are built to return whole fields. You _might_
> > > >> be able to add multiple fragments to a multiValued
> > > >> entry and get fragments, I haven't tried that though
> > > >> and I suspect that actually you'd get the same thing..
> > > >>
> > > >> This is an XY problem IMO. Please describe exactly what
> > > >> you're trying to accomplish, with examples rather than
> > > >> continue to pursue this path. It sounds like you want
> > > >> spellcheck or similar. The _point_ behind the
> > > >> suggesters is that they handle multiple-word suggestions
> > > > >> by returning the whole field. So putting long text fields
> > > >> into them is not going to work.
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >> On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
> > > >> <benedetti.ale...@gmail.com> wrote:
> > > >> > in line :
> > > >> >
> > > >> > 2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo <
> edwinye...@gmail.com
> > >:
> > > >> >
> > > >> >> Thanks Benedetti,
> > > >> >>
> > > > >> >> I've changed to the AnalyzingInfixLookup approach, and it is able to
> > > > >> >> start searching from the middle of the field.
> > > > >> >>
> > > > >> >> However, is it possible to make the suggester show only part of
> > > > >> >> the content of the field (like 2 or 3 fields after), instead of the
> > > > >> >> entire content/sentence, which can be quite long?
> > > >> >>
> > > >> >
> > > > >> > I assume you use "fields" in place of tokens.
> > > > >> > The answer is yes, I already said that in my previous mail. I invite
> > > > >> > you to read carefully the answers and the documentation linked!
> > > > >> >
> > > > >> > Regarding the excessive dimensions of tokens: this is weird, what are
> > > > >> > you trying to autocomplete?
> > > > >> > I really doubt it would be useful for a user to see super long
> > > > >> > auto-completed terms.
> > > >> >
> > > >> > Cheers
> > > >> >
> > > >> >>
> > > >> >>
> > > >> >> Regards,
> > > >> >> Edwin
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> On 15 June 2015 at 17:33, Alessandro Benedetti <
> > > >> benedetti.ale...@gmail.com
> > > >> >> >
> > > >> >> wrote:
> > > >> >>
> > > > >> >> > ehehe Edwin, I think you should read again the document I linked
> > > > >> >> > some time ago:
> > > > >> >> >
> > > > >> >> > http://lucidworks.com/blog/solr-suggester/
> > > > >> >> >
> > > > >> >> > The suggester you used is not meant to provide infix suggestions.
> > > > >> >> > The fuzzy suggester works on a fuzzy basis, with the *starting*
> > > > >> >> > terms of the field content.
> > > > >> >> >
> > > > >> >> > What you are looking for is actually one of the Infix Suggesters.
> > > > >> >> > For example the AnalyzingInfixLookup approach.
> > > > >> >> >
> > > > >> >> > When working with Suggesters it is important first to make a
> > > > >> >> > distinction:
> > > > >> >> >
> > > > >> >> > 1) Returning the full content of the field (AnalyzingInfix or Fuzzy)
> > > > >> >> >
> > > > >> >> > 2) Returning token(s) (Free Text Suggester)
> > > > >> >> >
> > > > >> >> > Then the second difference is:
> > > > >> >> >
> > > > >> >> > 1) Infix suggestions (from the "middle" of the field content)
> > > > >> >> > 2) Classic suggester (from the beginning of the field content)
> > > > >> >> >
> > > > >> >> > Once that is clarified, it will be quite simple to work with
> > > > >> >> > suggesters.
> > > >> >> >
> > > >> >> > Cheers
> > > >> >> >
> > > >> >> > 2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo <
> > > edwinye...@gmail.com>:
> > > >> >> >
> > > > >> >> > > I've indexed a rich-text document with the following content:
> > > > >> >> > >
> > > > >> >> > > This is a testing rich text documents to test the uploading of
> > > > >> >> > > files to Solr
> > > > >> >> > >
> > > > >> >> > >
> > > > >> >> > > When I tried to use the suggester, it returned the entire
> > > > >> >> > > field content once I entered suggest?q=t. However, when I tried
> > > > >> >> > > to search for q='rich', I didn't get any results returned.
> > > >> >> > >
> > > >> >> > > This is my current configuration for the suggester:
> > > > >> >> > > <searchComponent name="suggest" class="solr.SuggestComponent">
> > > > >> >> > >   <lst name="suggester">
> > > > >> >> > >     <str name="name">mySuggester</str>
> > > > >> >> > >     <str name="lookupImpl">FuzzyLookupFactory</str>
> > > > >> >> > >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> > > > >> >> > >     <str name="field">Suggestion</str>
> > > > >> >> > >     <str name="suggestAnalyzerFieldType">suggestType</str>
> > > > >> >> > >     <str name="buildOnStartup">true</str>
> > > > >> >> > >     <str name="buildOnCommit">false</str>
> > > > >> >> > >   </lst>
> > > > >> >> > > </searchComponent>
> > > > >> >> > >
> > > > >> >> > > <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
> > > > >> >> > >   <lst name="defaults">
> > > > >> >> > >     <str name="wt">json</str>
> > > > >> >> > >     <str name="indent">true</str>
> > > > >> >> > >     <str name="suggest">true</str>
> > > > >> >> > >     <str name="suggest.count">10</str>
> > > > >> >> > >     <str name="suggest.dictionary">mySuggester</str>
> > > > >> >> > >   </lst>
> > > > >> >> > >   <arr name="components">
> > > > >> >> > >     <str>suggest</str>
> > > > >> >> > >   </arr>
> > > > >> >> > > </requestHandler>
> > > >> >> > >
> > > > >> >> > > Is it possible to allow the suggester to return something even
> > > > >> >> > > from the middle of the sentence, and also not to return the
> > > > >> >> > > entire sentence if the sentence is long? Perhaps it should just
> > > > >> >> > > suggest the next 2 or 3 fields, and return more fields as the
> > > > >> >> > > user types.
> > > > >> >> > >
> > > > >> >> > > For example,
> > > > >> >> > > When the user types 'this', it should return 'This is a testing'
> > > > >> >> > > When the user types 'this is a testing', it should return 'This
> > > > >> >> > > is a testing rich text documents'.
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > Regards,
> > > >> >> > > Edwin
> > > >> >> > >
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> > --
> > > >> >> > --------------------------
> > > >> >> >
> > > >> >> > Benedetti Alessandro
> > > >> >> > Visiting card : http://about.me/alessandro_benedetti
> > > >> >> >
> > > >> >> > "Tyger, tyger burning bright
> > > >> >> > In the forests of the night,
> > > >> >> > What immortal hand or eye
> > > >> >> > Could frame thy fearful symmetry?"
> > > >> >> >
> > > >> >> > William Blake - Songs of Experience -1794 England
> > > >> >> >
> > > >> >>
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > --------------------------
> > > >> >
> > > >> > Benedetti Alessandro
> > > >> > Visiting card : http://about.me/alessandro_benedetti
> > > >> >
> > > >> > "Tyger, tyger burning bright
> > > >> > In the forests of the night,
> > > >> > What immortal hand or eye
> > > >> > Could frame thy fearful symmetry?"
> > > >> >
> > > >> > William Blake - Songs of Experience -1794 England
> > > >>
> > >
> >
>
>
>
>
