
Re: Solr's suggester results

2019-06-19 Thread ppunet

Here is my problem statement; I would really appreciate your feedback.

1. Thousands of PDFs with large amounts of content are indexed in
Solr.
2. I am using AnalyzingInfixSuggester for the suggestions.

Q. The SuggestComponent returns the entire content of the field in
the suggestions. Is it possible to have the suggester return only part of
the content of the field instead of the entire content, which in my
scenario is quite long?


Thanks in advance.

PD




Re: Solr's suggester results

2015-06-17 Thread Zheng Lin Edwin Yeo

I'm using the FreeTextLookupFactory in my implementation now.

Yes, now it can suggest part of the field from the middle of the content.

I read that this implementation is able to consider the previous tokens
when making suggestions. However, when I enter a search phrase,
it seems to consider only the last token and none of the
previous tokens.

For example, when I query
http://localhost:8983/edm/collection1/suggest?suggest.q=trouble free, it
gives me suggestions based on the word 'free' only, and not on 'trouble free'.
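
A side note on the request itself: the URL above contains a raw space. The sketch below (Python, standard library only) shows one way to build the same request with the phrase URL-encoded. The host, core path and suggest.* parameter names are taken from this thread; the helper name build_suggest_url is made up for illustration, and encoding the space is not claimed to be the cause of the behaviour described.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, phrase, dictionary="mySuggester", count=10):
    """Return a /suggest request URL with the phrase safely URL-encoded.

    build_suggest_url is a hypothetical helper; the parameter names are the
    ones used in the request handler configuration in this thread.
    """
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": phrase,          # multi-word phrases are encoded, not sent raw
        "suggest.count": count,
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


url = build_suggest_url(
    "http://localhost:8983/edm/collection1/suggest", "trouble free"
)
print(url)
```

urlencode turns the space into '+', so the whole phrase arrives at Solr as a single suggest.q value.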

This is my configuration:

In solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">

    <str name="lookupImpl">FreeTextLookupFactory</str>
    <str name="indexPath">suggester_freetext_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">Suggestion</str>
    <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
    <str name="ngrams">5</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="indent">true</str>

    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

In schema.xml

<fieldType name="suggestType" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[^a-zA-Z0-9]" replacement=" " />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
            outputUnigrams="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" />
  </analyzer>
</fieldType>

Is there anything I configured wrongly? I've set ngrams to 5; does that
mean it is supposed to consider up to the previous 5 tokens entered?
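
To make the analysis chain above more concrete, here is a rough Python sketch of what a shingle filter with outputUnigrams=true produces for a short input. It is plain Python, not Solr code; the function name shingles and the exact emission order are illustrative assumptions. Separately, as far as I read the Lucene FreeTextSuggester javadoc, an n-gram model of size N predicts from the last N-1 tokens, so ngrams=5 would look at up to 4 preceding tokens rather than 5.

```python
def shingles(tokens, max_size=5, output_unigrams=True):
    """Return word n-grams ("shingles") of 1..max_size tokens.

    A toy stand-in for a shingle filter: for each start position it emits
    the unigram (if requested) followed by every longer shingle that fits.
    """
    out = []
    min_size = 1 if output_unigrams else 2
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break  # no more tokens left for a longer shingle
            out.append(" ".join(tokens[start:start + size]))
    return out


print(shingles("trouble free operation".split(), max_size=2))
```

With max_size=2 the phrase yields both the single words and the two-word combinations, which is what lets a lookup match on 'trouble free' as one unit.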

Regards,
Edwin

On 17 June 2015 at 22:12, Alessandro Benedetti benedetti.ale...@gmail.com
wrote:

Edwin,
Spellcheck is one thing; the Suggester is another.

If you need to provide auto-suggestions to your users, the suggester is the
right thing to use.
But I really doubt it is useful to select the entire content as the
suggester field.
It is going to be quite expensive.

In that case I would again really suggest you take a look at the article
I quoted and the general Solr documentation.

It is possible to suggest part of the field.
You can use the FreeText suggester with a proper analysis selected.

Cheers

2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

Yes I've looked at that before, but I was told that the newer version of
Solr has its own suggester, and does not need to use spellchecker
anymore?

So it's not necessary to use the spellchecker inside the suggester anymore?

Regards,
Edwin

On 17 June 2015 at 11:56, Erick Erickson erickerick...@gmail.com
wrote:

Have you looked at spellchecker? Because that sound much more like
what you're asking about than suggester.

Spell checking is more what you're asking for, have you even looked at
that
after it was suggested?

bq: Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence

This is what highlighting is built for.

Really, I recommend you take the time to do some familiarization with
the
whole search space and Solr. The excellent book here:

http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E

will give you the grounding you need to get the most out of Solr.

Best,
Erick

On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo
edwinye...@gmail.com wrote:
The long content is from when I tried to index PDF files. As some PDF
files
has alot of words in the content, it will lead to the *UTF8 encoding
is
longer than the max length 32766 error.*

I think the problem is the content size of the PDF file exceed 32766
characters?

I'm trying to accomplish to be able to index documents that can be of
any
size (even those with very large contents), and build the suggester
from
there. Also, when I do a search, it shouldn't be returning whole
fields,
but just to return a portion of the sentence.

Regards,
Edwin

On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com
wrote:

The suggesters are built to return whole fields. You _might_
be able to add multiple fragments to a multiValued
entry and get fragments, I haven't tried that though
and I suspect that actually you'd get the same thing..

This is an XY problem IMO. Please describe exactly what
you're trying to accomplish, with examples rather than
continue to pursue this path. It sounds like you want
spellcheck or

Re: Solr's suggester results

2015-06-17 Thread Alessandro Benedetti

Edwin,
Spellcheck is one thing; the Suggester is another.

In that case I would again really suggest you take a look at the article
I quoted and the general Solr documentation.

It is possible to suggest part of the field.
You can use the FreeText suggester with a proper analysis selected.

Cheers

2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

Yes I've looked at that before, but I was told that the newer version of
Solr has its own suggester, and does not need to use spellchecker anymore?

So it's not necessary to use the spellchecker inside the suggester anymore?

Regards,
Edwin

On 17 June 2015 at 11:56, Erick Erickson erickerick...@gmail.com wrote:

Have you looked at spellchecker? Because that sound much more like
what you're asking about than suggester.

Spell checking is more what you're asking for, have you even looked at
that
after it was suggested?

bq: Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence

This is what highlighting is built for.

Really, I recommend you take the time to do some familiarization with the
whole search space and Solr. The excellent book here:

http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E

will give you the grounding you need to get the most out of Solr.

Best,
Erick

I think the problem is the content size of the PDF file exceed 32766
characters?

Regards,
Edwin

On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com
wrote:

This is an XY problem IMO. Please describe exactly what
you're trying to accomplish, with examples rather than
continue to pursue this path. It sounds like you want
spellcheck or similar. The _point_ behind the
suggesters is that they handle multiple-word suggestions
by returning he whole field. So putting long text fields
into them is not going to work.

Best,
Erick

On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
benedetti.ale...@gmail.com wrote:
in line :

2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com
:

Thanks Benedetti,

I've change to the AnalyzingInfixLookup approach, and it is able to
start
searching from the middle of the field.

However, is it possible to make the suggester to show only part of
the
content of the field (like 2 or 3 fields after), instead of the
entire
content/sentence, which can be quite long?

I assume you use fields in the place of tokens.
The answer is yes, I already said that in my previous mail, I invite
you
to
read carefully the answers and the documentation linked !

Related the excessive dimensions of tokens. This is weird, what are
you
trying to autocomplete ?
I really doubt would be useful for a user to see super long auto
completed
terms.

Cheers

Regards,
Edwin

On 15 June 2015 at 17:33, Alessandro Benedetti
benedetti.ale...@gmail.com

wrote:

ehehe Edwin, I think you should read again the document I linked
time
ago :

http://lucidworks.com/blog/solr-suggester/

The suggester you used is not meant to provide infix suggestions.
The fuzzy suggester is working on a fuzzy basis , with the
*starting*
terms
of a field content.

What you are looking for is actually one of the Infix Suggesters.
For example the AnalyzingInfixLookup approach.

When working with Suggesters is important first to make a
distinction
:

1) Returning the full content of the field ( analysisInfix or
Fuzzy)

2) Returning token(s) ( Free Text Suggester)

Then the second difference is :

1) Infix suggestions ( from the middle of the field

Re: Solr's suggester results

2015-06-16 Thread Erick Erickson

The suggesters are built to return whole fields. You _might_
be able to add multiple fragments to a multiValued
entry and get fragments; I haven't tried that, though,
and I suspect that you'd actually get the same thing.
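
For anyone who wants to try the untested multiValued idea above, a minimal sketch of the splitting step might look like this in Python. It only cuts a long text into short word-based fragments; how those fragments are then sent to a multiValued field is left out, and the name split_into_fragments and the max_words value are assumptions for illustration. Whether the suggester then returns fragments is exactly the part that has not been verified.

```python
def split_into_fragments(text, max_words=8):
    """Cut a long text into consecutive fragments of at most max_words words.

    Hypothetical pre-processing step: each returned fragment would become
    one value of a multiValued suggestion field.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


sample = ("This is a testing rich text documents to test "
          "the uploading of files to Solr")
for fragment in split_into_fragments(sample, max_words=5):
    print(fragment)
```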

This is an XY problem IMO. Please describe exactly what
you're trying to accomplish, with examples, rather than
continuing to pursue this path. It sounds like you want
spellcheck or similar. The _point_ behind the
suggesters is that they handle multiple-word suggestions
by returning the whole field. So putting long text fields
into them is not going to work.

Best,
Erick

On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
benedetti.ale...@gmail.com wrote:
 in line :

 2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

 Thanks Benedetti,

 I've change to the AnalyzingInfixLookup approach, and it is able to start
 searching from the middle of the field.

 However, is it possible to make the suggester to show only part of the
 content of the field (like 2 or 3 fields after), instead of the entire
 content/sentence, which can be quite long?


 I assume you use fields in the place of tokens.
 The answer is yes, I already said that in my previous mail, I invite you to
 read carefully the answers and the documentation linked !

 Related the excessive dimensions of tokens. This is weird, what are you
 trying to autocomplete ?
 I really doubt would be useful for a user to see super long auto completed
 terms.

 Cheers



 Regards,
 Edwin



 On 15 June 2015 at 17:33, Alessandro Benedetti benedetti.ale...@gmail.com
 
 wrote:

  ehehe Edwin, I think you should read again the document I linked time
 ago :
 
  http://lucidworks.com/blog/solr-suggester/
 
  The suggester you used is not meant to provide infix suggestions.
  The fuzzy suggester is working on a fuzzy basis , with the *starting*
 terms
  of a field content.
 
  What you are looking for is actually one of the Infix Suggesters.
  For example the AnalyzingInfixLookup approach.
 
  When working with Suggesters is important first to make a distinction :
 
  1) Returning the full content of the field ( analysisInfix or Fuzzy)
 
  2) Returning token(s) ( Free Text Suggester)
 
  Then the second difference is :
 
  1) Infix suggestions ( from the middle of the field content)
  2) Classic suggester ( from the beginning of the field content)
 
  Clarified that, will be quite simple to work with suggesters.
 
  Cheers
 
  2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:
 
   I've indexed a rich-text documents with the following content:
  
   This is a testing rich text documents to test the uploading of files to
   Solr
  
  
   When I tried to use the suggestion, it return me the entire field in
 the
   content once I enter suggest?q=t. However, when I tried to search for
   q='rich', I don't get any results returned.
  
   This is my current configuration for the suggester:
   <searchComponent name="suggest" class="solr.SuggestComponent">
     <lst name="suggester">
       <str name="name">mySuggester</str>
       <str name="lookupImpl">FuzzyLookupFactory</str>
       <str name="dictionaryImpl">DocumentDictionaryFactory</str>
       <str name="field">Suggestion</str>
       <str name="suggestAnalyzerFieldType">suggestType</str>
       <str name="buildOnStartup">true</str>
       <str name="buildOnCommit">false</str>
     </lst>
   </searchComponent>

   <requestHandler name="/suggest" class="solr.SearchHandler"
                   startup="lazy">
     <lst name="defaults">
       <str name="wt">json</str>
       <str name="indent">true</str>

       <str name="suggest">true</str>
       <str name="suggest.count">10</str>
       <str name="suggest.dictionary">mySuggester</str>
     </lst>
     <arr name="components">
       <str>suggest</str>
     </arr>
   </requestHandler>
  
   Is it possible to allow the suggester to return something even from the
   middle of the sentence, and also not to return the entire sentence if
 the
   sentence. Perhaps it should just suggest the next 2 or 3 fields, and to
   return more fields as the users type.
  
   For example,
   When user type 'this', it should return 'This is a testing'
   When user type 'this is a testing', it should return 'This is a testing
   rich text documents'.
  
  
   Regards,
   Edwin
  
 
 
 
  --
  --
 
  Benedetti Alessandro
  Visiting card : http://about.me/alessandro_benedetti
 
  Tyger, tyger burning bright
  In the forests of the night,
  What immortal hand or eye
  Could frame thy fearful symmetry?
 
  William Blake - Songs of Experience -1794 England
 





Re: Solr's suggester results

2015-06-16 Thread Zheng Lin Edwin Yeo

The long content is from when I tried to index PDF files. As some PDF files
have a lot of words in the content, it leads to the *UTF8 encoding is
longer than the max length 32766* error.

I think the problem is that the content size of the PDF file exceeds 32766
characters?
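
One detail worth separating out: the message talks about the UTF-8 encoding, so the limit is counted in bytes, not characters (32766 bytes is Lucene's maximum length for a single indexed term). As far as I know, the error usually shows up when a whole field value ends up as one term, for example in an untokenized field. The Python sketch below only illustrates the byte-versus-character difference; the names MAX_TERM_BYTES, utf8_length and exceeds_term_limit are made up for this example.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message, in UTF-8 bytes


def utf8_length(text):
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def exceeds_term_limit(text, limit=MAX_TERM_BYTES):
    """Return True if the text, taken as ONE term, is over the byte limit."""
    return utf8_length(text) > limit


ascii_text = "a" * 20000      # 20000 characters, 1 byte each
accented_text = "é" * 20000   # 20000 characters, 2 bytes each in UTF-8

print(len(ascii_text), utf8_length(ascii_text), exceeds_term_limit(ascii_text))
print(len(accented_text), utf8_length(accented_text),
      exceeds_term_limit(accented_text))
```

Both strings have the same character count, but only the second one crosses the limit once encoded.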

What I'm trying to accomplish is to index documents of any
size (even those with very large contents) and build the suggester from
there. Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence.



Regards,
Edwin


On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com wrote:

 The suggesters are built to return whole fields. You _might_
 be able to add multiple fragments to a multiValued
 entry and get fragments, I haven't tried that though
 and I suspect that actually you'd get the same thing..

 This is an XY problem IMO. Please describe exactly what
 you're trying to accomplish, with examples rather than
 continue to pursue this path. It sounds like you want
 spellcheck or similar. The _point_ behind the
 suggesters is that they handle multiple-word suggestions
 by returning he whole field. So putting long text fields
 into them is not going to work.

 Best,
 Erick

 On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
 benedetti.ale...@gmail.com wrote:
  in line :
 
  2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:
 
  Thanks Benedetti,
 
  I've change to the AnalyzingInfixLookup approach, and it is able to
 start
  searching from the middle of the field.
 
  However, is it possible to make the suggester to show only part of the
  content of the field (like 2 or 3 fields after), instead of the entire
  content/sentence, which can be quite long?
 
 
  I assume you use fields in the place of tokens.
  The answer is yes, I already said that in my previous mail, I invite you
 to
  read carefully the answers and the documentation linked !
 
  Related the excessive dimensions of tokens. This is weird, what are you
  trying to autocomplete ?
  I really doubt would be useful for a user to see super long auto
 completed
  terms.
 
  Cheers
 
 
 
  Regards,
  Edwin
 
 
 
  On 15 June 2015 at 17:33, Alessandro Benedetti 
 benedetti.ale...@gmail.com
  
  wrote:
 
   ehehe Edwin, I think you should read again the document I linked time
  ago :
  
   http://lucidworks.com/blog/solr-suggester/
  
   The suggester you used is not meant to provide infix suggestions.
   The fuzzy suggester is working on a fuzzy basis , with the *starting*
  terms
   of a field content.
  
   What you are looking for is actually one of the Infix Suggesters.
   For example the AnalyzingInfixLookup approach.
  
   When working with Suggesters is important first to make a distinction
 :
  
   1) Returning the full content of the field ( analysisInfix or Fuzzy)
  
   2) Returning token(s) ( Free Text Suggester)
  
   Then the second difference is :
  
   1) Infix suggestions ( from the middle of the field content)
   2) Classic suggester ( from the beginning of the field content)
  
   Clarified that, will be quite simple to work with suggesters.
  
   Cheers
  
   2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:
  
I've indexed a rich-text documents with the following content:
   
This is a testing rich text documents to test the uploading of
 files to
Solr
   
   
When I tried to use the suggestion, it return me the entire field in
  the
content once I enter suggest?q=t. However, when I tried to search
 for
q='rich', I don't get any results returned.
   
This is my current configuration for the suggester:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">Suggestion</str>
    <str name="suggestAnalyzerFieldType">suggestType</str>
    <str name="buildOnStartup">true</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="indent">true</str>

    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
   
Is it possible to allow the suggester to return something even from
 the
middle of the sentence, and also not to return the entire sentence
 if
  the
sentence. Perhaps it should just suggest the next 2 or 3 fields,
 and to
return more fields as the users type.
   
For example,
When user type 'this', it should return 'This is a testing'
When user type 'this is a testing', it should return 'This is a
 testing
rich text documents'.
   
   
Regards,
Edwin
   
  
  
  

Re: Solr's suggester results

2015-06-16 Thread Erick Erickson

Have you looked at spellchecker? Because that sounds much more like
what you're asking about than suggester.

Spell checking is more what you're asking for; have you even looked at that
after it was suggested?

bq: Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence

This is what highlighting is built for.
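
As a small illustration of the highlighting route, the Python sketch below builds a search request that asks Solr for short snippets instead of whole field values. The hl, hl.fl, hl.snippets and hl.fragsize parameters are standard Solr highlighting parameters; the /select URL, the field name content and the helper name build_highlight_url are assumptions, since the thread does not show the search handler or schema.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, field="content",
                        snippets=1, fragsize=100):
    """Return a search URL that requests highlighted snippets for `field`.

    build_highlight_url is a hypothetical helper; `field` must name a
    stored field in the actual schema.
    """
    params = {
        "q": query,
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field(s) to take snippets from
        "hl.snippets": snippets,  # maximum snippets per field
        "hl.fragsize": fragsize,  # approximate snippet size in characters
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


print(build_highlight_url(
    "http://localhost:8983/solr/collection1/select", "rich text"
))
```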

Really, I recommend you take the time to do some familiarization with the
whole search space and Solr. The excellent book here:

http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E

will give you the grounding you need to get the most out of Solr.

Best,
Erick

On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo
edwinye...@gmail.com wrote:
The long content is from when I tried to index PDF files. As some PDF files
has alot of words in the content, it will lead to the *UTF8 encoding is
longer than the max length 32766 error.*

I think the problem is the content size of the PDF file exceed 32766
characters?

I'm trying to accomplish to be able to index documents that can be of any
size (even those with very large contents), and build the suggester from
there. Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence.

Regards,
Edwin

On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com wrote:

Best,
Erick

On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
benedetti.ale...@gmail.com wrote:
in line :

2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

Thanks Benedetti,

I've change to the AnalyzingInfixLookup approach, and it is able to
start
searching from the middle of the field.

However, is it possible to make the suggester to show only part of the
content of the field (like 2 or 3 fields after), instead of the entire
content/sentence, which can be quite long?

I assume you use fields in the place of tokens.
The answer is yes, I already said that in my previous mail, I invite you
to
read carefully the answers and the documentation linked !

Related the excessive dimensions of tokens. This is weird, what are you
trying to autocomplete ?
I really doubt would be useful for a user to see super long auto
completed
terms.

Cheers

Regards,
Edwin

On 15 June 2015 at 17:33, Alessandro Benedetti
benedetti.ale...@gmail.com

wrote:

ehehe Edwin, I think you should read again the document I linked time
ago :

http://lucidworks.com/blog/solr-suggester/

The suggester you used is not meant to provide infix suggestions.
The fuzzy suggester is working on a fuzzy basis , with the *starting*
terms
of a field content.

What you are looking for is actually one of the Infix Suggesters.
For example the AnalyzingInfixLookup approach.

When working with Suggesters is important first to make a distinction
:

1) Returning the full content of the field ( analysisInfix or Fuzzy)

2) Returning token(s) ( Free Text Suggester)

Then the second difference is :

1) Infix suggestions ( from the middle of the field content)
2) Classic suggester ( from the beginning of the field content)

Clarified that, will be quite simple to work with suggesters.

Cheers

2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

I've indexed a rich-text documents with the following content:

This is a testing rich text documents to test the uploading of
files to
Solr

When I tried to use the suggestion, it return me the entire field in
the
content once I enter suggest?q=t. However, when I tried to search
for
q='rich', I don't get any results returned.

This is my current configuration for the suggester:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">Suggestion</str>
    <str name="suggestAnalyzerFieldType">suggestType</str>
    <str name="buildOnStartup">true</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

requestHandler name=/suggest class=solr.SearchHandler
startup=lazy
lst

Re: Solr's suggester results

2015-06-16 Thread Zheng Lin Edwin Yeo

Yes, I've looked at that before, but I was told that the newer version of
Solr has its own suggester and does not need to use the spellchecker anymore?

So it's not necessary to use the spellchecker inside the suggester anymore?

Regards,
Edwin

On 17 June 2015 at 11:56, Erick Erickson erickerick...@gmail.com wrote:

Have you looked at spellchecker? Because that sound much more like
what you're asking about than suggester.

Spell checking is more what you're asking for, have you even looked at that
after it was suggested?

bq: Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence

This is what highlighting is built for.

Really, I recommend you take the time to do some familiarization with the
whole search space and Solr. The excellent book here:

http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E

will give you the grounding you need to get the most out of Solr.

Best,
Erick

I think the problem is the content size of the PDF file exceed 32766
characters?

I'm trying to accomplish to be able to index documents that can be of any
size (even those with very large contents), and build the suggester from
there. Also, when I do a search, it shouldn't be returning whole fields,
but just to return a portion of the sentence.

Regards,
Edwin

On 16 June 2015 at 23:02, Erick Erickson erickerick...@gmail.com
wrote:

Best,
Erick

On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
benedetti.ale...@gmail.com wrote:
in line :

2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

Thanks Benedetti,

I've change to the AnalyzingInfixLookup approach, and it is able to
start
searching from the middle of the field.

However, is it possible to make the suggester to show only part of
the
content of the field (like 2 or 3 fields after), instead of the
entire
content/sentence, which can be quite long?

I assume you use fields in the place of tokens.
The answer is yes, I already said that in my previous mail, I invite
you
to
read carefully the answers and the documentation linked !

Related the excessive dimensions of tokens. This is weird, what are
you
trying to autocomplete ?
I really doubt would be useful for a user to see super long auto
completed
terms.

Cheers

Regards,
Edwin

On 15 June 2015 at 17:33, Alessandro Benedetti
benedetti.ale...@gmail.com

wrote:

ehehe Edwin, I think you should read again the document I linked
time
ago :

http://lucidworks.com/blog/solr-suggester/

The suggester you used is not meant to provide infix suggestions.
The fuzzy suggester is working on a fuzzy basis , with the
*starting*
terms
of a field content.

What you are looking for is actually one of the Infix Suggesters.
For example the AnalyzingInfixLookup approach.

When working with Suggesters is important first to make a
distinction
:

1) Returning the full content of the field ( analysisInfix or
Fuzzy)

2) Returning token(s) ( Free Text Suggester)

Then the second difference is :

1) Infix suggestions ( from the middle of the field content)
2) Classic suggester ( from the beginning of the field content)

Clarified that, will be quite simple to work with suggesters.

Cheers

2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo
edwinye...@gmail.com:

I've indexed a rich-text documents with the following content:

This is a testing rich text documents to test the uploading of
files to
Solr

When I tried to use the suggestion, it return me the entire
field in
the
content once I enter suggest?q=t. However, when I tried to search
for
q='rich', I don't get any results returned.

This is my current configuration for the suggester:
searchComponent name=suggest

Re: Solr's suggester results

2015-06-16 Thread Alessandro Benedetti

in line :

2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

 Thanks Benedetti,

 I've change to the AnalyzingInfixLookup approach, and it is able to start
 searching from the middle of the field.

 However, is it possible to make the suggester to show only part of the
 content of the field (like 2 or 3 fields after), instead of the entire
 content/sentence, which can be quite long?


I assume you are using 'fields' in place of 'tokens'.
The answer is yes. I already said that in my previous mail; I invite you to
read the answers and the linked documentation carefully!

Regarding the excessive size of the tokens: this is weird. What are you
trying to autocomplete?
I really doubt it would be useful for a user to see super long autocompleted
terms.

Cheers



 Regards,
 Edwin



 On 15 June 2015 at 17:33, Alessandro Benedetti benedetti.ale...@gmail.com
 
 wrote:

  ehehe Edwin, I think you should read again the document I linked time
 ago :
 
  http://lucidworks.com/blog/solr-suggester/
 
  The suggester you used is not meant to provide infix suggestions.
  The fuzzy suggester is working on a fuzzy basis , with the *starting*
 terms
  of a field content.
 
  What you are looking for is actually one of the Infix Suggesters.
  For example the AnalyzingInfixLookup approach.
 
  When working with Suggesters is important first to make a distinction :
 
  1) Returning the full content of the field ( analysisInfix or Fuzzy)
 
  2) Returning token(s) ( Free Text Suggester)
 
  Then the second difference is :
 
  1) Infix suggestions ( from the middle of the field content)
  2) Classic suggester ( from the beginning of the field content)
 
  Clarified that, will be quite simple to work with suggesters.
 
  Cheers
 
  2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:
 
   I've indexed a rich-text documents with the following content:
  
   This is a testing rich text documents to test the uploading of files to
   Solr
  
  
   When I tried to use the suggestion, it return me the entire field in
 the
   content once I enter suggest?q=t. However, when I tried to search for
   q='rich', I don't get any results returned.
  
   This is my current configuration for the suggester:
   <searchComponent name="suggest" class="solr.SuggestComponent">
     <lst name="suggester">
       <str name="name">mySuggester</str>
       <str name="lookupImpl">FuzzyLookupFactory</str>
       <str name="dictionaryImpl">DocumentDictionaryFactory</str>
       <str name="field">Suggestion</str>
       <str name="suggestAnalyzerFieldType">suggestType</str>
       <str name="buildOnStartup">true</str>
       <str name="buildOnCommit">false</str>
     </lst>
   </searchComponent>

   <requestHandler name="/suggest" class="solr.SearchHandler"
                   startup="lazy">
     <lst name="defaults">
       <str name="wt">json</str>
       <str name="indent">true</str>

       <str name="suggest">true</str>
       <str name="suggest.count">10</str>
       <str name="suggest.dictionary">mySuggester</str>
     </lst>
     <arr name="components">
       <str>suggest</str>
     </arr>
   </requestHandler>
  
   Is it possible to allow the suggester to return something even from the
   middle of the sentence, and also not to return the entire sentence if
 the
   sentence. Perhaps it should just suggest the next 2 or 3 fields, and to
   return more fields as the users type.
  
   For example,
   When user type 'this', it should return 'This is a testing'
   When user type 'this is a testing', it should return 'This is a testing
   rich text documents'.
  
  
   Regards,
   Edwin
  
 
 
 
 





Re: Solr's suggester results

2015-06-15 Thread Alessandro Benedetti

ehehe Edwin, I think you should read again the document I linked some time ago:

http://lucidworks.com/blog/solr-suggester/

The suggester you used is not meant to provide infix suggestions.
The fuzzy suggester works on a fuzzy basis, matching the *starting* terms
of the field content.

What you are looking for is actually one of the Infix Suggesters,
for example the AnalyzingInfixLookup approach.

When working with Suggesters it is important first to make a distinction:

1) Returning the full content of the field (AnalyzingInfix or Fuzzy)

2) Returning token(s) (Free Text Suggester)

Then the second distinction is:

1) Infix suggestions (from the middle of the field content)
2) Classic suggester (from the beginning of the field content)

Once that is clear, it will be quite simple to work with suggesters.
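
The prefix/infix distinction above can be shown with a toy Python sketch. It is not how Solr implements its lookups; it only mimics the matching rule on a single field value, using the sample sentence from the original question. The function names prefix_match and infix_match are made up for the example.

```python
def prefix_match(field_value, query):
    """Classic behaviour: the query must match the START of the field."""
    return field_value.lower().startswith(query.lower())


def infix_match(field_value, query):
    """Infix behaviour: the query may match the start of ANY token."""
    q = query.lower()
    return any(token.startswith(q) for token in field_value.lower().split())


content = ("This is a testing rich text documents to test "
           "the uploading of files to Solr")

print(prefix_match(content, "t"))     # matches: the field starts with 'T'
print(prefix_match(content, "rich"))  # no match: 'rich' is mid-field
print(infix_match(content, "rich"))   # matches a token inside the field
```

This mirrors the reported symptom: a query for 't' hits with a beginning-of-field suggester, while 'rich' only hits once an infix approach is used.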

Cheers

2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com:

 I've indexed a rich-text document with the following content:

 This is a testing rich text documents to test the uploading of files to
 Solr


 When I tried to use the suggester, it returned the entire content of the
 field as soon as I entered suggest?q=t. However, when I tried to search for
 q='rich', I didn't get any results returned.

 This is my current configuration for the suggester:
 <searchComponent name="suggest" class="solr.SuggestComponent">
   <lst name="suggester">
     <str name="name">mySuggester</str>
     <str name="lookupImpl">FuzzyLookupFactory</str>
     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
     <str name="field">Suggestion</str>
     <str name="suggestAnalyzerFieldType">suggestType</str>
     <str name="buildOnStartup">true</str>
     <str name="buildOnCommit">false</str>
   </lst>
 </searchComponent>

 <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="indent">true</str>

     <str name="suggest">true</str>
     <str name="suggest.count">10</str>
     <str name="suggest.dictionary">mySuggester</str>
   </lst>
   <arr name="components">
     <str>suggest</str>
   </arr>
 </requestHandler>

 Is it possible to have the suggester return something even from the
 middle of the sentence, and also not return the entire sentence if the
 sentence is long? Perhaps it should just suggest the next 2 or 3 fields,
 and return more fields as the user types.

 For example,
 When the user types 'this', it should return 'This is a testing'
 When the user types 'this is a testing', it should return 'This is a testing
 rich text documents'.


 Regards,
 Edwin





Re: Solr's suggester results

2015-06-15 Thread Zheng Lin Edwin Yeo

Thanks Benedetti,

I've changed to the AnalyzingInfixLookup approach, and it is able to start
searching from the middle of the field.

However, is it possible to make the suggester show only part of the
content of the field (like 2 or 3 fields after), instead of the entire
content/sentence, which can be quite long?
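
One possible workaround is to shorten the returned text on the client side. The sketch below is not a Solr feature: `trim_suggestion` and its parameters are hypothetical, and it assumes the suggestion arrives as plain text with no highlighting markup. It keeps the typed phrase as it appears in the suggestion plus the next few words, which reproduces the two examples given earlier in this thread.

```python
def trim_suggestion(suggestion: str, typed: str, extra_words: int = 3) -> str:
    """Return the typed phrase as found in the suggestion, plus the next few words.

    Falls back to the full suggestion when the typed text is not found.
    """
    words = suggestion.split()
    typed_words = typed.lower().split()
    if not typed_words:
        return suggestion
    lowered = [w.lower() for w in words]
    n = len(typed_words)
    for start in range(len(words) - n + 1):
        window = lowered[start:start + n]
        # All typed words but the last must match exactly;
        # the last one may be a prefix, since the user is still typing it.
        if window[:-1] == typed_words[:-1] and window[-1].startswith(typed_words[-1]):
            return " ".join(words[start:start + n + extra_words])
    return suggestion


full = "This is a testing rich text documents to test the uploading of files to Solr"

print(trim_suggestion(full, "this"))
print(trim_suggestion(full, "this is a testing"))
print(trim_suggestion(full, "rich"))
```

The trade-off is that the full field content is still transferred and only the display is shortened.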


Regards,
Edwin




Re: Solr's suggester results

2015-06-15 Thread Zheng Lin Edwin Yeo

Also, is there a way to overcome the long content problem?

I get the following error after indexing large rich-text documents and
trying to build the suggester.

{
  "responseHeader":{
    "status":500,
    "QTime":47},
  "error":{
    "msg":"Document contains at least one immense term in
field=\"exacttext\" (whose UTF8 encoding is longer than the max length
32766), all of which were skipped.  Please correct the analyzer to not
produce such terms.  The prefix of the first immense term is: '[32, 10, 32,
10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10,
32, 32, 10, 32, 32, 10, 32, 32]...', original message: bytes can be at most
32766 in length; got 139402",
    "trace":"java.lang.IllegalArgumentException: Document contains at
least one immense term in field=\"exacttext\" (whose UTF8 encoding is
longer than the max length 32766), all of which were skipped.  Please
correct the analyzer to not produce such terms.  The prefix of the first
immense term is: '[32, 10, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32,
32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32]...',
original message: bytes can be at most 32766 in length; got 139402\r\n\tat
org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:667)\r\n\tat
org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:344)\r\n\tat
org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)\r\n\tat
org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)\r\n\tat
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:458)\r\n\tat
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1350)\r\n\tat
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1138)\r\n\tat
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.add(AnalyzingInfixSuggester.java:381)\r\n\tat
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.build(AnalyzingInfixSuggester.java:310)\r\n\tat
org.apache.lucene.search.suggest.Lookup.build(Lookup.java:193)\r\n\tat
org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:163)\r\n\tat
org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:179)\r\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:196)\r\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)\r\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)\r\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)\r\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)\r\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\r\n\tat
org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)\r\n\tat
org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\r\n\tat
org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)\r\n\tat
org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\r\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\r\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\r\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\r\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\r\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\r\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:368)\r\n\tat
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\r\n\tat
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\r\n\tat
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\r\n\tat
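
The message above suggests that a whole field value (139402 bytes here) is being indexed as a single term, which exceeds the 32766-byte limit on a term's UTF-8 encoding. A rough sketch of one way to stay under that limit follows, assuming it is acceptable to truncate the text that feeds the suggester and that the indexing client can be changed. The function names are hypothetical and none of this is Solr API.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message


def utf8_len(text: str) -> int:
    """Length of the text once encoded as UTF-8, which is what the limit counts."""
    return len(text.encode("utf-8"))


def truncate_utf8(text: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Cut text so its UTF-8 encoding fits in max_bytes, without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a partial multi-byte character left at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# Whitespace-heavy content similar to the term prefix in the error (bytes 32 and 10).
extracted = " \n" * 70000
print(utf8_len(extracted) > MAX_TERM_BYTES)
print(utf8_len(truncate_utf8(extracted)) <= MAX_TERM_BYTES)
```

The term prefix in the error consists only of spaces and newlines (bytes 32 and 10), so collapsing whitespace in the extracted text before indexing may also reduce its size considerably.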
