Re: Autosuggest help

2019-04-06 Thread Midas A
Any update?



Autosuggest help

2019-04-04 Thread Midas A
Hi,

We need to use autosuggest click-stream data in our auto-suggestions. How
can we achieve this?

Currently we are using the suggester for auto-suggestions.


Regards,
Midas


Re: Solr AutoSuggest Configuration Issue

2018-10-09 Thread Christian Ortner
To my knowledge, context filtering, at least via the suggest.cfq parameter,
was not available before Solr 6. Like Edwin, I highly recommend
upgrading.
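
For readers on a version that supports context filtering, a suggester set up for it might look roughly like the sketch below. The values (mySuggester, name, price, text_en, countries) are the ones from Manu's question, but which parameter each value belonged to is an assumption, because the archive stripped the XML tags. The handler name /suggest is also assumed.

```xml
<!-- Sketch for solrconfig.xml; assumes a Solr release with context filtering. -->
<!-- The value-to-parameter mapping is an assumption; the archive dropped the tags. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">name</str>
    <str name="weightField">price</str>
    <!-- contextField names the field whose values suggest.cfq filters on -->
    <str name="contextField">countries</str>
    <str name="suggestAnalyzerFieldType">text_en</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>

<!-- Assumed handler name; any SearchHandler that lists the component would do. -->
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">mySuggester</str>
    <str name="suggest.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With something like this in place, a request of the form /suggest?suggest.q=ap&suggest.cfq=US (hypothetical values) should restrict suggestions to documents whose countries field matches the filter. The suggester has to be built first, for example with suggest.build=true.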



Re: Solr AutoSuggest Configuration Issue

2018-10-08 Thread Zheng Lin Edwin Yeo
The link that you are referring to is for Solr 6.6, but you are using Solr
5.1, which is quite an old version, so there could be some differences.
You can refer to this guide for Solr 5.1:
http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-5.1.pdf

The current version of Solr is 7.5, and upgrading is recommended so that
you can use the new features as well as improvements such as lower memory
consumption and better authentication.

Regards,
Edwin



Solr AutoSuggest Configuration Issue

2018-10-08 Thread Manu Nair
Hi,

I am using Solr 5.1 for my application.
I am trying to use the autoSuggest feature of Solr.
I want to do context filtering on the results returned by Solr suggest.

Please let me know whether this feature is supported in the version I am
using (5.1), and whether it works with a multivalued field. I have tried
multiple times, but it is not working.

I am referring to the following link for details:
https://lucene.apache.org/solr/guide/6_6/suggester.html

Please find the configuration from my solrconfig.xml below:

  
mySuggester
AnalyzingInfixLookupFactory
DocumentDictionaryFactory
name
price
text_en
false
countries
  


Thanks a lot in advance for your help.

Regards,
Manu Nair.


Re: Solr 6.5 autosuggest suggests misspelt words and unwanted words

2018-06-20 Thread Alessandro Benedetti
Hi,
you should curate your data; that is fundamental to a healthy search
solution. But let's see what you can do anyway:

1) Curate a dictionary of such bad words, then configure the analysis chain
to skip them.
2) Have you tried different dictionary implementations? I would assume that
each misspelled word has a low document frequency. You could use the
HighFrequencyDictionaryFactory [1] and see how it goes.


[1]
https://lucene.apache.org/solr/guide/7_3/suggester.html#highfrequencydictionaryfactory
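
The two suggestions could be combined roughly as in the sketch below. All names in it (text_suggest, suggest_text, badwords.txt, frequentTerms) and the threshold value are hypothetical and would need tuning against real data.

```xml
<!-- schema: the index analyzer drops a curated list of unwanted words. -->
<!-- badwords.txt is a hypothetical file kept next to the schema. -->
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="badwords.txt"/>
  </analyzer>
</fieldType>
<field name="suggest_text" type="text_suggest" indexed="true" stored="false" multiValued="true"/>

<!-- solrconfig.xml: only terms above a document-frequency threshold are suggested. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">frequentTerms</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <str name="field">suggest_text</str>
    <!-- fraction of documents a term must appear in; 0.005 is an arbitrary starting point -->
    <float name="threshold">0.005</float>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```

The idea is that rare misspellings such as "nd" fall below the threshold, while the stop filter keeps listed words out of the indexed terms that the dictionary reads from. A misspelling that is itself frequent would still get through, so this reduces the problem but does not replace curating the data.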



--
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Solr 6.5 autosuggest suggests misspelt words and unwanted words

2018-06-19 Thread Sri Sirisha Vallabhaneni
Hi,

My data is un-curated: it contains *cuss words* and *misspelt words* such
as *nd* instead of *need*. We are using an auto-suggest/auto-complete
feature that relies heavily on indexed data to recommend suggestions as the
user types a query. We keep a stop-word list of cuss words to control what
is recommended to the user, and this list might grow huge over time. Is
there a clean way to

1. eliminate cuss words from suggestions entirely, and
2. avoid suggesting misspelt words at all?

Thanks and Regards,
Sri


Re: autosuggest with solr.EdgeNGramFilterFactory no result found

2015-07-07 Thread Szűcs Roland
Thanx Erick,

Your blog article was the perfect answer to my problem.

Rgds,

Roland


Re: autosuggest with solr.EdgeNGramFilterFactory no result found

2015-07-03 Thread Erick Erickson
OK, I think you took a wrong turn at the bakery.

The FST-based suggesters are intended to look at the
beginnings of fields. It is totally unnecessary to use
ngrams; the FST that gets built does that _for_ you.
Actually, it builds an internal FST structure that does
this en passant.

For getting whole fields that match anywhere in the input
field, you probably want to think about
AnalyzingInfixSuggester or FreeTextSuggester.

The important bit here is that you shouldn't have to do
so much work...

This might help:

http://lucidworks.com/blog/solr-suggester/

Best,
Erick
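
As a rough illustration of the infix approach for the title case in the original post, the sketch below assumes the plain title field (stored, type text_hu) is used directly instead of the ngram copy. The indexPath value is a hypothetical directory name, and the build settings are only one possible choice.

```xml
<!-- Sketch only: an infix suggester over the plain title field. -->
<!-- No EdgeNGram field is involved; the lookup matches terms anywhere in the title. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_title</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- DocumentDictionaryFactory reads stored values, so the field must be stored -->
    <str name="field">title</str>
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">text_hu</str>
    <!-- hypothetical directory for the suggester's own index -->
    <str name="indexPath">suggest_title_infix</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

After a build (for example a request with suggest.build=true), a query such as suggest.q=Roma should then be able to match "Romana" even when it is not the first word of the title. The author and publisher suggesters could presumably be defined the same way.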


autosuggest with solr.EdgeNGramFilterFactory no result found

2015-07-03 Thread Roland Szűcs
I tried to set up an autosuggest feature with multiple dictionaries for the
title, author and publisher fields.

I used solr.EdgeNGramFilterFactory to optimize the performance of the
autosuggest.

I have a document in the index with the title: Romana.

When I test the text analysis for autosuggest (on the field
title_suggest_ngram), I get:
ENGTF
text    raw_bytes            start  end  positionLength  type  position
rom     [72 6f 6d]           0      6    1               word  1
roma    [72 6f 6d 61]        0      6    1               word  1
roman   [72 6f 6d 61 6e]     0      6    1               word  1
romana  [72 6f 6d 61 6e 61]  0      6    1               word  1
If I try to run http://localhost:8983/solr/bandw/suggest?q=Roma, I get:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="suggest">
    <lst name="suggest_publisher">
      <lst name="Roma">
        <int name="numFound">0</int>
        <arr name="suggestions"/>
      </lst>
    </lst>
    <lst name="suggest_title">
      <lst name="Roma">
        <int name="numFound">0</int>
        <arr name="suggestions"/>
      </lst>
    </lst>
    <lst name="suggest_author">
      <lst name="Roma">
        <int name="numFound">0</int>
        <arr name="suggestions"/>
      </lst>
    </lst>
  </lst>
</response>

My relevant field definitions:
<field name="id" type="string" indexed="true" stored="true" required="true"
       multiValued="false" omitNorms="true" />
<field name="author" type="text_hu" indexed="true" stored="true"
       multiValued="true"/>
<field name="title" type="text_hu" indexed="true" stored="true"
       multiValued="false"/>
<field name="subtitle" type="text_hu" indexed="true" stored="true"
       multiValued="false"/>
<field name="publisher" type="text_hu" indexed="true" stored="true"
       multiValued="false"/>
<field name="title_suggest_ngram" type="text_hu_suggest_ngram"
       indexed="true" stored="false" multiValued="false" omitNorms="true"/>
<field name="author_suggest_ngram" type="text_hu_suggest_ngram"
       indexed="true" stored="false" multiValued="false" omitNorms="true"/>
<field name="publisher_suggest_ngram" type="text_hu_suggest_ngram"
       indexed="true" stored="false" multiValued="false" omitNorms="true"/>
<copyField source="title" dest="title_suggest_ngram"/>
<copyField source="author" dest="author_suggest_ngram"/>
<copyField source="publisher" dest="publisher_suggest_ngram"/>

My EdgeNGram-related field type definition:
<fieldType name="text_hu_suggest_ngram" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_hu.txt"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
            maxGramSize="8"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_hu.txt"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

My request handler for suggest:
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">5</str>
    <str name="suggest.dictionary">suggest_author</str>
    <str name="suggest.dictionary">suggest_title</str>
    <str name="suggest.dictionary">suggest_publisher</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

And finally my search component:
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_title</str>
    <str name="lookupImpl">FSTLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title_suggest_ngram</str>
    <str name="weightField">price</str>
    <str name="buildOnStartup">true</str>
    <str name="buildOnCommit">true</str>
  </lst>
  <lst name="suggester">
    <str name="name">suggest_author</str>
    <str name="lookupImpl">FSTLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">author_suggest_ngram</str>
    <str name="weightField">price</str>
    <str name="buildOnStartup">true</str>
    <str name="buildOnCommit">true</str>
  </lst>
  <lst name="suggester">
    <str name="name">suggest_publisher</str>
    <str name="lookupImpl">FSTLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">publisher_suggest_ngram</str>
    <str name="weightField">price</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
If I change the search component definition to use the title field instead of
title_suggest_ngram, then I get suggestions only if my title field starts
with the string specified in the q parameter.
As a field-level autosuggester, I would also like to suggest matches where
the query is not the first term of the title but any term in it.
What should I do to use autosuggest correctly?

-- 
Roland Szűcs
CEO, Bookandwalk.hu
Phone: +36 1 210 81 13
Connect with me on LinkedIn: https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24
https://bookandwalk.hu/


Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-30 Thread Thomas Michael Engelke
God damn. Thank you.

*ashamed*

On 30.06.2015 at 00:21, Erick Erickson wrote:

> Try not putting it in double quotes?
>
> Best,
> Erick
 

Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-30 Thread Alessandro Benedetti
I would like to add a consideration, if I may.
The field type looks very heavily analysed to me; are you sure this is OK
for your suggestion requirements?
It is usually better to keep the field used for suggestions as lightly
analysed as possible and then play with the different types of suggesters.
If you notice any additional problems, we can discuss them; if not,
that is fine!

Cheers
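
To make the point about light analysis concrete, a minimal field type for suggestions might look something like the sketch below. The name text_suggest_light is hypothetical, and whether lowercasing alone is enough depends on the data. Stemming, compound splitting and ngrams are deliberately left out here.

```xml
<!-- Sketch of a lightly analysed field type for suggestions (hypothetical name). -->
<fieldType name="text_suggest_light" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- split on whitespace/punctuation and lowercase; nothing else -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A suggester would then point at this type through suggestAnalyzerFieldType, leaving prefix, fuzzy or infix matching to the lookup implementation rather than to the analysis chain.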


-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-30 Thread Erick Erickson
Pesky computers, they keep doing exactly what I tell 'em to do, not
what I mean ;)

I'll open a JIRA for making Solr DWIM-compliant, Do What I Mean ;) ;)




Questions regarding autosuggest (Solr 5.2.1)

2015-06-29 Thread Thomas Michael Engelke
 

A friend and I are trying to develop some software using Solr in the
background, and with that come a lot of changes. We're used to older
versions (4.3 and below). We especially have problems with the
autosuggest feature.

This is the field definition (schema.xml) for our autosuggest field:

<field name="autosuggest" type="autosuggest" indexed="true"
       stored="true" required="false" multiValued="true" />
...
<copyField source="name" dest="autosuggest" />
...
<fieldType name="autosuggest" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0"
            splitOnNumerics="1" generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="0" catenateAll="0"
            preserveOriginal="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="dictionary.txt" minWordSize="5" minSubwordSize="3"
            maxSubwordSize="30" onlyLongestMatch="false"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"
            protected="protwords.txt"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
            maxGramSize="30"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0"
            splitOnNumerics="1" generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="0" catenateAll="0"
            preserveOriginal="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Afterwards, we defined an autosuggest component to use this field, like
this (solrconfig.xml):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="storeDir">suggester_fuzzy_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">suggest</str>
    <str name="suggestAnalyzerFieldType">autosuggest</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

And add a requesthandler to test out the functionality:

<requestHandler name="/suggesthandler" class="solr.SearchHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

However, trying to start the core that has this configuration, a long
exception occurs, telling us this:

Error in configuration: autosuggest is not defined in the schema

Now, that seems to be wrong. Any idea how to fix that? 

Re: Questions regarding autosuggest (Solr 5.2.1)

2015-06-29 Thread Erick Erickson
Try not putting it in double quotes?

Best,
Erick



solr autosuggest to stop/filter suggesting the phrases that ends with stopwords

2015-01-15 Thread Rajesh Hazari
Hi Folks,

Solr Version 4.7+

Do we have any analyzer, filter, or plugin in Solr to stop suggesting
phrases that end with stopwords?

For example, if the suggestions are as below for the query
http://localhost.com/solr/suggest?q=jazz+a

suggestion: [
jazz and,
jazz at,
jazz at lincoln,
jazz at lincoln center,
jazz artists,
jazz and classic
]

Is there any config or solution to remove only the *jazz at* and *jazz and*
phrases so that the final suggestion response looks more sensible?

suggestion: [
jazz at lincoln,
jazz at lincoln center,
jazz artists,
jazz and classic
]

Google does this intelligently :)

I have tested with StopFilterFactory and SuggestStopFilter, both of which
filter out all stop terms in the phrases, no matter where they appear.

Do I have to come up with a custom plugin or some kind of phrase filter to
do this in Solr?

I am about to design a SuggestStopPhraseFilter and its factory, modeled on
the existing SuggestStopFilter, and use it in my schema.

Or do we have any existing plugin or feature that I can use or leverage?
*Thanks,*
*Rajesh.*
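
The behaviour asked for above can be sketched outside Solr as a post-processing step on the suggestion list. This is only a minimal Python illustration, not a Solr plugin: the function name drop_stopword_endings and the small stopword set are assumptions made for the example, and a real setup would presumably reuse the same stopwords.txt as the analyzer.

```python
# Hypothetical stopword list; a real setup would load stopwords.txt instead.
STOPWORDS = {"a", "an", "and", "at", "in", "of", "the"}


def drop_stopword_endings(suggestions, stopwords=STOPWORDS):
    """Keep only suggestions whose last token is not a stopword."""
    kept = []
    for phrase in suggestions:
        tokens = phrase.lower().split()
        if tokens and tokens[-1] in stopwords:
            continue  # e.g. "jazz at" ends with the stopword "at"
        kept.append(phrase)
    return kept


suggestions = [
    "jazz and",
    "jazz at",
    "jazz at lincoln",
    "jazz at lincoln center",
    "jazz artists",
    "jazz and classic",
]
print(drop_stopword_endings(suggestions))
```

Stopwords in the middle of a phrase ("jazz at lincoln") are left alone; only the final token is checked, which is the difference from a plain stop filter.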


Re: Partial match autosuggest (match a word occurring anywhere in a field)

2014-12-17 Thread bbarani
Thanks for your response.

I fixed this issue by using <filter class="solr.PositionFilterFactory" />:

<fieldType name="edgytext" class="solr.TextField"
           positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <filter class="solr.LowerCaseFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1"
            maxGramSize="15" />
  </analyzer>
  <analyzer type="query">
    <filter class="solr.LowerCaseFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" outputUnigrams="true"
            outputUnigramIfNoNgram="true" maxShingleSize="99"/>
    <filter class="solr.PositionFilterFactory" />
  </analyzer>
</fieldType>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Predictive-search-match-a-word-occurring-anywhere-in-a-field-tp4174660p4174822.html
Sent from the Solr - User mailing list archive at Nabble.com.


Partial match autosuggest (match a word occurring anywhere in a field)

2014-12-16 Thread bbarani
Hi,

I am trying to implement partial-match autosuggest, but it doesn't work in
some cases.

When I search for "iphone 5s", I see the result below:

title_new:Apple iPhone 5s - 16GB - Gold

But when I search for "iphone gold" (in the title_new field), I don't see
the above result. Is there a way to implement full partial match (occurring
anywhere in a field)?


Please find below my fieldType configuration for title_new:

<fieldType name="edgytext" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Partial-match-autosuggest-match-a-word-occurring-anywhere-in-a-field-tp4174660.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Partial match autosuggest (match a word occurring anywhere in a field)

2014-12-16 Thread Ahmet Arslan
Hi BBrani,

Yes, it is possible. Create another field, say edgytext_partial, and use a
whitespace tokenizer this time.
Then query on both edgytext and edgytext_partial. You can even apply
different boosts.

Ahmet
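
To illustrate why the second, whitespace-tokenized field helps, here is a minimal Python sketch that imitates the two analysis chains. It is a simplified model, not Solr code: the helper names (ngrams, matches_keyword, matches_words) are made up for the example, and it assumes the 1 to 15 character n-gram range from the posted fieldType.

```python
def ngrams(text, min_size=1, max_size=15):
    """All substrings of text between min_size and max_size characters."""
    grams = set()
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


title = "Apple iPhone 5s - 16GB - Gold".lower()

# Keyword tokenizer: the whole title is a single token before n-gramming.
keyword_grams = ngrams(title)

# Whitespace tokenizer: every word is n-grammed on its own.
word_grams = set()
for word in title.split():
    word_grams |= ngrams(word)


def matches_keyword(query):
    # The query is also a single lowercased token, so it has to be
    # one contiguous piece of the title.
    return query.lower() in keyword_grams


def matches_words(query):
    # Each query word only has to match somewhere in the title.
    return all(term in word_grams for term in query.lower().split())


print(matches_keyword("iphone 5s"))    # contiguous in the title
print(matches_keyword("iphone gold"))  # not contiguous
print(matches_words("iphone gold"))    # both words occur
```

In this model "iphone gold" fails against the keyword-tokenized grams because it never appears as one contiguous string, while the per-word grams match both terms independently.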

 





Re: Best practice: Autosuggest/autocomplete vs. real search

2014-11-10 Thread Michael Sokolov
The goal is to ensure that suggestions from autocomplete are actually 
terms in the main index, so that the suggestions will actually result in 
matches.  You've considered expanding the main index by adding the 
suggestion n-grams to it, but it would probably be better to alter your 
suggester so that it produces only tokens that are in the main index.  I 
think this is basically how all the Suggester implementations are 
designed to work already; are you using one of those, or are you using 
the TermsComponent, or something else?


-Mike





Re: Best practice: Autosuggest/autocomplete vs. real search

2014-11-10 Thread Jorge Luis Betancourt Gonzalez
Wouldn't it be easier if the site ensured that only complete terms are
submitted to the actual search? In an app I worked on some time ago, the
default behavior of the JavaScript component used for autocompletion was to
first autocomplete the term in the input and then submit the query to the
backend. I know this is not what you asked for, but could it work? I'm just
taking a shot in the dark here! :-)

 



Re: Best practice: Autosuggest/autocomplete vs. real search

2014-11-10 Thread Thomas Michael Engelke
 The dedicated autosuggest field is not used by a suggester component,
instead we just directly query it (/select). I'm trying to read my way
into how the suggesters work, and toying around with some configurations
(For instance from here:
http://www.andornot.com/blog/post/Advanced-autocomplete-with-Solr-Ngrams-and-Twitters-typeaheadjs.aspx).

Compared to how you can analyze search results through the Solr backend,
the analysis of suggester results seems to be sorely lacking.

 

Best practice: Autosuggest/autocomplete vs. real search

2014-11-09 Thread Thomas Michael Engelke
 

 We're using Solr as a backend for an ECommerce site/system. The Solr
index stores products with selected attributes, as well as a dedicated
field for autocomplete suggestions (Done via AJAX request when typing in
the search box without pressing return).

The autosuggest field is supplied by copyField directives from certain
select product attribute fields (description and/or name mostly). It
uses EdgeNGramFilterFactory to complete words not yet typed completely,
and it works quite well.

However, we've come across a disconnect between the
autosuggest results and the results of a normal search, that is, a query
over the full fields of the product. Let's say there are products that
are called "motor".

- When autosuggesting, typing "mot" autosuggests all products with
"motor", because the EdgeNGram filter created "m", "mo", "mot", "moto" and
"motor", respectively, and it matches.
- When searching for "mot", however (i.e. pressing enter when seeing the
autosuggestions), it doesn't find any products. The autosuggest field is
not part of the real search, and no product attribute contains "mot"
as a word.

One obvious solution would be to incorporate the autosuggest field
into the real search, however, this adds many tokens to the index that
aren't really part of the products indexed and makes for strange search
results, for example when an NGram is also a word, but the record itself
does contain the search term only as part of a word.

Are there clever solutions to this problem? 
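
The disconnect described in this post can be reproduced with a few lines of Python. This is a simplified model, not Solr's analysis chain: edge_ngrams is a made-up helper, the minimum gram size of 1 follows the "m", "mo", "mot" example in the post, and the two token sets stand in for the autosuggest field and the regular search fields.

```python
def edge_ngrams(word, min_size=1, max_size=15):
    """Return the front-anchored n-grams of a word, shortest first."""
    longest = min(max_size, len(word))
    return [word[:size] for size in range(min_size, longest + 1)]


product_names = ["motor"]

# Tokens in the dedicated autosuggest field (edge n-grams of every word).
autosuggest_tokens = {
    gram for name in product_names for gram in edge_ngrams(name)
}
# Tokens in the regular search fields (whole words only).
search_tokens = set(product_names)

print(edge_ngrams("motor"))         # ['m', 'mo', 'mot', 'moto', 'motor']
print("mot" in autosuggest_tokens)  # True: the suggestion appears
print("mot" in search_tokens)       # False: the real search finds nothing
```

The prefix "mot" exists only as a generated gram, so any query that bypasses the autosuggest field has nothing to match against.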

Autosuggest using EdgeNGrams with strange highlighting

2014-11-07 Thread Thomas Michael Engelke
We've moved from an asterisk based autosuggest functionality 
(searchterm*) to a version using a special field called autosuggest, 
filled via copyField directives. The field definition:


<fieldType name="autosuggest" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="dictionary.txt" minWordSize="5" minSubwordSize="3"
            maxSubwordSize="30" onlyLongestMatch="false"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"
            protected="protwords.txt"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"
            side="front"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"
            ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="dictionary.txt" minWordSize="5" minSubwordSize="3"
            maxSubwordSize="30" onlyLongestMatch="false"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

It works like a charm. Now, we've had highlighting from Solr before, 
using these parameters:


hl=true&hl.simple.pre=<span+class%3Dhighlight>&hl.snippets=1&hl.simple.post=</span>&spellcheck=true&hl.fl=description

Now, we've seen something strange. This is just an example, the problem 
is with more than this record. In this example, the autosuggest field 
contains:


2CV4 Spot, Dekorsatz, für 2CV.

However, the highlighting branch for this autosuggest field in the 
record looks like this:


<lst name="highlighting">
  <lst name="34725">
    <arr name="short_description">
      <str>2CV4 Spot, Dekorsatz, für <em>2CV</em>.</str>
    </arr>
  </lst>
  ...

Although the EdgeNGramFilterFactory generated the NGrams so that 2CV4
-> 2, 2C, 2CV, 2CV4, the term is not highlighted. Shouldn't it be?
It's not a question of the number of highlights: records containing
multiple occurrences of 2CV get highlighted multiple times with no
problems.


It seems that words only containing parts of the search term which match 
the EdgeNGrams are not highlighted. As we're using highlighting from 
Solr exclusively, this leads to records being found, but having no 
highlight at all.


Is there a way to prevent some keywords from being added to autosuggest dictionary?

2014-10-17 Thread bbarani
We index around 10k documents in SOLR and use inbuilt suggest functionality
for auto complete.

We have a field that contains a flag that is used to show or hide the
documents from search results.

I am trying to figure out a way to control the terms added to autosuggest
index (to skip the documents from getting added to auto suggest index) based
on the value of the flag. Is there a way to do that?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-prevent-some-keywords-from-being-added-to-autosuggest-dictionary-tp4164699.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is there a way to prevent some keywords from being added to autosuggest dictionary?

2014-10-17 Thread Garth Grimm
Which field(s) auto suggest uses is configurable.  So you could create special
fields (and associated ‘copyField’ configs) to populate specific fields for
auto suggest.

For example, you could have two fields, “hidden_desc” and “visible_desc”.
Use copyField to copy both of them to a field named “description”.  Then set
auto suggest to use only the “visible_desc” field to drive suggestions.

That might be one viable option.

Regards,
Garth
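
The routing idea above could be applied on the client side before documents are sent to Solr. The following Python sketch is only an illustration under assumptions: the input field names "visible" and "desc" are hypothetical, route_description is a made-up helper, and only "visible_desc" and "hidden_desc" come from the suggestion itself.

```python
def route_description(doc, flag_field="visible", text_field="desc"):
    """Return a copy of doc with its text stored in visible_desc or hidden_desc.

    flag_field and text_field are hypothetical input field names.
    """
    routed = {key: value for key, value in doc.items() if key != text_field}
    target = "visible_desc" if doc.get(flag_field) else "hidden_desc"
    routed[target] = doc.get(text_field, "")
    return routed


docs = [
    {"id": 1, "visible": True, "desc": "red motor"},
    {"id": 2, "visible": False, "desc": "discontinued motor"},
]
for doc in docs:
    print(route_description(doc))
```

With the text split this way, a suggester that reads only visible_desc never sees terms from hidden documents, while a copyField into "description" keeps both searchable.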




Filtering autosuggest results in Solr

2014-08-15 Thread Chakravarthy Yeleswarapu -X (cyeleswa - ASQUARE INC at Cisco)
Hi,

We have the following use case:

Filter autosuggest results of solr_field1 based on solr_field2 values. The
solr_field2 values are constants such as source1, source2 etc.
If the user types xyz for solr_field1, the suggestions returned can match
anywhere in the solr_field1 value, such as abcxyz, xyzabc, abcxyzdef, acXyZc
etc. But the returned suggestions should be filtered by solr_field2 value.
Something like q=solr_field1:xyz&fq=solr_field2:source1 OR solr_field2:source2
=> abcxyz, xyzabc, abcxyzdef

We have tried the /terms component, which returns case-insensitive
suggestions, but it has no filtering capability. Faceting queries work only
with prefix values.

Can anyone please suggest alternative approaches?

Thanks
Chakra
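
The expected behaviour of this use case can be written down as a small Python reference model, which may help when checking whichever Solr approach is chosen. It is a sketch only: the suggest function and the sample records (including which source each value belongs to) are assumptions made for the example.

```python
def suggest(records, typed, allowed_sources):
    """Case-insensitive infix match on the value, filtered by source."""
    needle = typed.lower()
    return [
        value
        for value, source in records
        if needle in value.lower() and source in allowed_sources
    ]


# (solr_field1 value, solr_field2 value); the source assignments are made up.
records = [
    ("abcxyz", "source1"),
    ("xyzabc", "source2"),
    ("abcxyzdef", "source1"),
    ("acXyZc", "source3"),  # matches the text but not the source filter
    ("nomatch", "source1"),
]
print(suggest(records, "xyz", {"source1", "source2"}))
# ['abcxyz', 'xyzabc', 'abcxyzdef']
```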



Autosuggest with spelling correction

2014-08-13 Thread Harun Reşit Zafer

Hi everyone,

Currently I'm using AnalyzingInfixLookupFactory with a suggestions file 
containing up to 3 word phrases. However this component can't keep 
suggesting in case of spelling errors. I heard about FuzzySuggester and 
found some sample configurations here 
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/solrconfig-phrasesuggest.xml. 
But I  couldn't make any of them work. I got the same error: 
...solr-4.9.0\example\solr\collection1\data\fuzzy_suggest_analyzing\fwfsta.bin 
(The system cannot find the file specified).


In short, is there a Suggester component that supports both infix lookup
and fuzzy suggest, and where can I find a proper sample configuration?


Thanks

--
Harun Reşit Zafer
TÜBİTAK BİLGEM BTE
Bulut Bilişim ve Büyük Veri Analiz Sistemleri Bölümü
T +90 262 675 3268
W  http://www.hrzafer.com



Re: Autosuggest with spelling correction

2014-08-13 Thread Gopal Patwa
This JIRA issue has some documentation; maybe it will help you.

https://issues.apache.org/jira/browse/SOLR-5683







Re: Extend the Solr Terms Component to implement a customized Autosuggest

2014-08-01 Thread Erick Erickson
Ummm, 400k documents is _tiny_ by Solr/Lucene standards. I've seen 150M
docs fit in 16G on Solr. I put 11M docs on my laptop.

So I would _strongly_ advise that you don't worry about space at all as a
first approach and freely copy as many fields as you need to support your
use-case. Only after you've proved that this is untenable would I recommend
you develop custom code. You'll be in production much faster that way ;)

Of course this is irrelevant if each doc is War and Peace, but

Best,
Erick






Extend the Solr Terms Component to implement a customized Autosuggest

2014-07-31 Thread Juan Pablo Albuja
Good afternoon guys, I would really appreciate it if someone in the community
could help me with the following issue:

I need to implement a Solr autosuggest that supports:

1.   Autosuggestions over multivalued fields

2.   Case insensitivity

3.   Matching content in the middle of a value: for example, I have the value
"Hello World" indexed, and I need to get that value when the user types "wor"

4.   Filtering by an additional field

I was using the terms component because it satisfies 1 to 3, but point 4 is
not possible with it. I also looked at faceted searches and NGrams/EdgeNGrams,
but the problem with those approaches is that I need to copy fields in order
to tokenize them or apply grams to them, and I don't want to do that because
I have more than 6 fields that need autosuggest; my index is big (more than
400k documents) and I don't want to increase its size.
I was trying to extend the terms component in order to add an additional
filter, but it uses TermsEnum, which is a vector over a specific field, and I
couldn't figure out how to filter it in a really efficient way.
Do you guys have an idea how I can satisfy my requirements in an efficient
way? A solution that doesn't use the terms component would also be great.

Thanks




Juan Pablo Albuja
Senior Developer




Sorting is not correct in autosuggest

2014-04-30 Thread neha sinha
Hi All

On my auto suggest page, the sorting of the suggestions I am getting is not
correct.
However, the suggestions themselves are all correct.





Any guidance will be helpful



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-is-not-correct-in-autosuggest-tp4133859.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Sorting is not correct in autosuggest

2014-04-30 Thread Erick Erickson
Please review:

http://wiki.apache.org/solr/UsingMailingLists

You've given us  virtually no information here.

Best,
Erick



AutoSuggest like Google in Solr using Solarium Client.

2014-03-17 Thread Sohan Kalsariya
Can anyone suggest best practices for doing SpellCheck and
AutoSuggest in Solarium?
Can anyone give me an example?


-- 
Regards,
*Sohan Kalsariya*


RE: AutoSuggest like Google in Solr using Solarium Client.

2014-03-17 Thread Suresh Soundararajan
Hi Sohan,

The best approach for autosuggest is to use a facet query.

Please refer to this link:
http://solr.pl/en/2010/10/18/solr-and-autocomplete-part-1/


Thanks,
SureshKumar.S


From: Sohan Kalsariya sohankalsar...@gmail.com
Sent: Monday, March 17, 2014 8:14 PM
To: solr-user@lucene.apache.org
Subject: AutoSuggest like Google in Solr using Solarium Client.

Can anyone suggest me the best practices how to do SpellCheck and
AutoSuggest in solarium.
Can anyone give me example for that?


--
Regards,
*Sohan Kalsariya*


Re: AutoSuggest like Google in Solr using Solarium Client.

2014-03-17 Thread Michael McCandless
I think it's best to use one of the many autosuggesters Lucene/Solr provide?

E.g. AnalyzingInfixSuggester is running here:
http://jirasearch.mikemccandless.com

But that's just one suggester... there are many more.

Mike McCandless

http://blog.mikemccandless.com


On Mon, Mar 17, 2014 at 10:44 AM, Sohan Kalsariya
sohankalsar...@gmail.com wrote:
 Can anyone suggest me the best practices how to do SpellCheck and
 AutoSuggest in solarium.
 Can anyone give me example for that?


 --
 Regards,
 *Sohan Kalsariya*


Re: AutoSuggest like Google in Solr using Solarium Client.

2014-03-17 Thread bbi123
Not sure if you have already seen this one..

http://www.solarium-project.org/2012/01/suggester-query-support/

You can also use an edge n-gram filter to implement typeahead autosuggest.
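
To make the edge n-gram idea concrete, here is a minimal sketch in plain Python of what such a filter produces at index time and how a typed prefix is then matched. It is not Solr or Solarium code; the function names, the gram sizes and the sample terms are assumptions made up for illustration.

```python
def edge_ngrams(term, min_gram=1, max_gram=10):
    """Return the leading-edge n-grams of a term, shortest first."""
    upper = min(len(term), max_gram)
    return [term[:n] for n in range(min_gram, upper + 1)]


def build_prefix_index(terms, min_gram=1, max_gram=10):
    """Map every edge n-gram to the set of full terms it came from."""
    index = {}
    for term in terms:
        for gram in edge_ngrams(term.lower(), min_gram, max_gram):
            index.setdefault(gram, set()).add(term)
    return index


def suggest(index, user_input):
    """Look up the typed prefix as an exact key; no wildcard query needed."""
    return sorted(index.get(user_input.lower(), ()))


# Hypothetical sample terms, not taken from any real index.
index = build_prefix_index(["galaxy", "ipad", "iphone"])
print(edge_ngrams("ipad"))
print(suggest(index, "ip"))
```

The trade-off is the one raised elsewhere in this archive: every term is stored once per prefix, so the index grows in exchange for fast exact-match lookups.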



--
View this message in context: 
http://lucene.472066.n3.nabble.com/AutoSuggest-like-Google-in-Solr-using-Solarium-Client-tp4124821p4124871.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-03-14 Thread bbi123
I tried almost every possible combination, but in vain.

Does anyone know if there is any logic built into the suggester component to
ignore leading numbers?

/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.collate=false&q=34g

<response>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="g">
        <int name="numFound">1</int>
        <int name="startOffset">2</int>
        <int name="endOffset">3</int>
        <arr name="suggestion">
          <str>galaxy</str>
        </arr>
      </lst>
    </lst>
  </lst>
</response>


/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.collate=false&q=11123423432423243ip

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="ip">
        <int name="numFound">2</int>
        *<int name="startOffset">17</int>*
        <int name="endOffset">19</int>
        <arr name="suggestion">
          <str>ipad</str>
          <str>iphone</str>
        </arr>
      </lst>
    </lst>
  </lst>
</response>
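
One way to read the offsets in these two responses is that the query string is split into spellcheck tokens before lookup, and that a token is not allowed to begin on a digit. The thread does not confirm this; the sketch below is a guess at the mechanism, using a simplified regular expression of my own rather than Solr's actual pattern. If it is right, the spellcheck query converter (SpellingQueryConverter by default) would be the place to check.

```python
import re

# Hypothetical, simplified token pattern: a token may not START on a digit,
# but may contain digits after its first character. This is an assumption
# for illustration, not the pattern Solr ships.
TOKEN = re.compile(r"(?!\d)[A-Za-z0-9_\-]+")


def spellcheck_tokens(query):
    """Return (token, startOffset, endOffset) for each token in the query."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN.finditer(query)]


def collate(query, start, end, suggestion):
    """Rebuild the query with the token span replaced by the suggestion."""
    return query[:start] + suggestion + query[end:]


for q in ("34g", "11123423432423243ip"):
    print(q, spellcheck_tokens(q))

print(collate("34g", 2, 3, "galaxy"))
```

Under that assumption the leading digits are never looked up at all: only "g" or "ip" reaches the suggester, and collation pastes the suggestion back behind the untouched digits.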



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4123702.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-19 Thread Jason Hellman
Here’s a rather obvious question:  have you rebuilt your spell index recently?  
Is it possible the offending numbers snuck into the spell dictionary?  The 
terms component will show you what’s in your current, searchable field…but not 
the dictionary.

If my memory serves correctly, with collate=true this would allow for such 
behavior to occur, especially with onlyMorePopular set to false (which would 
ensure the resulting collation has a query count greater than the current 
query).  Have you flipped onlyMorePopular to true to confirm?




On Feb 18, 2014, at 10:16 AM, bbi123 bbar...@gmail.com wrote:

 Thanks a lot for your response Erik.
 
 I was trying to find if I have any suggestion starting with numbers using
 terms component but I couldn't find any.. Its very strange!!!
 
 Anyways, thanks again for your response.
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4118072.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-18 Thread bbi123
Thanks a lot for your response, Erick.

I was trying to find out whether I have any suggestions starting with numbers,
using the terms component, but I couldn't find any. It's very strange!

Anyway, thanks again for your response.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4118072.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-17 Thread Developer
Hi Erick,

Thanks a lot for your reply.

I expect it to return zero suggestions, since the suggested keyword doesn't
actually start with numbers.

Expected results:
Searching for ga - returns galaxy
Searching for gal - returns galaxy
Searching for 12321312321312ga - should not return any suggestion, since
no such keyword (combination) exists in the index.

Thanks




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4117846.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-17 Thread Erick Erickson
Ah, OK, I thought you were indexing things like 123412335ga, but not so.

Afraid I'm fresh out of ideas. Although I might try using TermsComponent
to examine the index and see if, somehow, there _are_ terms with leading
numbers in the output.

It's also possible that numbers are stripped when building the FST that
is used, but I don't know one way or the other.

Best,
Erick


On Mon, Feb 17, 2014 at 11:30 AM, Developer bbar...@gmail.com wrote:

 Hi Erik,

 Thanks a lot for your reply.

 I expect it to return zero suggestions since the suggested keyword doesnt
 actually start with numbers.

 Expected results
 Searching for ga - returns galaxy
 Searching for gal - returns galaxy
 Searching for 12321312321312ga - should not return any suggestion since
 there is no keyword (combination) exists in the index.

 Thanks




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4117846.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Solr Autosuggest - Strange issue with leading numbers in query

2014-02-11 Thread Developer
I have a strange issue with Autosuggest.

Whenever I query for a keyword with leading numbers, it returns the
suggestion corresponding to the letters (ignoring the numbers). I was
under the assumption that it would return an empty result. I am not sure
what I am doing wrong. Can someone help?

*Query:*
/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.maxCollations=10&q=12342343243242ga&spellcheck.count=10

*Result:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="ga">
        <int name="numFound">1</int>
        <int name="startOffset">15</int>
        <int name="endOffset">17</int>
        <arr name="suggestion">
          <str>galaxy</str>
        </arr>
      </lst>
      <str name="collation">12342343243242galaxy</str>
    </lst>
  </lst>
</response>


*My field configuration is as below:*
<fieldType class="solr.TextField" name="textSpell_word"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            enablePositionIncrements="true"
            ignoreCase="true" words="stopwords_autosuggest.txt"/>
  </analyzer>
</fieldType>

*SolrConfig.xml*

<searchComponent class="solr.SpellCheckComponent" name="autocomplete">
  <lst name="spellchecker">
    <str name="name">autocomplete</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">autocomplete_word</str>
    <str name="storeDir">autocomplete</str>
    <str name="buildOnCommit">true</str>
    <float name="threshold">.005</float>
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler"
                name="/autocomplete">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">autocomplete</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.onlyMorePopular">false</str>
  </lst>
  <arr name="components">
    <str>autocomplete</str>
  </arr>
</requestHandler>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-11 Thread Erick Erickson
Hmmm, the example you posted seems correct to me: the returned
suggestion is really close to the term. What are you expecting here?

The example is inconsistent with your statement that
"it returns the suggestion corresponding to the alphabets (ignoring the
numbers)".

It looks like it's considering the numbers just fine, which I think is what
makes the returned suggestion close to the term.

Best,
Erick





Autosuggest - Custom sorting

2013-10-01 Thread SolrLover
Is there a way to sort the returned Autosuggest list based on a particular
value (ex: score)?

I am trying to sort the returned suggestions based on a field that has been
calculated manually, but I'm not sure how to use that field for sorting
suggestions.
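
If the suggester itself cannot be made to order by that field, one fallback is to re-rank the returned list on the client side. The sketch below assumes the manually calculated value is available as a simple term-to-weight mapping; the function name, the mapping and the sample values are all hypothetical.

```python
def rank_suggestions(suggestions, weights, default_weight=0.0):
    """Order suggestions by descending weight, breaking ties alphabetically."""
    return sorted(
        suggestions,
        key=lambda term: (-weights.get(term, default_weight), term),
    )


# Hypothetical suggestions and manually calculated scores.
returned = ["ipad", "iphone", "ipod"]
scores = {"iphone": 9.5, "ipad": 7.0}
print(rank_suggestions(returned, scores))
```

This only reorders what the suggester already returned, so a high-scoring term that was cut off by the suggestion count will still be missing.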




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Autosuggest-Custom-sorting-tp4092980.html
Sent from the Solr - User mailing list archive at Nabble.com.


Autosuggest on very large index

2013-08-20 Thread Greg Preston
Using 4.4.0 -

I would like to be able to do an autosuggest query against one of the
fields in our index and have the results be limited by an fq.

I can get exactly the results I want with a facet query using a
facet.prefix, but the first query takes ~5 minutes to run on our QA
env (~240M docs).  I'm afraid to attempt it on prod (~2B docs).
Subsequent queries are sufficiently fast (~500ms).
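
For readers who have not used this approach, the request described above can be sketched as follows. The parameter names (facet.field, facet.prefix, facet.limit, facet.mincount, fq) are standard Solr faceting parameters; the field names "title" and "clientId" and the sample values are assumptions for illustration only.

```python
from urllib.parse import urlencode


def facet_prefix_params(field, prefix, filter_query, limit=10):
    """Build query parameters for a prefix-faceting autosuggest request."""
    return [
        ("q", "*:*"),              # match everything; fq does the restricting
        ("rows", "0"),             # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),  # what the user has typed so far
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),   # hide terms with no matching documents
        ("fq", filter_query),      # e.g. restrict to the logged-in client
        ("wt", "json"),
    ]


params = facet_prefix_params("title", "sol", "clientId:42")
query_string = urlencode(params)
print(query_string)
```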

I'm assuming the first query is uninverting the field.  Is there any
way to mark that field so that an uninverted copy is maintained as
updates come in?  We plan to soft commit every 5 minutes, and we'd
prefer to not be continuously uninverting this one field.

Or is there a better way to do what I'm trying to do?  I've looked at
the spellcheck component a little bit, but it looks like I can't
filter results by fq.  The fq I'm using is based on which client is
logged in, and we can't autosuggest terms from one client to another.

Thanks.

-Greg


RE: Autosuggest on very large index

2013-08-20 Thread Markus Jelsma
I am not entirely sure, but the Suggester's FST uses prefixes, so you may be
able to prepend the value you otherwise use for the filter query to each entry
when you build the suggester.
 


Re: Autosuggest on very large index

2013-08-20 Thread Greg Preston
The filter query would be on a different field (clientId) than the
field we want to autosuggest on (title).

Or are you proposing we index a compound field that would be
clientId+titleTokens so we would then prefix the suggester with
clientId+userInput ?

Interesting idea.
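
A rough sketch of that compound-key idea, in plain Python rather than Solr: each title token is stored with the client id in front of it, and a lookup prepends the same client id to the user's input. A sorted list with binary search stands in for the FST here; the separator character, the function names and the sample documents are all assumptions.

```python
import bisect

SEP = "\x1f"  # hypothetical separator that should never occur in a title


def make_key(client_id, text):
    """Prefix a token (or user input) with the client id."""
    return f"{client_id}{SEP}{text.lower()}"


def build_entries(docs):
    """docs is an iterable of (client_id, title); index one key per token."""
    keys = set()
    for client_id, title in docs:
        for token in title.split():
            keys.add(make_key(client_id, token))
    return sorted(keys)


def suggest(entries, client_id, user_input, limit=10):
    """Return tokens for this client only that start with user_input."""
    prefix = make_key(client_id, user_input)
    i = bisect.bisect_left(entries, prefix)
    found = []
    while i < len(entries) and len(found) < limit:
        if not entries[i].startswith(prefix):
            break
        found.append(entries[i].split(SEP, 1)[1])
        i += 1
    return found


# Hypothetical documents for two clients.
entries = build_entries([
    ("c1", "solr search"),
    ("c1", "solar panel"),
    ("c2", "solr cloud"),
])
print(suggest(entries, "c1", "sol"))
print(suggest(entries, "c2", "sol"))
```

Because the client id is part of every key, one client's terms can never appear in another client's suggestions, which is the isolation the fq was providing.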

-Greg




Re: Autosuggest on very large index

2013-08-20 Thread Jack Krupansky
Sounds like a problem for DocValues - assuming the number of unique values 
fits reasonably in memory to avoid I/O.


How many unique values do you have or contemplate for your two billion
documents?


Two possibilities:

1. You need a lot more hardware.
2. You need to scale back your ambitions.

-- Jack Krupansky




Re: Autosuggest on very large index

2013-08-20 Thread Greg Preston
DocValues looks interesting, a non-inverted field.  I'll play with it
a bit and see how it works.  Thanks for the suggestion.

I don't know how many total terms we've got, but each document is
only 2-5 words/terms on average, and there is a TON of overlap between
docs.



-Greg




Re: AutoSuggest+Grouping in one request

2013-05-02 Thread Otis Gospodnetic
Hi,

Hm, I *think* you can't do it in one go with Solr's Suggester, but I'm
no expert there.  I can only point you to something like our
AutoComplete - http://sematext.com/products/autocomplete/index.html -
which, as you can see on that screenshot, has the grouping you seem to
be after.  Maybe somebody else can point out if Solr Suggester can do
the same?

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Fri, Apr 26, 2013 at 9:58 AM, Rounak Jain rouna...@gmail.com wrote:
 Hi everyone,

 Search dropdowns on popular sites like Amazon (example
 imagehttp://i.imgur.com/aQyM8WD.jpg)
 use autosuggested words along with grouping (Field Collapsing in Solr).

 While I can replicate the same functionality in Solr using two requests
 (first to obtain suggestions, second for the actual query using the most
 probable suggestion), I want to know if this can be done in one request
 itself.

 I understand that there are various ways to obtain suggestions (term
 component, facets, Solr's inbuilt
 Suggesterhttp://wiki.apache.org/solr/Suggester),
 and I'm open to using any one of them, if it means I'll be able to get
 everything (groups + suggestions) in one request.

 Looking forward to some advice with regard to this.

 Thanks,

 Rounak


AutoSuggest+Grouping in one request

2013-04-26 Thread Rounak Jain
Hi everyone,

Search dropdowns on popular sites like Amazon (example image:
http://i.imgur.com/aQyM8WD.jpg)
use autosuggested words along with grouping (Field Collapsing in Solr).

While I can replicate the same functionality in Solr using two requests
(first to obtain suggestions, second for the actual query using the most
probable suggestion), I want to know if this can be done in one request
itself.
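
The two-request flow described above can be sketched like this. The group, group.field and group.limit parameters are standard Solr result-grouping parameters; the handler behaviour, the "category" field and the sample suggestions are assumptions, and parsing of the real suggestion response is left out.

```python
def suggest_params(user_input, count=5):
    """Parameters for the first request: fetch suggestions only."""
    return {
        "q": user_input,
        "spellcheck": "true",
        "spellcheck.count": str(count),
        "wt": "json",
    }


def pick_top(suggestions, fallback):
    """Use the most probable suggestion, or the raw input if there is none."""
    return suggestions[0] if suggestions else fallback


def grouped_query_params(term, group_field, rows=10):
    """Parameters for the second request: grouped results for one term."""
    return {
        "q": term,
        "group": "true",
        "group.field": group_field,
        "group.limit": "3",
        "rows": str(rows),
        "wt": "json",
    }


# Hypothetical suggestions, as if parsed from the first response.
suggestions = ["iphone", "ipad"]
top = pick_top(suggestions, fallback="ip")
second_request = grouped_query_params(top, "category")
print(second_request)
```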

I understand that there are various ways to obtain suggestions (term
component, facets, Solr's inbuilt Suggester:
http://wiki.apache.org/solr/Suggester),
and I'm open to using any one of them, if it means I'll be able to get
everything (groups + suggestions) in one request.

Looking forward to some advice with regard to this.

Thanks,

Rounak


Re: Issue with spellcheck and autosuggest

2013-02-03 Thread Dixline
But if I use my own system as the Solr server, it works fine. The problem
occurs only when I use another machine as the Solr server, even though both
machines have the same schema and solrconfig files.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Issue-with-spellcheck-and-autosuggest-tp4036208p4038287.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Issue with spellcheck and autosuggest

2013-01-30 Thread Artyom
You can of course check the suggestions, but then you should remove
  <str name="spellcheck.dictionary">wordbreak</str>
from your handler, because its purpose is to catch cases where the user types
spaces in the wrong places (e.g., solrrocks, sol rrocks, so lr).
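
To see why a word-break dictionary can produce output like "ye ll ow", here is a small sketch of the general idea: try every way of cutting the term into pieces and keep the cuts where each piece exists in the index. This is not the WordBreakSolrSpellChecker implementation; the function, the vocabulary and the limit are assumptions. It also shows why stray one- and two-character terms in the index make things worse.

```python
def word_breaks(term, vocabulary, max_breaks=10):
    """Return every split of term whose pieces are all in vocabulary."""
    results = []

    def split(rest, parts):
        if not rest:
            if len(parts) > 1:          # a single piece is not a break
                results.append(" ".join(parts))
            return
        if len(parts) > max_breaks:     # one more piece would exceed the limit
            return
        for i in range(1, len(rest) + 1):
            head = rest[:i]
            if head in vocabulary:
                split(rest[i:], parts + [head])

    split(term, [])
    return results


# Hypothetical index contents: the short terms might be n-gram leftovers.
noisy_vocabulary = {"yellow", "ye", "ll", "ow", "y", "e"}
clean_vocabulary = {"yellow"}
print(word_breaks("yellow", noisy_vocabulary))
print(word_breaks("yellow", clean_vocabulary))
```

With the noisy vocabulary the sketch yields the same kind of broken-up candidates reported in this thread; with the clean one it yields nothing.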



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Issue-with-spellcheck-and-autosuggest-tp4036208p4037631.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Issue with spellcheck and autosuggest

2013-01-29 Thread Artyom
You should check the collations, not the suggestions, in the response XML.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Issue-with-spellcheck-and-autosuggest-tp4036208p4036977.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Issue with spellcheck and autosuggest

2013-01-26 Thread Jack Krupansky
Is there any chance that you were experimenting with an ngram filter for the 
field? If you were, and merely changed the field type without reindexing, 
this behavior makes sense. In other words, you appear to have had some 
filter that broke words into one and two-character terms.


Separate from that, the analyzer for a spellchecker should be very simple
and preserve the structure of the term rather than decompose it, as
WordDelimiterFilter does. So, be sure to use an analyzer that is very
simple, such as StandardTokenizer and a lower case filter, but nothing else.
In general, use a separate field, like textSpell, that has the simple
analyzer, and do a copyField from the original text field, which can still
have a richer analyzer.


-- Jack Krupansky


Issue with spellcheck and autosuggest

2013-01-25 Thread Dixline
Hi,

this is my spellcheck/autosuggest dictionary field and field type,

<field name="searchText" type="spelltext" indexed="true" stored="true"
       multiValued="true" default="JulyMSO" />

<fieldType name="spelltext" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
</fieldType>

And this is my solrconfig.xml,

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">spelltext</str>

  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">searchText</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="buildOnOptimize">true</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.1</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <float name="thresholdTokenFrequency">.01</float>
  </lst>

  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">searchText</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <str name="buildOnOptimize">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">searchText</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">6</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.alternativeTermCount">5</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">false</str>
    <str name="spellcheck.maxCollationTries">3</str>
    <str name="spellcheck.maxCollations">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <str name="queryAnalyzerFieldType">spelltext</str>
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">searchText</str>
    <str name="buildOnOptimize">true</str>
    <float name="accuracy">0.1</float>
    <float name="threshold">0.005</float>
  </lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler"
                name="/suggest">
  <lst name="defaults">
    <str name="df">searchText</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">6</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">false</str>
    <str name="spellcheck.maxCollationTries">3</str>
    <str name="spellcheck.maxCollations">1</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>


When I try spellcheck, I'm not getting proper suggestions. For example, there's
a word "yellow" in my Solr document. If I search for "yello" I get the
suggestions "yellow", "ye ll wo", "y e ll ow" and "ye ll ow". Why are the
broken-up variants coming back?

And when I try autosuggest, I'm not getting any suggestions for any query.

Can anyone help me with this?

Thanks in advance.

-Dixline.M







Solr Cloud Autosuggest not working

2013-01-08 Thread Jay Parashar
I recently migrated to Solr Cloud (4.0.0, from 3.6.0) and my auto suggest
feature does not seem to be working. It is a typical implementation with a
/suggest SearchHandler defined in the config.
Are there any changes I need to incorporate?

Regards
Jay



Re: Solr Cloud Autosuggest not working

2013-01-08 Thread Mark Miller
I think distrib with components has to be set up a little differently - you
might need to use shards.qt to point back to the same request handler for the
sub searches. Just a guess - it's been a while since I've looked at spellcheck
distrib support and I'm not 100% positive the suggest stuff is all distrib
capable - though I think it should be.

- Mark
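
A minimal sketch of what the shards.qt suggestion could look like from the client side. The host, port and collection name below are placeholders (assumptions, not taken from the thread), and this only builds the request URL; whether the suggester is fully distrib capable in 4.0.0 is left open, as Mark says.

```python
from urllib.parse import urlencode


def suggest_url(base_url, prefix, handler="/suggest"):
    """Build a suggest request whose per-shard sub-requests use the same handler."""
    params = {
        "q": prefix,
        # Without shards.qt the sub-requests go to the default handler,
        # which does not have the suggest component configured.
        "shards.qt": handler,
        "wt": "json",
    }
    return f"{base_url}{handler}?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
url = suggest_url("http://localhost:8983/solr/collection1", "sol")
print(url)
```

The same effect could also be had by putting shards.qt into the defaults section of the /suggest handler, so clients do not need to send it.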

On Jan 8, 2013, at 10:06 AM, Jay Parashar jparas...@itscape.com wrote:

> I recently migrated to Solr Cloud (4.0.0 from 3.6.0) and my auto suggest
> feature does not seem to be working. It is a typical implementation with a
> /suggest searchHandler defined on the config.
> Are there any changes I need to incorporate?
>
> Regards
> Jay



RE: Solr Cloud Autosuggest not working

2013-01-08 Thread Jay Parashar
Thanks Mark!




solr -autosuggest

2012-10-25 Thread Sujatha Arun
Hi,

A  few question on Solr Auto suggest below

Q1) I tried using the index-based suggest functionality with Solr 3.6.1. Can
I combine this with file-based boosting? Currently, when I specify both the
index field and the sourceLocation, the file in the source location is not
considered.
Is there any way both can be used?

Q2) I saw this line where it says: "Currently implemented Lookups keep their
data in memory, so unlike spellchecker data, this data is discarded on core
reload and not available until you invoke the build command, either
explicitly or implicitly during a commit." I am using the WFST lookup with
index-based suggestions. I suppose that this applies only to file-based
suggestions? Is this correct?


Q3) "If spellcheck.onlyMorePopular=true is selected: weights are treated as
popularity score." Does this mean that this is based on the frequency of words,
or is it based on ranking [tf * idf, etc.]?



Regards,
Sujatha


Re: Custom Geocoder with Solr and Autosuggest

2012-08-16 Thread Alexey Serba
> My first decision was to divide SOLR into two cores, since I am already
> using SOLR as my search server. One core would be for the main search of the
> site and one for the geocoding.
Correct. And you can even use that location index/collection to extract
locations from unstructured documents - i.e. if you don't have a separate
field with geographical names in your corpus (or the location data is just
not good enough compared to what can be mined from the documents).

> My second decision is to store the name data in a normalised state, some
> examples are shown below:
> London, England
> England
> Swindon, Wiltshire, England
Yes, you can add postcodes/outcodes there also. And I would add an
additional field, type (region/county/town/postcode/outcode).

> The third decision was to return “autosuggest” results, for example when the
> user types “Lond” I would like to suggest “London, England”. For this to
> work I think it makes sense to return up to 5 results via JSON based on
> relevancy and have these displayed under the search box.
Yeah, you might want to boost cities more than towns (I'm sure there
are plenty of ambiguous terms), use some kind of geoip service, and add
additional scoring factors.

> My fourth decision is that when the user actually hits the “search” button
> on the location field, SOLR is again queried and returns the most relevant
> result, including the co-ordinates which are stored.
You can also have special logic to decide whether you want to use spatial
search or whether a simple textual match would be better. I.e. you have
England in your example. It doesn't sound practical to return
coordinates and use spatial search for this use case, right?

HTH,
Alexey


Custom Geocoder with Solr and Autosuggest

2012-08-13 Thread Spadez
Hi,

I want to create a very simple geocoder for returning co-ordinates of a
place if a user enters in a town or city. There seems to be very little
information about doing it the way I suggest, so I hope I am on a good path.

My first decision was to divide SOLR into two cores, since I am already
using SOLR as my search server. One core would be for the main search of the
site and one for the geocoding.

My second decision is to store the name data in a normalised state, some
examples are shown below:
London, England
England
Swindon, Wiltshire, England

The third decision was to return “autosuggest” results, for example when the
user types “Lond” I would like to suggest “London, England”. For this to
work I think it makes sense to return up to 5 results via JSON based on
relevancy and have these displayed under the search box.

My fourth decision is that when the user actually hits the “search” button
on the location field, SOLR is queried again and returns the most relevant
result, including the co-ordinates which are stored.

Am I on a good path here? 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Custom-Geocoder-with-Solr-and-Autosuggest-tp4000791.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Autosuggest

2012-06-20 Thread Shri Kanish
Hi,
I have a question regarding Solr autosuggest. (If this is not the correct place
to post, please suggest one.)

I have implemented Solr autosuggest with the Suggester component. I have read
in a blog: "Currently implemented Lookups keep their data in memory, so
unlike spellchecker data, this data is discarded on core reload and not
available until you invoke the build command, either explicitly or implicitly
during a commit."

I have a master-slave setup. If I add new documents to the master and commit,
then the suggester is built (as I have set buildOnCommit=true). But when
replication is done, the slave reloads the core. At that point, will it
affect autosuggestion for the newly added docs?
 
Thanks,
Shri

WFST with autosuggest/geo

2012-05-22 Thread William Bell
Does anyone have the slides or sample code from:

Building Query Auto-Completion Systems with Lucene 4.0
Presented by Sudarshan Gaikaiwari, Software Engineer,Yelp

We want to implement WFST with GEO boosting.


-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Problems with AutoSuggest feature(Terms Components)

2011-11-23 Thread Erick Erickson
I'll have to defer that to one of the sharding experts.

Best
Erick




Re: Problems with AutoSuggest feature(Terms Components)

2011-11-22 Thread mechravi25
Hi Erick,

Thanks for your reply. I would like to know all the options that can be given
under the defaults section and how they can be overridden. Is there any
documentation available in the Solr forum? We tried searching but weren't
able to find any.

My exact scenario is that I have one master core which has many underlying
shard cores (distributed architecture). I want terms.limit to default to 10
in the underlying shard cores. When I hit the master core,
it will in turn hit the underlying shard cores. At that point, the
terms.limit which has been passed to the master core has to be passed to these
underlying shard cores, overriding the default value set. Can you please
suggest the definition of the terms component for the underlying shard
cores?

Regards,
Sivaganesh
 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-with-AutoSuggest-feature-Terms-Components-tp3512734p3528597.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problems with AutoSuggest feature(Terms Components)

2011-11-17 Thread Erick Erickson
TermsComponent only reacts to what you send it. How are these requests
getting to the TermsComponent?  That's where you should look.

As far as terms.limit, your requesthandler for TermsComponent in
solrconfig.xml has a defaults section and you can set whatever
you want in there and then override it as you choose if you
sometimes want other values in there.

Best
Erick
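
The defaults-then-override behaviour described above can be sketched in a few lines. This is only a model of the precedence rule (request parameters win over the handler's defaults section), not Solr code; the function name and the sample values are made up for illustration.

```python
def effective_params(handler_defaults, request_params):
    """Model how a handler's defaults combine with per-request parameters."""
    merged = dict(handler_defaults)   # start from the defaults section
    merged.update(request_params)     # anything sent on the request overrides it
    return merged


# Hypothetical defaults for a /terms handler.
terms_defaults = {"terms": "true", "terms.limit": "10"}

# No terms.limit on the request: the default of 10 applies.
print(effective_params(terms_defaults, {"terms.fl": "nameFacet"}))

# terms.limit sent explicitly: the request value wins.
print(effective_params(terms_defaults, {"terms.fl": "nameFacet", "terms.limit": "20"}))
```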




Problems with AutoSuggest feature(Terms Components)

2011-11-16 Thread mechravi25
Hi,

When I search for data I noticed two things.

1.) I noticed *terms.regex=.** in the logs, which does a blank search
on terms, and because of it the query time is high. Is there any way to
overcome this? My actual query should go like the first one in bold, but
instead it goes out like the second case (the 2nd text highlighted in bold).

2.) I also noticed *terms.limit=-1*, which is very expensive as it asks
Solr to return all the terms. It should be set to 10 or 20 at most.
Please provide some suggestions on how to set this.



Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/terms
params={*terms.regex=ABC\+CCC\+lll*\+data.*&terms.regex.flag=case_insensitive&terms.fl=nameFacet}
status=0 QTime=935
Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [core2] webapp=/solr path=/terms
params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=ABC\+CCC\+lll\+data.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1}
status=0 QTime=842
Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/terms
params={terms.regex=ABC\+CCC\+lll\+data.*&terms.regex.flag=case_insensitive&terms.fl=nameFacet}
status=0 QTime=927
Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [core3] webapp=/solr path=/terms
params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1}
status=0 QTime=115

Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr path=/terms
params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&*terms.regex=.**&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1}
status=0 QTime=106767
Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
INFO: [core4] webapp=/solr path=/terms
params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1}
status=0 QTime=106766
Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problems-with-AutoSuggest-feature-Terms-Components-tp3512734p3512734.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: autosuggest combination of data from documents and popular queries

2011-09-29 Thread abhayd
hi Hoss,
This helps.

The only thing I am not sure about is the use of TermsComponent. As I
understand it, TermsComponent allows sorting only on count|index, so I'm not
sure how popularity could be used for sorting or boosting.

Any thoughts on using TermsComponent with popularity? If this is
possible then I don't think I would even need ngrams at all.

Any suggestions?

abhay

--
View this message in context: 
http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3378874.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: autosuggest combination of data from documents and popular queries

2011-09-29 Thread abhayd
anyone?

How to sort for termscomponent?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3381201.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: autosuggest combination of data from documents and popular queries

2011-09-28 Thread Chris Hostetter

: If user starts typing m i wil show mango as suggestion. And other
: suggestions should come from the document title in index. So if I have a
: document in index with title Man .. so suggestions would be
: mango
: man
...
: Is this doable ? any options ?

It's totally doable, and you've already done the hard part by building up
a database of the popular queries you want to seed the suggestions with,
and building up a suggestion index where each document corresponds to a
single suggestion.

but in order to also have suggestions come from the fields of your
main index, you'll need to also add them as individual documents to that same
suggestion index.

you could either get those field values from whatever original source you
used, or you could crawl your own solr index.  If you want individual *terms*
from the index to be added as suggestions, then the LukeRequestHandler or
the TermsComponent would probably be the easiest way to extract them.

-Hoss
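
A rough sketch of the merge step described above, assuming the popular queries come with counts and the index terms have already been extracted (for example via TermsComponent). The field names "suggestion", "popularity" and "source" are hypothetical, not anything prescribed in the thread; sending the resulting documents to Solr is left out.

```python
def build_suggestion_docs(popular_queries, index_terms):
    """Combine query-log suggestions and index terms into one list of documents.

    popular_queries: dict mapping query string -> count
    index_terms: iterable of terms or titles taken from the main index
    """
    docs = {}
    # Index-derived suggestions get popularity 0 so they rank below real queries.
    for term in index_terms:
        key = term.strip().lower()
        if key:
            docs[key] = {"suggestion": key, "popularity": 0, "source": "index"}
    # Popular queries overwrite an index entry with the same text.
    for query, count in popular_queries.items():
        key = query.strip().lower()
        if key:
            docs[key] = {"suggestion": key, "popularity": count, "source": "query_log"}
    return sorted(docs.values(), key=lambda d: (-d["popularity"], d["suggestion"]))


docs = build_suggestion_docs({"mango": 100}, ["Man", "Mango", "Sample"])
for doc in docs:
    print(doc)
```

Storing the popularity as a field on each suggestion document is what would let the suggestion index be sorted or boosted by it at query time.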


Re: autosuggest combination of data from documents and popular queries

2011-09-28 Thread abhayd
hi hoss,
This helps.
But as I understand it, TermsComponent does not allow sorting on popularity,
just count|index. Or am I missing something?

If TermsComponent allowed custom sorting I wouldn't even have to use ngrams.

Any thoughts?

abhay



--
View this message in context: 
http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3378096.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: autosuggest combination of data from documents and popular queries

2011-09-23 Thread abhayd
hi

My requirement is this.
I have a list of popular search terms in a database:

searchterm | count
-----------+------
mango      | 100

Consider that I have only one term in that table, mango. I use edgengram and
put that in the auto_complete field in the solr index, along with its count.

If the user starts typing "m" I will show "mango" as a suggestion. The other
suggestions should come from the document titles in the index. So if I have a
document in the index with the title "Man", the suggestions would be
mango
man

Now say the user starts typing "sa". I don't have a popular search term for
that, so it should show suggestions from the index data.

Is this doable? Any options?
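
The behaviour asked for above can be sketched as a small lookup, assuming both the popular terms and the titles are available in memory. This is only an illustration of the intended ordering (popular terms first, then titles); in Solr the same effect would come from querying the edge n-gram field and sorting on the stored count. The function and argument names are made up.

```python
def suggest(prefix, popular_terms, titles, limit=5):
    """Return suggestions for a prefix: popular terms first, then index titles."""
    prefix = prefix.lower()
    # Popular search terms that match, most frequently searched first.
    popular = sorted(
        (t.lower() for t in popular_terms if t.lower().startswith(prefix)),
        key=lambda t: -popular_terms.get(t, 0),
    )
    # Matching titles that are not already covered by a popular term.
    from_titles = sorted(
        {t.lower() for t in titles if t.lower().startswith(prefix)} - set(popular)
    )
    return (popular + from_titles)[:limit]


popular_terms = {"mango": 100}
titles = ["Man", "Sample"]
print(suggest("m", popular_terms, titles))
print(suggest("sa", popular_terms, titles))
```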

--
View this message in context: 
http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3362049.html
Sent from the Solr - User mailing list archive at Nabble.com.


Autosuggest best practice / feedback

2011-09-22 Thread Doug McKenzie

Hi there,

I'm relatively new to Solr and have been playing around with it for a
few weeks now. I've got a system set up that I'm currently quite
happy with and that is returning some decent results (although there's always
room for improvement). Just hoping to get some feedback on the setup.


Currently running 2 separate Solr engines, one tasked with storing
products and their various info, the other storing previous site
searches and being used for auto suggest functionality.


The auto suggest schema :

<fieldType name="text_ngram" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_en.txt" enablePositionIncrement="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
            maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
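
To make the index-time behaviour of this field type concrete, here is a small sketch that mimics the chain for a single stored search: the whole string is kept as one token (KeywordTokenizer), lowercased, and then expanded into front edge n-grams of 2 to 15 characters. It is an approximation written for illustration, not Solr's implementation, and it leaves the stop filter out.

```python
def edge_ngrams(text, min_gram=2, max_gram=15):
    """Approximate keyword tokenizer + lowercase + front EdgeNGram for one value."""
    token = text.lower()  # the whole input stays a single token
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


print(edge_ngrams("Lava Lamp"))
```

One thing this makes visible: because the keyword tokenizer keeps the whole search as a single token, a stop filter placed after it can only drop a search that consists of exactly one stopword entry; it will not catch a rude word inside a longer search.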

Stopwords are being used to filter out rude words from previous searches
(is this the best way of doing things?)


Also looking at implementing a "Did you mean?" suggester which will
probably search against a whitespace-tokenized field of the same data
rather than this one.


Any thoughts / feedback / comments / criticism / biscuits appreciated

Cheers
Doug



autosuggest combination of data from documents and popular queries

2011-09-22 Thread abhayd
hi
we already have autosuggest working using solr, based on popular search
terms.
we use the following approach:
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

Now we also want to use the data indexed in solr for autosuggest, with popular
search terms having higher priority.

Can we just copy the field containing the doc text to an autosuggest field
which does edgengram analysis?
Also, we have around 100K docs in the index, so would performance be a
concern?

Any help is really appreciated 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3360657.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: autosuggest combination of data from documents and popular queries

2011-09-22 Thread Otis Gospodnetic
Hello,

> hi
> we already have autosuggest working using solr based on popular search
> terms.

Just terms or whole queries?  I assume the latter.

> we use following approach..
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> Now we want to use data indexed in solr also for autosuggest. with popular
> search terms to have higher priority.
>
> can we just copy field containing doc text to a auto suggest filed which
> does edgengram analysis?

Something doesn't feel right here.  Using data from the index for suggestions
makes sense - we do that on http://search-lucene.com/ for example.
Popular search terms having high priority and doc text, how does that work?
Oh, you mean if you have a doc with field "body" whose value is "foo bar baz"
then, assuming the term "bar" is one of those popular search terms, you would
want "bar" to come up as a suggestion?

That's doable with some coding, yes, but I don't think this would create a very
good search experience.

Here are some thoughts:
* instead of suggesting popular query terms, suggest popular query strings
* suggest phrases such as query strings, titles from a title field if you have
it, author names from an author name field if you have it, and other fields of
that nature
* ...

> also we have around 100 K docs in index so performance would be be a
> concern?


I think that depends on the implementation.  For example, suggestions you see 
on search-lucene.com are powered 
by http://sematext.com/products/autocomplete/index.html and that solution works 
well with millions of suggestions.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


How can I create a good autosuggest list with phrases?

2011-08-04 Thread Shawn Heisey
I'm at the point in my Solr deployment where I want to start using it 
for autosuggest, but I've run into a snag.  Because the fields that I 
want to use for autosuggest are tokenized, I can only get single terms 
out of it.  I would like to have it find common phrases that are between
two and five words long, so that if someone starts typing "ang" their
autosuggest list will include "Angelina Jolie" as well as possibly "Brad
Pitt and Angelina Jolie".


My index is already quite large, so I do not want to add shingles.  I 
tried to use the clustering component, but that will only give you 
halfway decent results if you make the rows= parameter absolutely huge 
and therefore things run very slowly.  Also, it only works against 
stored fields, so I can only run it against the field where we retrieve 
captions, not the full description.  It's impractical to get results 
based on an entire index, much less all seven shards.


I'm OK with offline analysis to generate a list of suggestions, and I'm 
also OK with doing that analysis against the MySQL data source rather 
than Solr.  I just need some pointers about what software and/or 
techniques I can use to generate a good list, and then some idea of how 
to configure Solr to use that list.  Can anyone help?


Thanks,
Shawn
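
One naive way to do the offline analysis asked about above is to count every run of two to five consecutive words in the caption text and keep the runs that occur often enough. The sketch below assumes the captions have already been pulled out of MySQL into a list of strings; the function name, the threshold and the sample captions are made up, and a real list would likely need stopword handling and pruning of phrases contained in longer ones.

```python
from collections import Counter


def frequent_phrases(texts, min_words=2, max_words=5, min_count=2):
    """Count word runs of min_words..max_words length and keep the common ones."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for n in range(min_words, max_words + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    # most_common() orders by count, highest first.
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]


# Made-up captions, for illustration only.
captions = [
    "Angelina Jolie arrives at the premiere",
    "Brad Pitt and Angelina Jolie",
    "Angelina Jolie signs autographs",
]
print(frequent_phrases(captions))
```

The resulting phrases could then be indexed as one document per phrase in a small separate suggestion core, which avoids adding shingles to the main index.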



Re: How can I create a good autosuggest list with phrases?

2011-08-04 Thread Sethi, Parampreet
We handled a similar requirement in our product kitchendaily.com by creating a
list of search terms which were frequently searched over a period of time
and then building the auto-suggestion index from this data. Updating it
regularly will allow you to support a well-formed auto-suggest feature. This
is a good and fast solution if you have application logs to start with and
not a very high volume of data.

Or you can query Solr with the user-entered data, which returns all the
matching results, boost on the field which will be used in the
AutoSuggest box, and use the top 5 items in the dynamic div.

Hope it Helps.

-param





Re: How can I create a good autosuggest list with phrases?

2011-08-04 Thread Shawn Heisey

On 8/4/2011 10:04 AM, Sethi, Parampreet wrote:

> We handled similar requirement in our product kitchendaily.com by creating a
> list of Search terms which were frequently searched over a period of time
> and then building auto-suggestion index from this data. The constant updates
> of this will allow you to support a well formed auto-suggest feature. This
> is a good and faster solution if you have application logs to start with and
> not very high volume of data.


I do have some separate plans to include data from our query logs, but 
I'd also like to get data from the index itself, more than one term at a 
time.


Thanks,
Shawn



Re: Solr Autosuggest help

2011-03-17 Thread rahul
Hi,

One more query.

Currently in the autosuggestion Solr returns words like below:

googl
googl _
googl search
googl chrome
googl map

The last letter seems to be missing in the autosuggestions. I sent the query
as
?qt=/terms&terms=true&terms.fl=mydata&terms.lower=goog&terms.prefix=goog.

The following is my schema.xml for the text field.

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" generateNumberParts="1"
            catenateWords="0" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
            outputUnigrams="true" outputUnigramIfNoNgram="true"/>
  </analyzer>
</fieldType>

Could anyone tell me what could be wrong? Why does the last letter go missing?
It occurs for a few words only. Suggestions for other words are fine.

One more query: how will the word 'sci/tech' be indexed in Solr? If I search
for sci/tech it doesn't return any results.

Thanks in Advance.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2692651.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest help

2011-03-17 Thread rahul
hi,

We have found that 'EnglishPorterFilterFactory' causes that issue. I believe
it is used for stemming words. Once we commented out that factory, it works
fine.

Another thing: I am currently checking how the word 'sci/tech'
will be indexed in Solr. As mentioned in my previous email, if I search for
sci/tech it doesn't return any results. But Solr has the term sci/tech. When
I search on other terms which also contain sci/tech, it returns both the
words.

Please let me know if you have any idea regarding that. If I find out,
I will update this thread.

thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2693601.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest help

2011-03-17 Thread Otis Gospodnetic
Rahul,

Go to your Solr Admin Analysis page, enter sci/tech, check appropriate check 
boxes, and see how sci/tech gets analyzed.  This will lead you in the right 
direction.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/





Re: Solr Autosuggest help

2011-03-07 Thread Ahmet Arslan
> I have added the following line in both the  section and in   section in
> schema.xml.
>
> <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
>         outputUnigrams="true" outputUnigramIfNoNgram="true"/>
>
> And reindex my content. However, if I query solr for the multi work search
> terms suggestion , it only send the single word suggestions.
>
> http://localhost:8080/solr/mydata/select?qt=/terms&terms=true&terms.fl=content&terms.lower=java&terms.prefix=java&terms.lower.incl=false&indent=true
>
> It wont return the words like 'java final', it only returns words like
> javadoc, javascript..
>
> Could any one update me how to correct this.. or what I am missing..

What happens when you add terms.limit=-1 to your search URL?

Or when you use java plus one blank character in terms.prefix?
terms.prefix=java &indent=true

Can you see multi-word terms in admin/schema.jsp page?





Re: Solr Autosuggest help

2011-03-07 Thread rahul
hi..

thanks for your replies..

It seems I mistakenly put ShingleFilterFactory in another field. When I put
the factory in correct field it works fine now. 

Thanks.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2645780.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Autosuggest help

2011-03-06 Thread rahul
Hi

I have added the following line in both the  section and in   section in
schema.xml.

<filter class="solr.ShingleFilterFactory" maxShingleSize="2"
        outputUnigrams="true" outputUnigramIfNoNgram="true"/>

I then reindexed my content. However, when I query Solr for multi-word
term suggestions, it only returns single-word suggestions.

http://localhost:8080/solr/mydata/select?qt=/terms&terms=true&terms.fl=content&terms.lower=java&terms.prefix=java&terms.lower.incl=false&indent=true

It won't return terms like 'java final'; it only returns words like
javadoc, javascript..

Could anyone tell me how to correct this, or what I am missing?

thanks, 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2645316.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Autosuggest help

2011-02-26 Thread rahul
Hi,

I am using the Solr (1.4.1) autosuggest feature via the TermsComponent.

Currently, if I type 'goo', Solr suggests words like 'google'.

But I would like to receive suggestions like 'google, google alerts, ...',
i.e., suggestions with both single and multiple terms.

I am not sure whether I need to use edge n-grams for that, e.g., indexing google
as 'go', 'oo', 'og', ... . But I think I don't need this, since I don't
want partial search. Please let me know if there is any way to do
multi-word suggestions.

Thanks in Advance. 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2580944.html
Sent from the Solr - User mailing list archive at Nabble.com.
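
For readers comparing the two n-gram styles mentioned above, here is a small Python sketch. The fragments 'go', 'oo', 'og' are what plain (sliding-window) n-grams produce, whereas edge n-grams are anchored at the start of the token. The function names and the minimum size of 2 are assumptions for illustration only; they are not Solr's filter implementations.

```python
def edge_ngrams(token, min_size=2, max_size=None):
    """Return the prefixes of token, from min_size characters up to max_size."""
    max_size = len(token) if max_size is None else min(max_size, len(token))
    return [token[:n] for n in range(min_size, max_size + 1)]

def ngrams(token, size=2):
    """Return every substring of the given size, sliding one character at a time."""
    return [token[i:i + size] for i in range(len(token) - size + 1)]

print(edge_ngrams("google"))
print(ngrams("google", 2))
```

Neither style yields multi-word suggestions on its own; both operate on the characters within a single token.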


Re: Solr Autosuggest help

2011-02-26 Thread Ahmet Arslan
> I am using Solr (1.4.1) AutoSuggest feature using termsComponent.
>
> Currently, if I type 'goo' means, Solr suggest words like 'google'.
>
> But I would like to receive suggestions like 'google, google alerts, ..' .
> ie, suggestions with single and multiple terms.
>
> Not sure, whether I need to use edgengrams for that. for eg, indexing google
> like 'go', 'oo', 'og', ... . But I think I don't need this, Since I don't
> want partial search. Please let me know if there is any way to do multiple
> word suggestions .

If you will stick with TermsComponent, you need to add ShingleFilterFactory to 
your index analyzer chain for that.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
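
To make the idea concrete, here is a minimal Python sketch of why shingling at index time lets a prefix lookup over indexed terms return multi-word suggestions. It only mimics the concept: the function names, the sample documents, and the document-frequency ordering are assumptions for illustration, not the TermsComponent's actual implementation.

```python
from collections import Counter

def shingles(tokens, max_size=2, output_unigrams=True):
    """Emit word n-grams (shingles) of up to max_size words at each position."""
    smallest = 1 if output_unigrams else 2
    out = []
    for i in range(len(tokens)):
        for n in range(smallest, max_size + 1):
            if i + n <= len(tokens):
                out.append(" ".join(tokens[i:i + n]))
    return out

def build_terms(docs, max_size=2):
    """Count, for each shingle, how many documents contain it."""
    counts = Counter()
    for doc in docs:
        counts.update(set(shingles(doc.lower().split(), max_size)))
    return counts

def suggest(terms, prefix, limit=10):
    """Return terms starting with prefix, most frequent first, then alphabetical."""
    hits = [(t, c) for t, c in terms.items() if t.startswith(prefix)]
    hits.sort(key=lambda tc: (-tc[1], tc[0]))
    return [t for t, _ in hits[:limit]]

# Hypothetical documents used only for this illustration.
docs = ["google alerts", "google maps", "google alerts setup"]
terms = build_terms(docs)
print(suggest(terms, "goo"))
```

Without the shingle step, the term list would contain only single words, so a prefix such as 'goo' could never match 'google alerts'.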





Autosuggest terms which GOOGLE uses?

2010-12-08 Thread Anurag

How does Google select its autosuggest terms? Does Google use users'
queries from log files to suggest only those terms?

-
Kumar Anurag

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039078.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Autosuggest terms which GOOGLE uses?

2010-12-08 Thread Tanguy Moal
Kind of: their suggestions are based on users' queries, with some filtering.
You can read a little about it here:
http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=106230

They perform a little filtering to remove offending content such as
"hate speech, violence and pornography" (quoting the page).
You can also have a look at this slideshow:
http://www.slideshare.net/sturlese/use-ofsolrattrovitclassifiedads-marcsturlese

You'll see how they build their suggest service using a dedicated solr instance.
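
A minimal Python sketch of the general idea (suggestions drawn from logged user queries, with some filtering) might look like the following. The log contents, the blocklist, and the ranking by raw frequency are all assumptions made for illustration; this is not a description of how Google or the slideshow's authors actually implement it.

```python
from collections import Counter

# Hypothetical blocklist; a real deployment would maintain its own.
BLOCKLIST = {"badword"}

def build_suggestions(queries, blocklist=BLOCKLIST):
    """Normalize logged queries, drop blocked ones, and count the rest."""
    counts = Counter()
    for query in queries:
        normalized = " ".join(query.lower().split())
        if not normalized:
            continue
        if any(word in blocklist for word in normalized.split()):
            continue
        counts[normalized] += 1
    return counts

def suggest(counts, prefix, limit=5):
    """Return logged queries starting with prefix, most frequent first."""
    prefix = prefix.lower()
    hits = [(q, c) for q, c in counts.items() if q.startswith(prefix)]
    hits.sort(key=lambda qc: (-qc[1], qc[0]))
    return [q for q, _ in hits[:limit]]

# Hypothetical query log used only for this illustration.
log = ["solr suggest", "Solr  Suggest", "solr shingle", "badword solr", "lucene"]
counts = build_suggestions(log)
print(suggest(counts, "sol"))
```

In a Solr setup, the counted queries would typically be indexed as their own documents so that the suggest service can run against a dedicated index.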

Hope this helps ;-)

--
Tanguy

2010/12/8 Anurag anurag.it.jo...@gmail.com:
>
> How Google selects the autosuggest terms? Is that Google uses Userrs
> Queries from Log files to suggest only those terms?
>
> -
> Kumar Anurag
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039078.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Autosuggest terms which GOOGLE uses?

2010-12-08 Thread Anurag

Thanks a lot!!

What if I want to index query terms from log files? Is that possible? I would then
like to run autosuggest queries on all those terms using the TermsComponent. Until
now my autosuggest options have been like q.prefix= or q.suffix=, which match
the terms available in the documents.

-
Kumar Anurag

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039307.html
Sent from the Solr - User mailing list archive at Nabble.com.


facet+shingle in autosuggest

2010-11-11 Thread Lukas Kahwe Smith
Hi,

I am using a facet.prefix search with shingles in my autosuggest:

<fieldType name="shingle" class="solr.TextField" positionIncrementGap="100"
           stored="false" multiValued="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ShingleFilterFactory"
            maxShingleSize="3" outputUnigrams="true"
            outputUnigramIfNoNgram="false"/>
  </analyzer>
</fieldType>

Now I would like to prevent stop words from appearing in the suggestions:

<lst name="autosuggest_shingle">
  <int name="member states">52</int>
  <int name="member states experiencing">6</int>
  <int name="member states in">6</int>
  <int name="member states the">5</int>
  <int name="member states to">25</int>
  <int name="member states with">7</int>
</lst>

Here I would really like to filter out the last 4 suggestions. Is there a way I
can sensibly bring in a stop word filter here? In theory the stop
words could appear as the first or second word as well.

So I guess when producing shingles I want to prevent any stop word from being
part of any shingle.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





Re: facet+shingle in autosuggest

2010-11-11 Thread Erick Erickson
I don't know all the implications here, but can't you just
insert the StopwordFilterFactory before the ShingleFilterFactory
and turn it loose?

Best
Erick

On Thu, Nov 11, 2010 at 4:02 PM, Lukas Kahwe Smith m...@pooteeweet.org wrote:

> Hi,
>
> I am using a facet.prefix search with shingle's in my autosuggest:
>
> <fieldType name="shingle" class="solr.TextField"
>            positionIncrementGap="100" stored="false" multiValued="true">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.ShingleFilterFactory"
>             maxShingleSize="3" outputUnigrams="true"
>             outputUnigramIfNoNgram="false"/>
>   </analyzer>
> </fieldType>
>
> Now I would like to prevent stop words to appear in the suggestions:
>
> <lst name="autosuggest_shingle">
>   <int name="member states">52</int>
>   <int name="member states experiencing">6</int>
>   <int name="member states in">6</int>
>   <int name="member states the">5</int>
>   <int name="member states to">25</int>
>   <int name="member states with">7</int>
> </lst>
>
> Here I would like to filter out the last 4 suggestions really. Is there a
> way I can sensibly bring in a stop word filter here? Actually in theory the
> stop words could appear as the first or second word as well.
>
> So I guess when producing shingle's I want to skip any stop word from being
> part of any shingle.
>
> regards,
> Lukas Kahwe Smith
> m...@pooteeweet.org






Re: facet+shingle in autosuggest

2010-11-11 Thread Lukas Kahwe Smith

On 11.11.2010, at 17:42, Erick Erickson wrote:

> I don't know all the implications here, but can't you just
> insert the StopwordFilterFactory before the ShingleFilterFactory
> and turn it loose?

haven't tried this, but i would suspect that i would then get in trouble with
stuff like "united states of america". it would then generate the shingle
"united states america", which in turn wouldn't generate a proper phrase search
string.

one option of course would be to restrict the shingles to 2 words; then
using the stop word filter would work as expected.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org
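
The concern above can be sketched in Python with a deliberately simplified model. Two orderings are compared: removing stop words before shingling, and shingling first and then discarding any shingle that contains a stop word. The stop word list and function names are assumptions for illustration. Note also that this model ignores token positions; Solr's real filters track position increments, so the actual output of a stop filter followed by ShingleFilterFactory may differ from what is shown here.

```python
# Hypothetical stop word list used only for this illustration.
STOP_WORDS = {"of", "the", "in", "to", "with"}

def shingles(tokens, max_size=3):
    """Emit all word n-grams of 1..max_size words, position by position."""
    return [" ".join(tokens[i:i + n])
            for i in range(len(tokens))
            for n in range(1, max_size + 1)
            if i + n <= len(tokens)]

def stop_then_shingle(text, max_size=3):
    """Drop stop words first, then shingle what remains (positions ignored)."""
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return shingles(tokens, max_size)

def shingle_then_filter(text, max_size=3):
    """Shingle the full text, then drop every shingle containing a stop word."""
    return [s for s in shingles(text.lower().split(), max_size)
            if not any(word in STOP_WORDS for word in s.split())]

phrase = "united states of america"
print(stop_then_shingle(phrase))
print(shingle_then_filter(phrase))
```

In this simplified model only the first ordering produces 'united states america', a string that never occurs in the source text; the second ordering keeps only shingles that appear verbatim.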





phrase query with autosuggest (SOLR-1316)

2010-10-06 Thread mike anderson
It seemed like SOLR-1316 was a little too long to continue the conversation.

Is there support for quotes indicating a phrase query? For example, my
autosuggest query for "mike sha" ought to return "mike shaffer", "mike
sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting
in a collated result "mike r meyer shaw".

Cheers,
Mike


RE: phrase query with autosuggest (SOLR-1316)

2010-10-06 Thread Robert Petersen
My simple but effective solution to that problem was to replace the
white spaces in the items you index for autosuggest with some special
character, then your wildcarding will work with the whole phrase as you
desire.

Index: mike_shaffer
Query: mike_sha*  
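
A minimal Python sketch of this approach, assuming an underscore as the special character and a sorted in-memory list standing in for the index (both are illustrative choices, and the sample names are hypothetical):

```python
import bisect

def to_key(phrase):
    """Lowercase the phrase and join its words with underscores."""
    return "_".join(phrase.lower().split())

def prefix_matches(sorted_keys, typed):
    """Return indexed phrases whose single-token key starts with the typed text."""
    key = to_key(typed)
    start = bisect.bisect_left(sorted_keys, key)
    matches = []
    for candidate in sorted_keys[start:]:
        if not candidate.startswith(key):
            break  # keys are sorted, so no later key can match
        matches.append(candidate.replace("_", " "))
    return matches

# Hypothetical names used only for this illustration.
names = ["Mike Shaffer", "Mike Sharp", "Mike R Meyer", "Shaw Mike"]
keys = sorted(to_key(n) for n in names)
print(prefix_matches(keys, "mike sha"))
```

Because each phrase becomes one token, the prefix applies to the whole phrase rather than to each word separately, which avoids the collated "mike r meyer shaw" style of result.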

-----Original Message-----
From: mike anderson [mailto:saidthero...@gmail.com]
Sent: Wednesday, October 06, 2010 7:33 AM
To: solr-user@lucene.apache.org
Subject: phrase query with autosuggest (SOLR-1316)

It seemed like SOLR-1316 was a little too long to continue the
conversation.

Is there support for quotes indicating a phrase query. For example, my
autosuggest query for "mike sha" ought to return "mike shaffer", "mike
sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting
in a collated result "mike r meyer shaw",

Cheers,
Mike


Re: phrase query with autosuggest (SOLR-1316)

2010-10-06 Thread Jonathan Rochkind
If you use Chantal's suggestion from an earlier thread, involving facets
and tokenized fields, but not the tokens handling, I think it will
work. (But that solution requires only one auto-suggest value per
document.)

There are a bunch of ways people have figured out to do auto-suggest
without putting it in an entirely separate Solr core. They all have
their strengths and weaknesses, including the weakness of being
kind of confusing to implement sometimes. I don't think anyone has come up
with a general-purpose "works for everything, isn't confusing" solution yet.


Robert Petersen wrote:
> My simple but effective solution to that problem was to replace the
> white spaces in the items you index for autosuggest with some special
> character, then your wildcarding will work with the whole phrase as you
> desire.
>
> Index: mike_shaffer
> Query: mike_sha*
>
> -----Original Message-----
> From: mike anderson [mailto:saidthero...@gmail.com]
> Sent: Wednesday, October 06, 2010 7:33 AM
> To: solr-user@lucene.apache.org
> Subject: phrase query with autosuggest (SOLR-1316)
>
> It seemed like SOLR-1316 was a little too long to continue the
> conversation.
>
> Is there support for quotes indicating a phrase query. For example, my
> autosuggest query for "mike sha" ought to return "mike shaffer", "mike
> sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting
> in a collated result "mike r meyer shaw",
>
> Cheers,
> Mike


Re: Autosuggest with inner phrases

2010-10-04 Thread Otis Gospodnetic
Or, plug, this: http://www.sematext.com/products/autocomplete/index.html , 
which happens to use the same bass examples as the original poster. :)

You can see this Autosuggest in action on http://search-lucene.com/ .

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch



----- Original Message -----
> From: Jason Rutherglen jason.rutherg...@gmail.com
> To: solr-user@lucene.apache.org
> Sent: Sat, October 2, 2010 3:40:52 PM
> Subject: Re: Autosuggest with inner phrases
>
> This's what yer lookin' for:
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> On Sat, Oct 2, 2010 at 3:14 AM, sivaprasad sivaprasa...@echidnainc.com wrote:
>
>> Hi ,
>> I implemented the auto suggest using terms component.But the suggestions are
>> coming from the starting of the word.But i want inner phrases also.For
>> example, if I type "bass" Auto-Complete should offer suggestions that
>> include "bass fishing" or "bass guitar", and even "sea bass" (note how
>> "bass" is not necessarily the first word).
>>
>> How can i achieve this using solr's terms component.
>>
>> Regards,
>> Siva
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Autosuggest-with-inner-phrases-tp1619326p1619326.html
>> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Autosuggest with inner phrases

2010-10-03 Thread Arunkumar Ayyavu
I had the same question a few days back. You can look at the solution
suggested by Chantal at this link:
http://www.lucidimagination.com/search/document/9bbce5302bd3940e/autocomplete_match_words_anywhere_in_the_token#cec7133bbaf5b49c
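
For orientation, one generic way to match a typed prefix at any word boundary is sketched below in Python. This is only an illustration of the requirement (typing "bass" should also find "sea bass"); it is not the solution from the linked thread, and the function names and sample phrases are assumptions.

```python
def word_start_suffixes(phrase):
    """Return the phrase's suffixes that begin at each word boundary."""
    words = phrase.lower().split()
    return [" ".join(words[i:]) for i in range(len(words))]

def build_index(phrases):
    """Map every word-boundary suffix to the full phrases it came from."""
    index = {}
    for phrase in phrases:
        for key in word_start_suffixes(phrase):
            index.setdefault(key, set()).add(phrase)
    return index

def suggest(index, typed):
    """Return phrases in which some word-boundary suffix starts with the typed text."""
    typed = typed.lower()
    found = set()
    for key, phrases in index.items():
        if key.startswith(typed):
            found |= phrases
    return sorted(found)

# Hypothetical phrases used only for this illustration.
phrases = ["bass fishing", "bass guitar", "sea bass", "baseball"]
index = build_index(phrases)
print(suggest(index, "bass"))
```

The linear scan over keys is kept for clarity; an index-backed prefix lookup would replace it in practice.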

On Sat, Oct 2, 2010 at 3:44 PM, sivaprasad sivaprasa...@echidnainc.com wrote:

> Hi ,
> I implemented the auto suggest using terms component.But the suggestions are
> coming from the starting of the word.But i want inner phrases also.For
> example, if I type "bass" Auto-Complete should offer suggestions that
> include "bass fishing" or "bass guitar", and even "sea bass" (note how
> "bass" is not necessarily the first word).
>
> How can i achieve this using solr's terms component.
>
> Regards,
> Siva
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Autosuggest-with-inner-phrases-tp1619326p1619326.html
> Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Arun


Re: Autosuggest with inner phrases

2010-10-03 Thread Bhavnik Gajjar
  Hi,

This thread can be useful

http://www.lucidimagination.com/search/document/9edc01a90a195336/enhancing_auto_complete#d1340d7715162608

Regards,
Bhavnik

On 10/3/2010 11:51 PM, Arunkumar Ayyavu wrote:
> I had the same question few days back. You can look at the solution
> suggested by Chantal in this link.
> http://www.lucidimagination.com/search/document/9bbce5302bd3940e/autocomplete_match_words_anywhere_in_the_token#cec7133bbaf5b49c
>
> On Sat, Oct 2, 2010 at 3:44 PM, sivaprasad sivaprasa...@echidnainc.com wrote:
>
>> Hi ,
>> I implemented the auto suggest using terms component.But the suggestions are
>> coming from the starting of the word.But i want inner phrases also.For
>> example, if I type "bass" Auto-Complete should offer suggestions that
>> include "bass fishing" or "bass guitar", and even "sea bass" (note how
>> "bass" is not necessarily the first word).
>>
>> How can i achieve this using solr's terms component.
>>
>> Regards,
>> Siva
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Autosuggest-with-inner-phrases-tp1619326p1619326.html
>> Sent from the Solr - User mailing list archive at Nabble.com.




