Re: Autosuggest help
Any update?

On Thu, 4 Apr 2019, 1:09 pm Midas A, wrote:
> Hi,
>
> We need to use auto suggest click stream data in Auto suggestion. How we
> can achieve this?
>
> Currently we are using suggester for auto suggestions.
>
> Regards,
> Midas
Autosuggest help
Hi,

We need to use click-stream data from autosuggest in our auto suggestions. How can we achieve this?

Currently we are using the suggester for auto suggestions.

Regards,
Midas
Re: Solr AutoSuggest Configuration Issue
Context filtering, at least using the suggest.cfq parameter, was not introduced before Solr 6 to my knowledge. Like Edwin, I highly recommend upgrading.

On Mon, Oct 8, 2018 at 2:20 PM Manu Nair wrote:
> Hi,
>
> I am using Solr 5.1 for my application.
> I am trying to use the autoSuggest feature of Solr.
> I want to do context filtering on the results returned by Solr suggest.
>
> Please help me know if this feature is supported in the version that I am
> using(5.1).
> Also if it works with multivalued field. I tried multiple times but it is
> not working.
>
> I am referring the following link for details :
> https://lucene.apache.org/solr/guide/6_6/suggester.html
>
> Please find the configuration in my solrconfig.xml as below
>
> mySuggester
> AnalyzingInfixLookupFactory
> DocumentDictionaryFactory
> name
> price
> text_en
> false
> countries
>
> Thanks alot for your help in advance.
>
> Regards,
> Manu Nair.
Re: Solr AutoSuggest Configuration Issue
The link that you are referring to is for Solr 6.6, but you are using Solr 5.1, which is quite an old version, so there could be some differences.

You can refer to this guide for Solr 5.1:
http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-5.1.pdf

The current version of Solr is already 7.5, and it is recommended to upgrade so that you can use the new features, as well as improvements such as better memory consumption and better authentication.

Regards,
Edwin

On Mon, 8 Oct 2018 at 20:20, Manu Nair wrote:
> Hi,
>
> I am using Solr 5.1 for my application.
> I am trying to use the autoSuggest feature of Solr.
> I want to do context filtering on the results returned by Solr suggest.
>
> Please help me know if this feature is supported in the version that I am
> using(5.1).
> Also if it works with multivalued field. I tried multiple times but it is
> not working.
>
> I am referring the following link for details :
> https://lucene.apache.org/solr/guide/6_6/suggester.html
>
> Please find the configuration in my solrconfig.xml as below
>
> mySuggester
> AnalyzingInfixLookupFactory
> DocumentDictionaryFactory
> name
> price
> text_en
> false
> countries
>
> Thanks alot for your help in advance.
>
> Regards,
> Manu Nair.
Solr AutoSuggest Configuration Issue
Hi,

I am using Solr 5.1 for my application.
I am trying to use the autoSuggest feature of Solr.
I want to do context filtering on the results returned by Solr suggest.

Please let me know whether this feature is supported in the version that I am using (5.1), and whether it works with a multivalued field. I have tried multiple times but it is not working.

I am referring to the following link for details:
https://lucene.apache.org/solr/guide/6_6/suggester.html

Please find the configuration in my solrconfig.xml below:

    mySuggester
    AnalyzingInfixLookupFactory
    DocumentDictionaryFactory
    name
    price
    text_en
    false
    countries

Thanks a lot for your help in advance.

Regards,
Manu Nair.
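The configuration above survives only as a list of values; the XML element names are missing. A plausible reconstruction, assuming the values map in order onto the parameter names used in the Solr 6.6 suggester guide that Manu links to, might look like the sketch below. The component name `suggest` and the pairing of each value with a parameter are assumptions, not something the original post confirms.

```xml
<!-- Sketch only: parameter names are inferred, not taken from the original post. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- Field the suggestions are read from -->
    <str name="field">name</str>
    <!-- Field used to rank suggestions -->
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">text_en</str>
    <str name="buildOnStartup">false</str>
    <!-- Field whose values the suggest.cfq request parameter filters on -->
    <str name="contextField">countries</str>
  </lst>
</searchComponent>
```

On a release that supports context filtering (Solr 6 or later, going by the replies), a request would then pass the filter alongside the query, for example `suggest.q=ca&suggest.cfq=US` against whichever handler the component is registered on; the value `US` is purely illustrative.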
Re: Solr 6.5 autosuggest suggests misspelt words and unwanted words
Hi,
you should curate your data; that is fundamental to a healthy search solution. But let's see what you can do anyway:

1) Curate a dictionary of such bad words and then configure analysis to skip them.
2) Have you tried different dictionary implementations? I would assume that each misspelled word has a low document frequency. You could use the High Frequency Dictionary (HighFrequencyDictionaryFactory) [1] and see how it goes.

[1] https://lucene.apache.org/solr/guide/7_3/suggester.html#highfrequencydictionaryfactory

-----
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
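The two suggestions above can be combined in one configuration. The following is a minimal sketch, not a tested setup: the field name `suggest_text`, the type name `text_suggest_clean`, the suggester name `cleanSuggester`, the file `badwords.txt` and the threshold value are all hypothetical. The stop filter drops the curated bad words during analysis, and the `threshold` parameter of HighFrequencyDictionaryFactory keeps only terms that occur in at least that fraction of documents, which should exclude most one-off misspellings.

```xml
<!-- schema: field type that skips a curated list of unwanted words (names are hypothetical) -->
<fieldType name="text_suggest_clean" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- badwords.txt: one unwanted word per line -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="badwords.txt"/>
  </analyzer>
</fieldType>
<field name="suggest_text" type="text_suggest_clean" indexed="true" stored="true"/>

<!-- solrconfig: suggester built only from terms that are frequent enough -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">cleanSuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
    <str name="field">suggest_text</str>
    <!-- Minimum fraction of documents a term must appear in; tune for your data -->
    <float name="threshold">0.005</float>
    <str name="suggestAnalyzerFieldType">text_suggest_clean</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```

The right threshold depends on the corpus: too high and legitimate rare terms disappear from the suggestions, too low and misspellings slip through.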
Solr 6.5 autosuggest suggests misspelt words and unwanted words
Hi,

My data is un-curated: it contains *cuss words and misspelt words*, like *nd* instead of *need*. We are using an auto-suggest/auto-complete feature that relies heavily on indexed data to recommend suggestions as the user types a query. We use a stop-word list of cuss words to keep a check on what is recommended to the user, and this list might get huge over time.

Is there a clean way to get around the problem of
1. eliminating cuss words entirely from suggestions
2. not suggesting misspelt words at all?

Thanks and Regards,
Sri
Re: autosuggest with solr.EdgeNGramFilterFactory no result found
Thanx Erick,

Your blog article was the perfect answer to my problem.

Rgds,
Roland

2015-07-03 18:57 GMT+02:00 Erick Erickson erickerick...@gmail.com:
> OK, I think you took a wrong turn at the bakery [...]
Re: autosuggest with solr.EdgeNGramFilterFactory no result found
OK, I think you took a wrong turn at the bakery

The FST-based suggesters are intended to look at the beginnings of fields. It is totally unnecessary to use ngrams, the FST that gets built does that _for_ you. Actually it builds an internal FST structure that does this en passant.

For getting whole fields that are anywhere in the input field, you probably want to think about AnalyzingInfixSuggester or FreeTextSuggester.

The important bit here is that you shouldn't have to do so much work...

This might help: http://lucidworks.com/blog/solr-suggester/

Best,
Erick

On Fri, Jul 3, 2015 at 4:40 AM, Roland Szűcs roland.sz...@bookandwalk.com wrote:
> I tried to setup an autosuggest feature with multiple dictionaries for title, author and publisher fields. [...]
autosuggest with solr.EdgeNGramFilterFactory no result found
I tried to set up an autosuggest feature with multiple dictionaries for the title, author and publisher fields. I used solr.EdgeNGramFilterFactory to optimize the performance of the autosuggest. I have a document in the index with the title: Romana.

When I test the text analysis for autosuggest (on the field title_suggest_ngram), the ENGTF stage gives:

    text    raw_bytes            start  end  positionLength  type  position
    rom     [72 6f 6d]           0      6    1               word  1
    roma    [72 6f 6d 61]        0      6    1               word  1
    roman   [72 6f 6d 61 6e]     0      6    1               word  1
    romana  [72 6f 6d 61 6e 61]  0      6    1               word  1

If I run http://localhost:8983/solr/bandw/suggest?q=Roma, I get:

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
      </lst>
      <lst name="suggest">
        <lst name="suggest_publisher">
          <lst name="Roma">
            <int name="numFound">0</int>
            <arr name="suggestions"/>
          </lst>
        </lst>
        <lst name="suggest_title">
          <lst name="Roma">
            <int name="numFound">0</int>
            <arr name="suggestions"/>
          </lst>
        </lst>
        <lst name="suggest_author">
          <lst name="Roma">
            <int name="numFound">0</int>
            <arr name="suggestions"/>
          </lst>
        </lst>
      </lst>
    </response>

My relevant field definitions:

    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" omitNorms="true" />
    <field name="author" type="text_hu" indexed="true" stored="true" multiValued="true"/>
    <field name="title" type="text_hu" indexed="true" stored="true" multiValued="false"/>
    <field name="subtitle" type="text_hu" indexed="true" stored="true" multiValued="false"/>
    <field name="publisher" type="text_hu" indexed="true" stored="true" multiValued="false"/>
    <field name="title_suggest_ngram" type="text_hu_suggest_ngram" indexed="true" stored="false" multiValued="false" omitNorms="true"/>
    <field name="author_suggest_ngram" type="text_hu_suggest_ngram" indexed="true" stored="false" multiValued="false" omitNorms="true"/>
    <field name="publisher_suggest_ngram" type="text_hu_suggest_ngram" indexed="true" stored="false" multiValued="false" omitNorms="true"/>
    <copyField source="title" dest="title_suggest_ngram"/>
    <copyField source="author" dest="author_suggest_ngram"/>
    <copyField source="publisher" dest="publisher_suggest_ngram"/>

My EdgeNGram-related field type definition:

    <fieldType name="text_hu_suggest_ngram" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_hu.txt" />
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="8"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_hu.txt" />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

My request handler for suggest:

    <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
      <lst name="defaults">
        <str name="suggest">true</str>
        <str name="suggest.count">5</str>
        <str name="suggest.dictionary">suggest_author</str>
        <str name="suggest.dictionary">suggest_title</str>
        <str name="suggest.dictionary">suggest_publisher</str>
      </lst>
      <arr name="components">
        <str>suggest</str>
      </arr>
    </requestHandler>

And finally my search component:

    <searchComponent name="suggest" class="solr.SuggestComponent">
      <lst name="suggester">
        <str name="name">suggest_title</str>
        <str name="lookupImpl">FSTLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">title_suggest_ngram</str>
        <str name="wightField">price</str>
        <str name="builOnStartup">true</str>
        <str name="buildOnCommit">true</str>
      </lst>
      <lst name="suggester">
        <str name="name">suggest_author</str>
        <str name="lookupImpl">FSTLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">author_suggest_ngram</str>
        <str name="wightField">price</str>
        <str name="builOnStartup">true</str>
        <str name="buildOnCommit">true</str>
      </lst>
      <lst name="suggester">
        <str name="name">suggest_publisher</str>
        <str name="lookupImpl">FSTLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">publisher_suggest_ngram</str>
        <str name="wightField">price</str>
        <str name="buildOnCommit">true</str>
      </lst>
    </searchComponent>

If I change the search component definition to use the title field instead of title_suggest_ngram, then I get suggestions only if the title starts with the string specified in the q parameter. As a field-level autosuggester, I would also like to suggest matches where the string is not the first term of the title but any term in it.

What should I do to use autosuggest correctly?

--
Roland Szűcs
CEO, Bookandwalk.hu
Phone: +36 1 210 81 13
Connect with me on Linkedin: https://www.linkedin.com/pub/roland-sz%C5%B1cs/28/226/24
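Following Erick's suggestion in the reply above, one way to match terms anywhere in the title is to drop the n-gram field and point an infix suggester at the stored title field directly. This is a minimal sketch under assumptions, not a verified fix: it reuses the posted names (`suggest_title`, `title`, `price`, `text_hu`), the `indexPath` directory name is hypothetical, and it spells `weightField` and `buildOnStartup` in full where the posted configuration has `wightField` and `builOnStartup`.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggest_title</str>
    <!-- Infix lookup matches the typed prefix against any term of the field value -->
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- DocumentDictionaryFactory reads stored values, so use a stored field
         such as title rather than the unstored title_suggest_ngram -->
    <str name="field">title</str>
    <str name="weightField">price</str>
    <str name="suggestAnalyzerFieldType">text_hu</str>
    <!-- Hypothetical directory name; give each infix suggester its own path -->
    <str name="indexPath">suggest_title_infix</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

The author and publisher suggesters would follow the same pattern with their own `name`, `field` and `indexPath` values. With build-on-commit disabled, the suggester has to be built explicitly, for example by sending `suggest.build=true` once after indexing.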
Re: Questions regarding autosuggest (Solr 5.2.1)
God damn. Thank you. *ashamed*

On 30.06.2015 at 00:21, Erick Erickson wrote:
> Try not putting it in double quotes?
>
> Best,
> Erick
>
> On Mon, Jun 29, 2015 at 12:22 PM, Thomas Michael Engelke thomas.enge...@posteo.de wrote:
>> A friend and I are trying to develop some software using Solr in the background [...]
Re: Questions regarding autosuggest (Solr 5.2.1)
I would like to add a consideration, if I may.
I find the field type very heavily analysed; are you sure this fits your suggestion requirements?
It is usually better to keep the suggestion field as lightly analysed as possible and then experiment with the different types of suggesters.
If you notice any additional problems, we can discuss them; if not, that is fine!

Cheers

2015-06-30 13:48 GMT+01:00 Erick Erickson erickerick...@gmail.com:
> Pesky computers, they keep doing exactly what I tell 'em to do, not
> what I mean ;)
>
> I'll open a JIRA for making Solr DWIM-compliant, Do What I Mean ;) ;)
>
> On Tue, Jun 30, 2015 at 4:17 AM, Thomas Michael Engelke thomas.enge...@posteo.de wrote:
>> God damn. Thank you. *ashamed*
>>
>> On 30.06.2015 at 00:21, Erick Erickson wrote:
>>> Try not putting it in double quotes?
>>>
>>> Best,
>>> Erick
>>>
>>> On Mon, Jun 29, 2015 at 12:22 PM, Thomas Michael Engelke thomas.enge...@posteo.de wrote:
>>>> A friend and I are trying to develop some software using Solr in the background [...]

--
Benedetti Alessandro
Visiting card: http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience - 1794 England
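Alessandro's point about keeping the suggestion field lightly analysed could be sketched as a field type with only tokenization and lowercasing, in place of the stemming, compound-splitting and n-gram chain in the posted `autosuggest` type. The type name `text_suggest_light` is hypothetical, and whether this is enough depends on the suggestion requirements.

```xml
<!-- Hypothetical lightly analysed type for a suggestion field -->
<fieldType name="text_suggest_light" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercasing makes matching case-insensitive without altering word forms -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because no stemmer or n-gram filter rewrites the tokens, the terms the suggester works with stay close to what users actually type; prefix or fuzzy behaviour can then come from the choice of lookup implementation rather than from the analysis chain.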
Re: Questions regarding autosuggest (Solr 5.2.1)
Pesky computers, they keep doing exactly what I tell 'em to do, not what I mean ;)

I'll open a JIRA for making Solr DWIM-compliant, Do What I Mean ;) ;)

On Tue, Jun 30, 2015 at 4:17 AM, Thomas Michael Engelke thomas.enge...@posteo.de wrote:
> God damn. Thank you. *ashamed*
>
> On 30.06.2015 at 00:21, Erick Erickson wrote:
>> Try not putting it in double quotes?
>>
>> Best,
>> Erick
>>
>> On Mon, Jun 29, 2015 at 12:22 PM, Thomas Michael Engelke thomas.enge...@posteo.de wrote:
>>> A friend and I are trying to develop some software using Solr in the background [...]
Questions regarding autosuggest (Solr 5.2.1)
A friend and I are trying to develop some software using Solr in the background, and with that come a lot of changes. We're used to older versions (4.3 and below). We especially have problems with the autosuggest feature.

This is the field definition (schema.xml) for our autosuggest field:

    <field name="autosuggest" type="autosuggest" indexed="true" stored="true" required="false" multiValued="true" />
    ...
    <copyField source="name" dest="autosuggest" />
    ...
    <fieldType name="autosuggest" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0" splitOnNumerics="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="0" catenateAll="0" preserveOriginal="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
        <filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="dictionary.txt" minWordSize="5" minSubwordSize="3" maxSubwordSize="30" onlyLongestMatch="false"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="30"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0" splitOnNumerics="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="0" catenateAll="0" preserveOriginal="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

Afterwards, we defined an autosuggest component to use this field, like this (solrconfig.xml):

    <searchComponent name="suggest" class="solr.SuggestComponent">
      <lst name="suggester">
        <str name="name">mySuggester</str>
        <str name="lookupImpl">FuzzyLookupFactory</str>
        <str name="storeDir">suggester_fuzzy_dir</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">suggest</str>
        <str name="suggestAnalyzerFieldType">autosuggest</str>
        <str name="buildOnStartup">false</str>
        <str name="buildOnCommit">false</str>
      </lst>
    </searchComponent>

And added a request handler to test out the functionality:

    <requestHandler name="/suggesthandler" class="solr.SearchHandler" startup="lazy">
      <lst name="defaults">
        <str name="suggest">true</str>
        <str name="suggest.count">10</str>
        <str name="suggest.dictionary">mySuggester</str>
      </lst>
      <arr name="components">
        <str>suggest</str>
      </arr>
    </requestHandler>

However, when we try to start the core that has this configuration, a long exception occurs, telling us this:

    Error in configuration: autosuggest is not defined in the schema

Now, that seems to be wrong. Any idea how to fix that?
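The fix that resolved this thread was Erick's one-line reply: do not put the value in double quotes. The archive has lost the original quoting, so exactly which value was quoted is an assumption; the error message names `autosuggest`, which suggests the `suggestAnalyzerFieldType` value. Under that assumption, the difference would look like this sketch:

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="storeDir">suggester_fuzzy_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">suggest</str>
    <!-- Assumed broken form: element text wrapped in literal double quotes,
         so Solr looks for a field type whose name includes the quote characters:
         <str name="suggestAnalyzerFieldType">"autosuggest"</str> -->
    <!-- Working form: the bare type name as element text -->
    <str name="suggestAnalyzerFieldType">autosuggest</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

Double quotes belong around XML attribute values such as `name="field"`; inside a `<str>` element the text is taken literally, quotes included.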
Re: Questions regarding autosuggest (Solr 5.2.1)
Try not putting it in double quotes? Best, Erick
In reply to Thomas Michael Engelke (Mon, Jun 29, 2015 at 12:22 PM).
solr autosuggest to stop/filter suggesting the phrases that ends with stopwords
Hi Folks,
Solr version: 4.7+
Do we have any analyzer, filter, or plugin in Solr to stop suggesting phrases that end with stopwords?
For example, if the suggestions for the query http://localhost.com/solr/suggest?q=jazz+a are as below:
suggestion: [ jazz and, jazz at, jazz at lincoln, jazz at lincoln center, jazz artists, jazz and classic ]
Is there any config or solution to remove only the *jazz at* and *jazz and* phrases, so that the final suggestion response looks more sensible?
suggestion: [ jazz at lincoln, jazz at lincoln center, jazz artists, jazz and classic ]
Google does this intelligently :)
I have tested with StopFilterFactory and SuggestStopFilter, both of which filter all stop terms in the phrases no matter where they appear. Do I have to come up with a custom plugin or some kind of phrase filter to do this in Solr? I am about to design a SuggestStopPhraseFilter and its factory, modeled on the existing SuggestStopFilter, and use it in my schema. Or is there an existing plugin or feature that I can use or leverage?
Thanks,
Rajesh.
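To make the requested behavior concrete, here is a minimal sketch of a client-side post-filter, assuming the suggestions arrive as plain strings. It is not a Solr plugin; the function name and the stopword list are made up for illustration.

```python
# Hypothetical post-filter applied to the suggester's output (not a Solr API).
# The stopword list is an assumed sample; a real one would come from stopwords.txt.
STOPWORDS = {"a", "an", "and", "at", "in", "of", "the"}

def drop_trailing_stopword(suggestions, stopwords=STOPWORDS):
    """Keep only suggestions whose last word is not a stopword."""
    kept = []
    for phrase in suggestions:
        tokens = phrase.lower().split()
        if tokens and tokens[-1] not in stopwords:
            kept.append(phrase)
    return kept

suggestions = [
    "jazz and", "jazz at", "jazz at lincoln",
    "jazz at lincoln center", "jazz artists", "jazz and classic",
]
filtered = drop_trailing_stopword(suggestions)
print(filtered)
```

Stopwords inside a phrase (as in "jazz at lincoln") are left alone; only the final word is checked.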
Re: Partial match autosuggest (match a word occurring anywhere in a field)
Thanks for your response. I fixed this issue by using <filter class="solr.PositionFilterFactory" />:

    <fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
      <analyzer type="index">
        <filter class="solr.LowerCaseFilterFactory"/>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
      </analyzer>
      <analyzer type="query">
        <filter class="solr.LowerCaseFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ShingleFilterFactory" outputUnigrams="true" outputUnigramIfNoNgram="true" maxShingleSize="99"/>
        <filter class="solr.PositionFilterFactory" />
      </analyzer>
    </fieldType>

-- View this message in context: http://lucene.472066.n3.nabble.com/Predictive-search-match-a-word-occurring-anywhere-in-a-field-tp4174660p4174822.html Sent from the Solr - User mailing list archive at Nabble.com.
Partial match autosuggest (match a word occurring anywhere in a field)
Hi, I am trying to figure out a way to implement partial match autosuggest, but it doesn't work in some cases. When I search for iphone 5s, I am able to see the result below:

    title_new:Apple iPhone 5s - 16GB - Gold

But when I search for iphone gold (in the title_new field), I am not able to see the above result. Is there a way to implement a full partial match (occurring anywhere in a field)? Please find below my fieldType configuration for title_new:

    <fieldType name="edgytext" class="solr.TextField">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
Re: Partial match autosuggest (match a word occurring anywhere in a field)
Hi BBrani, Yes, it is possible. Create another field, say edgytext_partial, and use a whitespace tokenizer this time. Then query on both edgytext and edgytext_partial. You can even apply different boosts. Ahmet
In reply to bbarani (Wednesday, December 17, 2014 2:44 AM).
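The difference between the two tokenizations can be sketched in a few lines of Python. This only simulates the token sets in memory; it assumes both the index and query side of the extra field use whitespace tokenization, and that all query words must match. The helper names are made up.

```python
def ngrams(token, min_size=1, max_size=15):
    """All contiguous substrings of token with length min_size..max_size."""
    return {token[i:i + n]
            for n in range(min_size, max_size + 1)
            for i in range(len(token) - n + 1)}

title = "Apple iPhone 5s - 16GB - Gold".lower()

# Keyword tokenization: the whole title is a single token before n-gramming.
keyword_index = ngrams(title)

def keyword_match(query):
    # The query side also keeps the input as one token, so the full query
    # string must equal one of the indexed grams.
    return query.lower() in keyword_index

# Whitespace tokenization: each word is n-grammed on its own.
whitespace_index = set()
for word in title.split():
    whitespace_index |= ngrams(word)

def whitespace_match(query):
    # Assumption: every query word has to match some indexed gram.
    return all(word in whitespace_index for word in query.lower().split())

print(keyword_match("iphone 5s"))       # contiguous in the title
print(keyword_match("iphone gold"))     # not contiguous in the title
print(whitespace_match("iphone gold"))  # matched word by word
```

With the keyword setup, "iphone gold" fails because it never appears as one contiguous run of at most 15 characters in the title; word-level grams remove that restriction.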
Re: Best practice: Autosuggest/autocomplete vs. real search
The goal is to ensure that suggestions from autocomplete are actually terms in the main index, so that the suggestions will actually result in matches. You've considered expanding the main index by adding the suggestion n-grams to it, but it would probably be better to alter your suggester so that it produces only tokens that are in the main index. I think this is basically how all the Suggester implementations are designed to work already; are you using one of those, or are you using the TermsComponent, or something else? -Mike
In reply to Thomas Michael Engelke (11/10/14 2:54 AM).
Re: Best practice: Autosuggest/autocomplete vs. real search
Wouldn't it be easier if the site ensured that only complete terms are submitted to the actual search? In an app I worked on some time ago, the default behavior of the JavaScript component used for autocompletion was to first autocomplete the term in the input and then submit the query against the backend. I know this is not what you've asked for, but could it work? I'm just firing a bullet in the air here! :-)
In reply to Michael Sokolov (Nov 10, 2014, at 8:37 AM).
Re: Best practice: Autosuggest/autocomplete vs. real search
The dedicated autosuggest field is not used by a suggester component; instead, we just query it directly (/select). I'm trying to read my way into how the suggesters work, and toying around with some configurations (for instance from here: http://www.andornot.com/blog/post/Advanced-autocomplete-with-Solr-Ngrams-and-Twitters-typeaheadjs.aspx). Compared to how you can analyze search results through the Solr backend, the analysis of suggester results seems to be sorely lacking.
In reply to Michael Sokolov (10.11.2014, 14:37).
Best practice: Autosuggest/autocomplete vs. real search
We're using Solr as a backend for an ECommerce site/system. The Solr index stores products with selected attributes, as well as a dedicated field for autocomplete suggestions (done via AJAX request when typing in the search box without pressing return). The autosuggest field is supplied by copyField directives from certain select product attribute fields (description and/or name mostly). It uses EdgeNGramFilterFactory to complete words not yet typed completely, and it works quite well.

However, we've come across an issue with a disconnect between the autosuggest results and the results of a normal search, that is, a query over the full fields of the product. Let's say there are products that are called motor.
- When autosuggesting, typing mot autosuggests all products with motor, because the EdgeNGram created m, mo, mot, moto and motor, respectively, and it matches.
- When searching for mot, however (i.e. pressing enter when seeing the autosuggestions), it doesn't find any products. The autosuggest field is not part of the real search, and no product attribute contains mot as a word.

One obvious solution would be to incorporate the autosuggest field into the real search. However, this adds many tokens to the index that aren't really part of the products indexed and makes for strange search results, for example when an NGram is also a word, but the record itself contains the search term only as part of a word. Are there clever solutions to this problem?
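The disconnect described above can be reproduced with a tiny in-memory model. This is only a sketch: the two sets stand in for the autosuggest field and the regular search field, and `complete` is a hypothetical helper showing the idea of submitting the completed term instead of the typed prefix.

```python
def edge_ngrams(word, min_size=1, max_size=15):
    """Prefixes of word, as an edge n-gram filter would emit them."""
    return [word[:n] for n in range(min_size, min(len(word), max_size) + 1)]

product_names = ["motor"]

# Stand-in for the autosuggest field: every prefix is an indexed token.
autosuggest_tokens = {g for name in product_names for g in edge_ngrams(name)}
# Stand-in for the real search fields: only whole words are indexed.
search_tokens = set(product_names)

typed = "mot"
print(typed in autosuggest_tokens)  # the suggestion query matches
print(typed in search_tokens)       # the real search does not

def complete(prefix, terms):
    """Hypothetical helper: full index terms that start with the typed prefix."""
    return sorted(t for t in terms if t.startswith(prefix))

completed = complete(typed, search_tokens)
print(completed)
```

Searching for a completed term rather than the raw prefix keeps the main index free of n-gram tokens while still guaranteeing that a chosen suggestion produces hits.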
Autosuggest using EdgeNGrams with strange highlighting
We've moved from an asterisk-based autosuggest functionality (searchterm*) to a version using a special field called autosuggest, filled via copyField directives. The field definition:

    <fieldType name="autosuggest" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
        <filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="dictionary.txt" minWordSize="5" minSubwordSize="3" maxSubwordSize="30" onlyLongestMatch="false"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true" enablePositionIncrements="true" format="snowball"/>
        <filter class="solr.DictionaryCompoundWordTokenFilterFactory" dictionary="dictionary.txt" minWordSize="5" minSubwordSize="3" maxSubwordSize="30" onlyLongestMatch="false"/>
        <filter class="solr.GermanNormalizationFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

It works like a charm. Now, we've had highlighting from Solr before, using these parameters:

    hl=true&hl.simple.pre=span+class%3Dhighlight&hl.snippets=1&hl.simple.post=/span&spellcheck=true&hl.fl=description

Now, we've seen something strange. This is just an example; the problem affects more than this one record. In this example, the autosuggest field contains: 2CV4 Spot, Dekorsatz, für 2CV. However, the highlighting branch for this autosuggest field in the record looks like this:

    <lst name="highlighting">
      <lst name="34725">
        <arr name="short_description">
          <str>2CV4 Spot, Dekorsatz, für <em>2CV</em>.</str>
        </arr>
      </lst>
      ...

Although the EdgeNGramFilterFactory generated the NGrams so that 2CV4 -> 2, 2C, 2CV, 2CV4, the term 2CV4 is not highlighted. Shouldn't it be? It's not a question of the number of highlights: records containing multiple occurrences of 2CV get highlighted multiple times with no problems. It seems that words that contain the search term only as a part, and match only via the EdgeNGrams, are not highlighted. As we're using highlighting from Solr exclusively, this leads to records being found but having no highlight at all.
Is there a way to prevent some keywords from being added to autosuggest dictionary?
We index around 10k documents in Solr and use the built-in suggest functionality for autocomplete. We have a field that contains a flag that is used to show or hide documents in search results. I am trying to figure out a way to control the terms added to the autosuggest index (to keep documents out of the autosuggest index) based on the value of the flag. Is there a way to do that?
Re: Is there a way to prevent some keywords from being added to autosuggest dictionary?
Which field(s) auto suggest uses is configurable. So you could create special fields (and associated ‘copyField’ configs) to populate specific fields for auto suggest. For example, you could have 2 fields, “hidden_desc” and “visible_desc”. Use copyField to copy both of them to a field named “description”. Then set auto suggest to use only the “visible_desc” field to drive suggestions. That might be one viable option. Regards, Garth
In reply to bbarani (Oct 17, 2014, at 1:02 PM).
Filtering autosuggest results in Solr
Hi,
We have the following use case: filter the autosuggest results of solr_field1 based on solr_field2 values. The solr_field2 values are constants such as source1, source2, etc. If the user types xyz for solr_field1, the returned suggestions can match anywhere in the solr_field1 value, such as abcxyz, xyzabc, abcxyzdef, acXyZc, etc. But the returned suggestions should be filtered by the solr_field2 value. Something like:
q=solr_field1:xyz&fq=solr_field2:source1 OR solr_field2:source2 => abcxyz, xyzabc, abcxyzdef
We have tried the /terms component, which returns case-insensitive suggestions but has no filtering capability. A faceting query works only with prefix values. Can anyone please suggest alternative approaches?
Thanks
Chakra
Autosuggest with spelling correction
Hi everyone,
Currently I'm using AnalyzingInfixLookupFactory with a suggestions file containing phrases of up to 3 words. However, this component stops suggesting when the input contains spelling errors. I heard about FuzzySuggester and found some sample configurations here: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/solrconfig-phrasesuggest.xml. But I couldn't make any of them work. I always got the same error:
...solr-4.9.0\example\solr\collection1\data\fuzzy_suggest_analyzing\fwfsta.bin (The system cannot find the file specified).
In short, is there a Suggester component that supports both infix lookup and fuzzy suggest, and where can I find a proper sample configuration?
Thanks
-- Harun Reşit Zafer TÜBİTAK BİLGEM BTE Bulut Bilişim ve Büyük Veri Analiz Sistemleri Bölümü T +90 262 675 3268 W http://www.hrzafer.com
Re: Autosuggest with spelling correction
This JIRA issue has some documentation; maybe it will help you: https://issues.apache.org/jira/browse/SOLR-5683
In reply to Harun Reşit Zafer (Wed, Aug 13, 2014 at 1:28 AM).
Re: Extend the Solr Terms Component to implement a customized Autosuggest
Ummm, 400k documents is _tiny_ by Solr/Lucene standards. I've seen 150M docs fit in 16G on Solr. I put 11M docs on my laptop. So I would _strongly_ advise that you don't worry about space at all as a first approach and freely copy as many fields as you need to support your use case. Only after you've proved that this is untenable would I recommend you develop custom code. You'll be in production much faster that way ;) Of course this is irrelevant if each doc is War and Peace, but... Best, Erick
In reply to Juan Pablo Albuja (Thu, Jul 31, 2014 at 3:29 PM).
Extend the Solr Terms Component to implement a customized Autosuggest
Good afternoon guys,
I would really appreciate it if someone in the community could help me with the following issue. I need to implement a Solr autosuggest that supports:
1. Autosuggestion over multivalued fields
2. Case insensitivity
3. Matching content in the middle of a value: for example, I have the value Hello World indexed, and I need to get that value when the user types wor
4. Filtering by an additional field
I was using the terms component because it satisfies points 1 to 3, but point 4 is not possible with it. I also looked at faceted search and NGrams/EdgeNGrams, but the problem with those approaches is that I need to copy fields in order to tokenize them or apply grams to them. I don't want to do that because I have more than 6 fields that need autosuggest, and my index is big (more than 400k documents), so I don't want to increase its size. I tried to extend the terms component to add an additional filter, but it uses TermsEnum, which iterates over the terms of one specific field, and I couldn't figure out how to filter it efficiently.
Do you have an idea how I can satisfy my requirements in an efficient way? A solution that doesn't use the terms component would also be great.
Thanks
Juan Pablo Albuja
Senior Developer
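As a reference for the four requirements, here is a small in-memory sketch of the expected behavior. It is a plain linear scan over made-up documents, not a Solr component and not an efficient implementation; the field names `titles` and `source` are hypothetical.

```python
# Made-up sample documents; "titles" is multivalued, "source" is the filter field.
docs = [
    {"id": 1, "titles": ["Hello World", "Greetings"], "source": "source1"},
    {"id": 2, "titles": ["Worldwide News"], "source": "source2"},
    {"id": 3, "titles": ["Goodbye"], "source": "source1"},
]

def suggest(docs, typed, allowed_sources):
    """Return values containing the typed text, restricted by the filter field."""
    needle = typed.lower()                          # 2. case-insensitive
    hits = set()
    for doc in docs:
        if doc["source"] not in allowed_sources:    # 4. filter by another field
            continue
        for value in doc["titles"]:                 # 1. multivalued field
            if needle in value.lower():             # 3. match in the middle
                hits.add(value)
    return sorted(hits)

print(suggest(docs, "wor", {"source1"}))
print(suggest(docs, "WOR", {"source1", "source2"}))
```

Any index-based approach (extra n-gram fields, a suggester, or a custom component) would need to return the same values as this scan for the same inputs.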
Sorting is not correct in autosuggest
Hi All, In my auto suggest page, the sorting of the suggestions I am getting is not correct. However, the suggestions themselves are all correct. Any guidance will be helpful.
Re: Sorting is not correct in autosuggest
Please review: http://wiki.apache.org/solr/UsingMailingLists You've given us virtually no information here. Best, Erick
In reply to neha sinha (Wed, Apr 30, 2014 at 12:35 AM).
AutoSuggest like Google in Solr using Solarium Client.
Can anyone suggest best practices for doing SpellCheck and AutoSuggest in Solarium? Can anyone give me an example? -- Regards, *Sohan Kalsariya*
RE: AutoSuggest like Google in Solr using Solarium Client.
Hi Sohan, The best approach for auto suggest is to use a facet query. Please refer to this link: http://solr.pl/en/2010/10/18/solr-and-autocomplete-part-1/ Thanks, SureshKumar.S
In reply to Sohan Kalsariya (Monday, March 17, 2014 8:14 PM).
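Assuming the facet-based approach means prefix faceting (Solr's facet.prefix parameter on a facet field), the idea can be sketched in memory as follows. The documents, the `tags` field, and the function name are made up; this simulates the result shape, it does not call Solr or Solarium.

```python
from collections import Counter

def facet_prefix(docs, field, prefix, limit=10):
    """Count documents per term starting with prefix, most frequent first."""
    counts = Counter()
    for doc in docs:
        # Count each term at most once per document, like a facet count.
        for term in set(doc.get(field, [])):
            if term.startswith(prefix):
                counts[term] += 1
    # Sort by descending count, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]

docs = [
    {"tags": ["solr", "search"]},
    {"tags": ["solr", "solarium"]},
    {"tags": ["spellcheck"]},
]
suggestions = facet_prefix(docs, "tags", "so")
print(suggestions)
```

The counts double as a popularity signal, which is one reason faceting is sometimes used for autocomplete; the terms are assumed to be lowercased already.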
Re: AutoSuggest like Google in Solr using Solarium Client.
I think it's best to use one of the many autosuggesters Lucene/Solr provide? E.g. AnalyzingInfixSuggester is running here: http://jirasearch.mikemccandless.com But that's just one suggester... there are many more. Mike McCandless http://blog.mikemccandless.com
In reply to Sohan Kalsariya (Mon, Mar 17, 2014 at 10:44 AM).
Re: AutoSuggest like Google in Solr using Solarium Client.
Not sure if you have already seen this one: http://www.solarium-project.org/2012/01/suggester-query-support/ You can also use an edge n-gram filter to implement typeahead auto suggest.
Re: Solr Autosuggest - Strange issue with leading numbers in query
I tried almost all possible combinations, but in vain. Does anyone know if there is any logic built in to the suggester component to ignore the leading numbers?

    /autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.collate=false&q=34g

    <response>
      ...
      <lst name="spellcheck">
        <lst name="suggestions">
          <lst name="g">
            <int name="numFound">1</int>
            <int name="startOffset">2</int>
            <int name="endOffset">3</int>
            <arr name="suggestion">
              <str>galaxy</str>
            </arr>
          </lst>
        </lst>
      </lst>
    </response>

    /autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.collate=false&q=11123423432423243ip

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">0</int>
      </lst>
      <lst name="spellcheck">
        <lst name="suggestions">
          <lst name="ip">
            <int name="numFound">2</int>
            <int name="startOffset">17</int>
            <int name="endOffset">19</int>
            <arr name="suggestion">
              <str>ipad</str>
              <str>iphone</str>
            </arr>
          </lst>
        </lst>
      </lst>
    </response>
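The startOffset and endOffset values in these responses show which part of the query the suggestions were computed for. A short sketch, using only the values reported above, slices the queries accordingly; why the digits are split off (for example by the query analyzer) is an assumption and is not confirmed in this thread.

```python
def suggested_span(query, start_offset, end_offset):
    """Return (ignored prefix, token the suggestion was computed for)."""
    return query[:start_offset], query[start_offset:end_offset]

# Offsets copied from the two responses above.
print(suggested_span("34g", 2, 3))
print(suggested_span("11123423432423243ip", 17, 19))
```

In both cases the span covers only the trailing letters, which matches the names of the suggestion entries (g and ip) in the responses.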
Re: Solr Autosuggest - Strange issue with leading numbers in query
Here’s a rather obvious question: have you rebuilt your spell index recently? Is it possible the offending numbers snuck into the spell dictionary? The terms component will show you what’s in your current, searchable field…but not the dictionary. If my memory serves correctly, with collate=true this would allow for such behavior to occur, especially with onlyMorePopular set to false (which would ensure the resulting collation has a query count greater than the current query). Have you flipped onlyMorePopular to true to confirm?
In reply to bbi123 (Feb 18, 2014, at 10:16 AM).
Re: Solr Autosuggest - Strange issue with leading numbers in query
Thanks a lot for your response, Erik. I tried to find out whether I have any suggestions starting with numbers by using the terms component, but I couldn't find any. It's very strange! Anyway, thanks again for your response. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4118072.html
Re: Solr Autosuggest - Strange issue with leading numbers in query
Hi Erik, Thanks a lot for your reply. I expect it to return zero suggestions, since the suggested keyword doesn't actually start with numbers.

Expected results:
Searching for ga - returns galaxy
Searching for gal - returns galaxy
Searching for 12321312321312ga - should not return any suggestion, since no such keyword (combination) exists in the index.

Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751p4117846.html
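The expected behavior listed above can also be enforced on the client side, whatever the server returns. This is only a minimal sketch, assuming the suggestions have already been parsed out of the response into a plain list of strings; `filter_suggestions` is a hypothetical helper and not part of Solr, and it does not explain why the component drops the leading digits.

```python
def filter_suggestions(typed, suggestions):
    """Keep only suggestions that extend the complete typed string.

    Hypothetical client-side guard: a suggestion such as "galaxy" is
    dropped when the user typed "12321312321312ga", because it does not
    start with everything the user entered.
    """
    typed = typed.strip().lower()
    if not typed:
        return []
    return [s for s in suggestions if s.lower().startswith(typed)]


if __name__ == "__main__":
    print(filter_suggestions("ga", ["galaxy"]))
    print(filter_suggestions("12321312321312ga", ["galaxy"]))
```

The guard is deliberately strict: it compares against the whole input, so it would also suppress legitimate corrections if the same handler were used for spell checking rather than completion.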
Re: Solr Autosuggest - Strange issue with leading numbers in query
Ah, OK, I thought you were indexing things like 123412335ga, but not so. Afraid I'm fresh out of ideas. Although I might try using TermsComponent to examine the index and see if, somehow, there _are_ terms with leading numbers in the output. It's also possible that numbers are stripped when building the FST that is used, but I don't know one way or the other.

Best, Erick

On Mon, Feb 17, 2014 at 11:30 AM, Developer bbar...@gmail.com wrote:
> Hi Erik, Thanks a lot for your reply. I expect it to return zero suggestions since the suggested keyword doesn't actually start with numbers.
> Expected results
> Searching for ga - returns galaxy
> Searching for gal - returns galaxy
> Searching for 12321312321312ga - should not return any suggestion since there is no keyword (combination) exists in the index.
> Thanks
Solr Autosuggest - Strange issue with leading numbers in query
I have a strange issue with Autosuggest. Whenever I query for a keyword with leading numbers, it returns the suggestion corresponding to the letters (ignoring the numbers). I was under the assumption that it would return an empty result. I am not sure what I am doing wrong. Can someone help?

*Query:*

/autocomplete?qt=/lucid&req_type=auto_complete&spellcheck.maxCollations=10&q=12342343243242ga&spellcheck.count=10

*Result:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="ga">
        <int name="numFound">1</int>
        <int name="startOffset">15</int>
        <int name="endOffset">17</int>
        <arr name="suggestion">
          <str>galaxy</str>
        </arr>
      </lst>
      <str name="collation">12342343243242galaxy</str>
    </lst>
  </lst>
</response>

*My field configuration is as below:*

<fieldType class="solr.TextField" name="textSpell_word" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" enablePositionIncrements="true" ignoreCase="true" words="stopwords_autosuggest.txt"/>
  </analyzer>
</fieldType>

*SolrConfig.xml*

<searchComponent class="solr.SpellCheckComponent" name="autocomplete">
  <lst name="spellchecker">
    <str name="name">autocomplete</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">autocomplete_word</str>
    <str name="storeDir">autocomplete</str>
    <str name="buildOnCommit">true</str>
    <float name="threshold">.005</float>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/autocomplete">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">autocomplete</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.onlyMorePopular">false</str>
  </lst>
  <arr name="components">
    <str>autocomplete</str>
  </arr>
</requestHandler>

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-Strange-issue-with-leading-numbers-in-query-tp4116751.html
Re: Solr Autosuggest - Strange issue with leading numbers in query
Hmmm, the example you posted seems correct to me; the returned suggestion is really close to the term. What are you expecting here? The example is inconsistent with "it returns the suggestion corresponding to the alphabets (ignoring the numbers)". It looks like it's considering the numbers just fine, which is what makes the returned suggestion close to the term, I think.

Best, Erick

On Tue, Feb 11, 2014 at 1:01 PM, Developer bbar...@gmail.com wrote:
> I have a strange issue with Autosuggest. Whenever I query for a keyword along with numbers (leading) it returns the suggestion corresponding to the alphabets (ignoring the numbers). I was under assumption that it will return an empty result back. I am not sure what I am doing wrong. Can someone help?
> [...]
Autosuggest - Custom sorting
Is there a way to sort the returned Autosuggest list based on a particular value (e.g. score)? I am trying to sort the returned suggestions based on a field that has been calculated manually, but I am not sure how to use that field for sorting suggestions. -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-Custom-sorting-tp4092980.html
Autosuggest on very large index
Using 4.4.0 - I would like to be able to do an autosuggest query against one of the fields in our index and have the results be limited by an fq. I can get exactly the results I want with a facet query using a facet.prefix, but the first query takes ~5 minutes to run on our QA env (~240M docs). I'm afraid to attempt it on prod (~2B docs). Subsequent queries are sufficiently fast (~500ms). I'm assuming the first query is uninverting the field. Is there any way to mark that field so that an uninverted copy is maintained as updates come in? We plan to soft commit every 5 minutes, and we'd prefer to not be continuously uninverting this one field. Or is there a better way to do what I'm trying to do? I've looked at the spellcheck component a little bit, but it looks like I can't filter results by fq. The fq I'm using is based on which client is logged in, and we can't autosuggest terms from one client to another. Thanks. -Greg
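For reference, the facet-based approach described above can be written out as a request. This is a minimal sketch that only builds the query string; the field names `title` and `clientId`, the `/select` handler path, and the limit values are assumptions chosen for illustration, and it does nothing about the cost of the first (uninverting) query.

```python
from urllib.parse import urlencode


def facet_prefix_query(user_input, client_id, field="title", limit=10):
    """Build a Solr query string that uses faceting as an autosuggest.

    The fq restricts the documents to one client; facet.prefix restricts
    the returned terms to those starting with what the user typed.
    Field names and limits are illustrative assumptions.
    """
    params = [
        ("q", "*:*"),                      # match everything, then filter
        ("rows", "0"),                     # only facet counts are needed
        ("fq", f"clientId:{client_id}"),   # per-client restriction
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", user_input),
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),
        ("wt", "json"),
    ]
    return urlencode(params)


if __name__ == "__main__":
    print("/select?" + facet_prefix_query("sol", 42))
```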
RE: Autosuggest on very large index
I am not entirely sure, but the Suggester's FST uses prefixes, so you may be able to prefix the value you otherwise use for the filter query when you build the suggester.

-Original message-
From: Greg Preston gpres...@marinsoftware.com
Sent: Tuesday 20th August 2013 20:00
To: solr-user@lucene.apache.org
Subject: Autosuggest on very large index

> Using 4.4.0 - I would like to be able to do an autosuggest query against one of the fields in our index and have the results be limited by an fq.
> [...]
Re: Autosuggest on very large index
The filter query would be on a different field (clientId) than the field we want to autosuggest on (title). Or are you proposing we index a compound field that would be clientId+titleTokens, so we would then prefix the suggester with clientId+userInput? Interesting idea.

-Greg

On Tue, Aug 20, 2013 at 11:21 AM, Markus Jelsma markus.jel...@openindex.io wrote:
> I am not entirely sure but the Suggester's FST uses prefixes so you may be able to prefix the value you otherwise use for the filter query when you build the suggester.
> [...]
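The compound-key idea discussed here can be illustrated outside Solr. The following is a rough sketch under stated assumptions: a sorted Python list stands in for the suggester's prefix structure, and the separator character and the helper names `build_entries` and `suggest` are hypothetical. It only demonstrates that prefixing each entry with the client id keeps one client's terms out of another client's suggestions.

```python
import bisect

# Assumed separator that never occurs inside a client id or a title.
SEP = "\x1f"


def build_entries(docs):
    """Turn (client_id, title) pairs into a sorted list of compound keys."""
    return sorted({f"{client_id}{SEP}{title.lower()}" for client_id, title in docs})


def suggest(entries, client_id, user_input, limit=5):
    """Return up to `limit` titles of one client that start with user_input."""
    prefix = f"{client_id}{SEP}{user_input.lower()}"
    start = bisect.bisect_left(entries, prefix)  # first key >= prefix
    results = []
    for key in entries[start:]:
        if len(results) == limit or not key.startswith(prefix):
            break
        # Strip the client id again before showing the suggestion.
        results.append(key.split(SEP, 1)[1])
    return results


if __name__ == "__main__":
    entries = build_entries([
        ("c1", "Solr in Action"),
        ("c1", "Solar panels"),
        ("c2", "Solr cookbook"),
    ])
    print(suggest(entries, "c1", "sol"))
    print(suggest(entries, "c2", "sol"))
```

Because the separator follows the client id, a lookup for client `c1` cannot match keys that belong to a client named `c10`.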
Re: Autosuggest on very large index
Sounds like a problem for DocValues - assuming the number of unique values fits reasonably in memory to avoid I/O. How many unique values do you have or contemplate for your two billion documents?

Two possibilities:
1. You need a lot more hardware.
2. You need to scale back your ambitions.

-- Jack Krupansky

-Original Message-
From: Greg Preston
Sent: Tuesday, August 20, 2013 2:00 PM
To: solr-user@lucene.apache.org
Subject: Autosuggest on very large index

> Using 4.4.0 - I would like to be able to do an autosuggest query against one of the fields in our index and have the results be limited by an fq.
> [...]
Re: Autosuggest on very large index
DocValues looks interesting, a non-inverted field. I'll play with it a bit and see how it works. Thanks for the suggestion. I don't know how many total terms we've got, but each document is only 2-5 words/terms on average, and there is a TON of overlap between docs.

-Greg

On Tue, Aug 20, 2013 at 11:38 AM, Jack Krupansky j...@basetechnology.com wrote:
> Sounds like a problem for DocValues - assuming the number of unique values fits reasonably in memory to avoid I/O. How many unique values do you have or contemplate for two your billion documents? Two possibilities: 1. You need a lot more hardware. 2. You need to scale back your ambitions.
> [...]
Re: AutoSuggest+Grouping in one request
Hi, Hm, I *think* you can't do it in one go with Solr's Suggester, but I'm no expert there. I can only point you to something like our AutoComplete - http://sematext.com/products/autocomplete/index.html - which, as you can see on that screenshot, has the grouping you seem to be after. Maybe somebody else can point out if Solr Suggester can do the same?

Otis -- Solr ElasticSearch Support http://sematext.com/

On Fri, Apr 26, 2013 at 9:58 AM, Rounak Jain rouna...@gmail.com wrote:
> Hi everyone, Search dropdowns on popular sites like Amazon (example image: http://i.imgur.com/aQyM8WD.jpg) use autosuggested words along with grouping (Field Collapsing in Solr). While I can replicate the same functionality in Solr using two requests (first to obtain suggestions, second for the actual query using the most probable suggestion), I want to know if this can be done in one request itself.
> [...]
AutoSuggest+Grouping in one request
Hi everyone, Search dropdowns on popular sites like Amazon (example image: http://i.imgur.com/aQyM8WD.jpg) use autosuggested words along with grouping (Field Collapsing in Solr). While I can replicate the same functionality in Solr using two requests (first to obtain suggestions, second for the actual query using the most probable suggestion), I want to know if this can be done in one request itself. I understand that there are various ways to obtain suggestions (term component, facets, Solr's inbuilt Suggester, http://wiki.apache.org/solr/Suggester), and I'm open to using any one of them, if it means I'll be able to get everything (groups + suggestions) in one request. Looking forward to some advice with regard to this. Thanks, Rounak
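The two-request flow mentioned in this message can be sketched as follows. This only builds the two URLs and is an illustration under assumptions: the `/suggest` and `/select` handler paths, the `category` grouping field, and the group limit are placeholders, and the step that picks the most probable suggestion from the first response is left out.

```python
from urllib.parse import urlencode


def suggest_url(prefix, handler="/suggest"):
    """First request: ask the suggest handler for completions of the prefix."""
    return handler + "?" + urlencode({"q": prefix, "wt": "json"})


def grouped_search_url(term, group_field="category", handler="/select"):
    """Second request: run the chosen suggestion as a grouped query."""
    params = {
        "q": term,
        "group": "true",             # enable result grouping (field collapsing)
        "group.field": group_field,  # assumed field to group on
        "group.limit": "3",          # documents shown per group
        "wt": "json",
    }
    return handler + "?" + urlencode(params)


if __name__ == "__main__":
    print(suggest_url("gal"))
    print(grouped_search_url("galaxy"))
```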
Re: Issue with spellcheck and autosuggest
But if I use my own system as the Solr server, it works fine. The problem occurs only when I use another machine as the Solr server, even though both machines have the same schema and solrconfig files. -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-spellcheck-and-autosuggest-tp4036208p4038287.html
Re: Issue with spellcheck and autosuggest
You can of course check the suggestions, but then you should remove <str name="spellcheck.dictionary">wordbreak</str> from your handler, because its purpose is to find cases where the user types spaces wrongly (e.g., solrrocks, sol rrocks, so lr). -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-spellcheck-and-autosuggest-tp4036208p4037631.html
Re: Issue with spellcheck and autosuggest
You should check the collations, not the suggestions, in the response XML. -- View this message in context: http://lucene.472066.n3.nabble.com/Issue-with-spellcheck-and-autosuggest-tp4036208p4036977.html
Re: Issue with spellcheck and autosuggest
Is there any chance that you were experimenting with an ngram filter for the field? If you were, and merely changed the field type without reindexing, this behavior makes sense. In other words, you appear to have had some filter that broke words into one- and two-character terms.

Separate from that, the analyzer for a spellchecker should be very simple and preserve the structure of the term rather than decompose it, as WordDelimiterFilter does. So, be sure to use an analyzer that is very simple, such as StandardTokenizer and a lower case filter, but nothing else. In general, use a separate field, like textSpell, that has the simple analyzer, and do a copyField from the original text field, which can still have a richer analyzer.

-- Jack Krupansky

-Original Message-
From: Dixline
Sent: Friday, January 25, 2013 6:30 AM
To: solr-user@lucene.apache.org
Subject: Issue with spellcheck and autosuggest

> Hi, this is my spellcheck/autosuggest dictionary field and field type,
> [...]
Issue with spellcheck and autosuggest
Hi,

This is my spellcheck/autosuggest dictionary field and field type:

<field name="searchText" type="spelltext" indexed="true" stored="true" multiValued="true" default="JulyMSO" />

<fieldType name="spelltext" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

And this is my solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">spelltext</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">searchText</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="buildOnOptimize">true</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.1</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
    <float name="thresholdTokenFrequency">.01</float>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">searchText</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <str name="buildOnOptimize">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">searchText</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">6</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.alternativeTermCount">5</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">false</str>
    <str name="spellcheck.maxCollationTries">3</str>
    <str name="spellcheck.maxCollations">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <str name="queryAnalyzerFieldType">spelltext</str>
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">searchText</str>
    <str name="buildOnOptimize">true</str>
    <float name="accuracy">0.1</float>
    <float name="threshold">0.005</float>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="df">searchText</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">6</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">false</str>
    <str name="spellcheck.maxCollationTries">3</str>
    <str name="spellcheck.maxCollations">1</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

If I try spellcheck, I'm not getting proper suggestions. For example, there is a word "yellow" in my Solr document. If I search for "yello", I get these suggestions: yellow, ye ll wo, y e ll ow, ye ll ow. Why is it coming out like this? And when I try autosuggest, I'm not getting any suggestions for any query. Can anyone help me with this? Thanks in advance.

-Dixline.M
Sor Cloud Autosuggest not working
I recently migrated to Solr Cloud (4.0.0 from 3.6.0) and my auto suggest feature does not seem to be working. It is a typical implementation with a /suggest searchHandler defined on the config. Are there any changes I need to incorporate? Regards Jay
Re: Sor Cloud Autosuggest not working
I think distrib with components has to be set up a little differently - you might need to use shards.qt to point back to the same request handler for the sub searches. Just a guess - been a while since I've looked at spellcheck distrib support and I'm not 100% positive the suggest stuff is all distrib capable - though I think it should be.

- Mark

On Jan 8, 2013, at 10:06 AM, Jay Parashar jparas...@itscape.com wrote:
> I recently migrated to Solr Cloud (4.0.0 from 3.6.0) and my auto suggest feature does not seem to be working. It is a typical implementation with a /suggest searchHandler defined on the config. Are there any changes I need to incorporate? Regards Jay
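The shards.qt hint above could be tried with a request along these lines. This is a sketch that only builds the query string, assuming the handler is registered as `/suggest` as described in the original question; whether the suggest component then works across shards is exactly what the reply is unsure about, so treat it as an experiment rather than a fix.

```python
from urllib.parse import urlencode


def distributed_suggest_query(prefix, handler="/suggest"):
    """Build the query string for a suggest request on a sharded collection.

    shards.qt tells each shard which request handler to use for the
    sub-requests; here it points back at the same (assumed) handler.
    """
    return urlencode({
        "q": prefix,
        "spellcheck": "true",
        "shards.qt": handler,
        "wt": "json",
    })


if __name__ == "__main__":
    print("/suggest?" + distributed_suggest_query("gal"))
```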
RE: Sor Cloud Autosuggest not working
Thanks Mark!

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Tuesday, January 08, 2013 10:16 AM
To: solr-user@lucene.apache.org
Subject: Re: Sor Cloud Autosuggest not working

> I think distrib with components has to be setup a little differently - you might need to use shards.qt to point back to the same request handler for the sub searches. Just a guess - been a while since I've looked at spellcheck distrib support and I'm not 100% positive the suggest stuff is all distrib capable - though I think it should be.
> - Mark
>
> On Jan 8, 2013, at 10:06 AM, Jay Parashar jparas...@itscape.com wrote:
> > I recently migrated to Solr Cloud (4.0.0 from 3.6.0) and my auto suggest feature does not seem to be working. It is a typical implementation with a /suggest searchHandler defined on the config. Are there any changes I need to incorporate? Regards Jay
solr -autosuggest
Hi,

A few questions on Solr autosuggest below.

Q1) I tried using the index-based suggest functionality with Solr 3.6.1. Can I combine this with file-based boosting? Currently, when I specify both the index field and the sourceLocation, the file in the source location is not considered. Is there any way both can be used?

Q2) I saw this line, which says: "Currently implemented Lookups keep their data in memory, so unlike spellchecker data, this data is discarded on core reload and not available until you invoke the build command, either explicitly or implicitly during a commit." I am using the WFST lookup with index-based suggestion. I suppose that this applies only to file-based suggestion? Is this correct?

Q3) "If spellcheck.onlyMorePopular=true is selected: weights are treated as popularity score." Does this mean that it is based on the frequency of words, or is it based on ranking (tf * idf, etc.)?

Regards, Sujatha
Re: Custom Geocoder with Solr and Autosuggest
> My first decision was to divide SOLR into two cores, since I am already using SOLR as my search server. One core would be for the main search of the site and one for the geocoding.

Correct. And you can even use that location index/collection for location extraction from unstructured documents, i.e. if you don't have a separate field with geographical names in your corpus (or the location data is just not good enough compared to what can be mined from the documents).

> My second decision is to store the name data in a normalised state, some examples are shown below:
> London, England
> England
> Swindon, Wiltshire, England

Yes, you can add postcodes/outcodes there also. And I would add an additional field for the type: region/county/town/postcode/outcode.

> The third decision was to return “autosuggest” results, for example when the user types “Lond” I would like to suggest “London, England”. For this to work I think it makes sense to return up to 5 results via JSON based on relevancy and have these displayed under the search box.

Yeah, you might want to boost cities more than towns (I'm sure there are plenty of ambiguous terms), use some kind of geoip service, and add additional scoring factors.

> My fourth decision is that when the user actually hits the “search” button on the location field, SOLR is again queried and returns the most relevant result, including the co-ordinates which are stored.

You can also have special logic to decide whether you want to use spatial search or whether a simple textual match would be better. For instance, you have England in your example. It doesn't sound practical to return coordinates and use spatial search for this use case, right?

HTH, Alexey
Custom Geocoder with Solr and Autosuggest
Hi, I want to create a very simple geocoder for returning the co-ordinates of a place if a user enters a town or city. There seems to be very little information about doing it the way I suggest, so I hope I am on a good path.

My first decision was to divide SOLR into two cores, since I am already using SOLR as my search server. One core would be for the main search of the site and one for the geocoding.

My second decision is to store the name data in a normalised state, some examples are shown below:
London, England
England
Swindon, Wiltshire, England

The third decision was to return “autosuggest” results, for example when the user types “Lond” I would like to suggest “London, England”. For this to work I think it makes sense to return up to 5 results via JSON based on relevancy and have these displayed under the search box.

My fourth decision is that when the user actually hits the “search” button on the location field, SOLR is again queried and returns the most relevant result, including the co-ordinates which are stored.

Am I on a good path here? -- View this message in context: http://lucene.472066.n3.nabble.com/Custom-Geocoder-with-Solr-and-Autosuggest-tp4000791.html
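The third decision (up to five suggestions returned as JSON) can be mocked up without Solr to check the intended behavior. This is only a sketch under assumptions: the `PLACES` list stands in for the geocoding core, the coordinates are approximate and purely illustrative, list order stands in for relevancy, and `suggest_places` is a hypothetical helper.

```python
import json

# Stand-in for the geocoding core; coordinates are approximate and
# illustrative. "England" is a region, so no single point is stored here.
PLACES = [
    {"name": "London, England", "lat": 51.5074, "lon": -0.1278},
    {"name": "England", "lat": None, "lon": None},
    {"name": "Swindon, Wiltshire, England", "lat": 51.5558, "lon": -1.7797},
]


def suggest_places(typed, places=PLACES, limit=5):
    """Return a JSON array of at most `limit` places whose name starts with `typed`."""
    typed = typed.strip().lower()
    if not typed:
        return "[]"
    # List order stands in for relevancy in this sketch.
    matches = [p for p in places if p["name"].lower().startswith(typed)]
    return json.dumps(matches[:limit])


if __name__ == "__main__":
    print(suggest_places("Lond"))
```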
Solr Autosuggest
Hi, I have a question regarding Solr autosuggest. (If this is not the correct place to post, please suggest a better one.) I have implemented Solr autosuggest with the Suggester component. I read in a blog: "Currently implemented Lookups keep their data in memory, so unlike spellchecker data, this data is discarded on core reload and not available until you invoke the build command, either explicitly or implicitly during a commit." I have a master-slave setup. If I add new documents to the master and commit, the suggester is rebuilt (as I have set buildOnCommit=true). But when replication is done, the slave reloads the core. At that point, will autosuggestions for the newly added docs be affected? Thanks, Shri
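One way to make sure a slave has suggester data after a reload is to send it an explicit build request. A minimal sketch, assuming the suggester is the SpellCheckComponent-based one (built with spellcheck.build=true) and is served by a request handler at /suggest; the handler path, host name and core name below are placeholders, not taken from the original setup:

```python
from urllib.parse import urlencode

def suggester_build_url(base_url, handler="/suggest"):
    """Return a URL that asks Solr to rebuild the suggester's in-memory data.

    base_url and handler are placeholders; adjust them to your own setup.
    """
    # q can be any short prefix; the build flag is what triggers the rebuild.
    params = {"spellcheck": "true", "spellcheck.build": "true", "q": "a"}
    return base_url.rstrip("/") + handler + "?" + urlencode(params)

# Hypothetical slave core URL, for illustration only.
print(suggester_build_url("http://slave-host:8983/solr/core1"))
```

Such a request could be issued from whatever job or hook runs after replication completes on the slave.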
WFST with autosuggest/geo
Does anyone have the slides or sample code from: Building Query Auto-Completion Systems with Lucene 4.0 Presented by Sudarshan Gaikaiwari, Software Engineer,Yelp We want to implement WFST with GEO boosting. -- Bill Bell billnb...@gmail.com cell 720-256-8076
Re: Problems with AutoSuggest feature(Terms Components)
I'll have to defer that to one of the sharding experts.

Best
Erick

On Tue, Nov 22, 2011 at 1:28 PM, mechravi25 mechrav...@yahoo.co.in wrote:
> Hi Erick, Thanks for your reply. I would like to know all the options that can be given under the defaults section and how they can be overridden. Is there any documentation available in the Solr forum? We tried searching but did not succeed.
> My exact scenario is that I have one master core which has many underlying shard cores (distributed architecture). I want terms.limit to default to 10 in the underlying shard cores. When I hit the master core, it will in turn hit the underlying shard cores. At this point, the terms.limit which has been passed to the master core has to be passed to these underlying shard cores, overriding the default value set. Can you please suggest the definition of the terms component for the underlying shard cores?
> Regards, Sivaganesh
> -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-with-AutoSuggest-feature-Terms-Components-tp3512734p3528597.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problems with AutoSuggest feature(Terms Components)
Hi Erick, Thanks for your reply. I would like to know all the options that can be given under the defaults section and how they can be overridden. Is there any documentation available in the Solr forum? We tried searching but did not succeed. My exact scenario is that I have one master core which has many underlying shard cores (distributed architecture). I want terms.limit to default to 10 in the underlying shard cores. When I hit the master core, it will in turn hit the underlying shard cores. At this point, the terms.limit which has been passed to the master core has to be passed to these underlying shard cores, overriding the default value set. Can you please suggest the definition of the terms component for the underlying shard cores? Regards, Sivaganesh -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-with-AutoSuggest-feature-Terms-Components-tp3512734p3528597.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problems with AutoSuggest feature(Terms Components)
TermsComponent only reacts to what you send it. How are these requests getting to the TermsComponent? That's where you should look.

As far as terms.limit goes, your request handler for TermsComponent in solrconfig.xml has a defaults section. You can set whatever you want in there and then override it per request if you sometimes want other values.

Best
Erick

On Wed, Nov 16, 2011 at 9:17 AM, mechravi25 mechrav...@yahoo.co.in wrote:
> Hi,
>
> When I search for data I noticed two things:
>
> 1.) I noticed *terms.regex=.** in the logs, which does a blank search on terms and makes the query time much higher. Is there any way to overcome this? My actual query should go like the first one bolded, but instead it happens like in the second case (the 2nd text highlighted in bold).
>
> 2.) Also I noticed *terms.limit=-1*, which is very expensive as it asks Solr to return all the terms back. It should be set to 10 or 20 at most. Please provide some suggestions to set the same.
>
> Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
> INFO: [db] webapp=/solr path=/terms params={*terms.regex=ABC\+CCC\+lll*\+data.*&terms.regex.flag=case_insensitive&terms.fl=nameFacet} status=0 QTime=935
> Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
> INFO: [core2] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=ABC\+CCC\+lll\+data.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=842
> Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
> INFO: [db] webapp=/solr path=/terms params={terms.regex=ABC\+CCC\+lll\+data.*&terms.regex.flag=case_insensitive&terms.fl=nameFacet} status=0 QTime=927
> Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
> INFO: [core3] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=115
> Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
> INFO: [core1] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&*terms.regex=.**&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=106767
> Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
> INFO: [core4] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=106766
> Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
>
> -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-with-AutoSuggest-feature-Terms-Components-tp3512734p3512734.html Sent from the Solr - User mailing list archive at Nabble.com.
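The defaults-then-override behaviour Erick describes can be pictured outside Solr. The sketch below only illustrates the precedence rule (parameters sent with the request win over the handler's defaults); it is not Solr's own code, and the function name and default values are made up for the example:

```python
from urllib.parse import urlencode

# Hypothetical handler defaults, mirroring a defaults section in solrconfig.xml.
TERMS_DEFAULTS = {"terms": "true", "terms.limit": "10"}

def effective_terms_params(request_params, defaults=TERMS_DEFAULTS):
    """Merge handler defaults with request parameters; the request wins."""
    merged = dict(defaults)
    merged.update(request_params)
    return merged

# No terms.limit sent: the default of 10 applies.
print(urlencode(effective_terms_params({"terms.fl": "nameFacet"})))
# terms.limit sent explicitly: it overrides the default.
print(urlencode(effective_terms_params({"terms.fl": "nameFacet", "terms.limit": "20"})))
```

Whether a value set this way also reaches the shard sub-requests is a separate question, which the thread leaves to the sharding experts.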
Problems with AutoSuggest feature(Terms Components)
Hi,

When I search for data I noticed two things:

1.) I noticed *terms.regex=.** in the logs, which does a blank search on terms and makes the query time much higher. Is there any way to overcome this? My actual query should go like the first one bolded, but instead it happens like in the second case (the 2nd text highlighted in bold).

2.) Also I noticed *terms.limit=-1*, which is very expensive as it asks Solr to return all the terms back. It should be set to 10 or 20 at most. Please provide some suggestions to set the same.

Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/terms params={*terms.regex=ABC\+CCC\+lll*\+data.*&terms.regex.flag=case_insensitive&terms.fl=nameFacet} status=0 QTime=935
Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [core2] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=ABC\+CCC\+lll\+data.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=842
Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/terms params={terms.regex=ABC\+CCC\+lll\+data.*&terms.regex.flag=case_insensitive&terms.fl=nameFacet} status=0 QTime=927
Nov 14, 2011 2:04:08 PM org.apache.solr.core.SolrCore execute
INFO: [core3] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=115
Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&*terms.regex=.**&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=106767
Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute
INFO: [core4] webapp=/solr path=/terms params={terms.regex.flag=case_insensitive&shards.qt=/terms&terms.fl=nameFacet&terms=true&terms.limit=-1&terms.regex=.*&isShard=true&qt=/terms&wt=javabin&terms.sort=index&version=1} status=0 QTime=106766
Nov 14, 2011 2:05:55 PM org.apache.solr.core.SolrCore execute

-- View this message in context: http://lucene.472066.n3.nabble.com/Problems-with-AutoSuggest-feature-Terms-Components-tp3512734p3512734.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: autosuggest combination of data from documents and popular queries
hi Hoss, This helps. The only thing I am not sure about is the use of TermsComponent. As I understand it, TermsComponent allows sorting only on count|index, so I'm not sure how popularity could be used for sorting or boosting. Any thoughts on using TermsComponent with popularity? If this is possible then I don't think I would even need ngrams at all. Any suggestions? abhay -- View this message in context: http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3378874.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: autosuggest combination of data from documents and popular queries
anyone? How to sort for termscomponent? -- View this message in context: http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3381201.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: autosuggest combination of data from documents and popular queries
: If user starts typing m i wil show mango as suggestion. And other
: suggestions should come from the document title in index. So if I have a
: document in index with title Man .. so suggestions would be
: mango
: man
...
: Is this doable ? any options ?

It's totally doable, and you've already done the hard part by building up a database of the popular queries you want to seed the suggestions with, and building up a suggestion index where each document corresponds to a single suggestion.

But in order to also have suggestions come from the fields of your main index, you'll need to add them as individual documents to that same suggestion index. You could either get those field values from whatever original source you used, or you could crawl your own Solr index. If you want individual *terms* from the index to be added as suggestions, then the LukeRequestHandler or the TermsComponent would probably be the easiest way to extract them.

-Hoss
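The last step, turning extracted terms into documents for the suggestion index, might look roughly like this. It is a sketch that assumes the terms arrive as a flat [term, count, term, count, ...] list, which is how a TermsComponent JSON response is commonly laid out; the field names (suggestion, count, source) and the sample values are invented for illustration:

```python
def terms_to_suggestion_docs(flat_terms, source="index"):
    """Turn a flat [term, count, term, count, ...] list into suggestion docs."""
    terms = flat_terms[0::2]   # every even position is a term
    counts = flat_terms[1::2]  # every odd position is its document count
    return [
        {"suggestion": term, "count": count, "source": source}
        for term, count in zip(terms, counts)
    ]

# Hypothetical extract of a terms list for a title field.
sample = ["man", 3, "mango", 1]
docs = terms_to_suggestion_docs(sample)
print(docs)
```

Tagging each document with its source keeps index-derived suggestions distinguishable from the popular-query ones when both live in the same suggestion index.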
Re: autosuggest combination of data from documents and popular queries
hi hoss, This helps. But as I understand it, TermsComponent does not allow sorting on popularity, just count|index. Or am I missing something? If TermsComponent allowed custom sorting I wouldn't even have to use ngrams. Any thoughts? abhay -- View this message in context: http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3378096.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: autosuggest combination of data from documents and popular queries
hi

My requirement is: I have a list of popular search terms in a database:

searchterm | count
-----------|------
mango      | 100

Consider that I have only one term in that table, mango. I use edgengram and put that in the auto_complete field in the Solr index, along with the count. If the user starts typing m I will show mango as a suggestion. The other suggestions should come from the document titles in the index. So if I have a document in the index with the title Man, the suggestions would be:

mango
man

Now say the user starts typing sa. I don't have a popular search term for that, so it should show suggestions from the index data.

Is this doable? Any options?

-- View this message in context: http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3362049.html Sent from the Solr - User mailing list archive at Nabble.com.
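The desired ordering (popular search terms first, index titles as a fallback) can be written down as a small in-memory sketch. This only illustrates the ranking rule, not how Solr would evaluate it; the function name is hypothetical and the title "Salad" is added purely so the sa case has something to return:

```python
def suggest(prefix, popular_terms, titles, limit=5):
    """Return suggestions for prefix: popular search terms first, then titles."""
    prefix = prefix.lower()
    # Popular terms that match the prefix, highest search count first.
    popular = sorted(
        (term for term in popular_terms if term.lower().startswith(prefix)),
        key=lambda term: -popular_terms[term],
    )
    seen = {term.lower() for term in popular}
    # Matching titles that are not already covered by a popular term.
    from_titles = sorted(
        {title.lower() for title in titles if title.lower().startswith(prefix)} - seen
    )
    return (popular + from_titles)[:limit]

popular_terms = {"mango": 100}
titles = ["Man", "Salad"]
print(suggest("m", popular_terms, titles))   # ['mango', 'man']
print(suggest("sa", popular_terms, titles))  # ['salad']
```

In a Solr suggestion index the same effect would typically come from storing the count with each suggestion and sorting or boosting on it.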
Autosuggest best practice / feedback
Hi there,

I'm relatively new to Solr and have been playing around with it for a few weeks now. I've got a system set up that I'm currently quite happy with and that is returning some decent results (although there's always room for improvement). Just hoping to get some feedback on the setup.

Currently running 2 separate Solr engines: one tasked with storing products and their various info, the other storing previous site searches and being used for auto suggest functionality.

The auto suggest schema:

<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrement="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Stopwords is being used to filter out rude words from previous searches (is this the best way of doing things?).

Also looking at implementing a "Did you mean?" suggester, which will probably search against a whitespace-tokenized field of the same data rather than this one.

Any thoughts / feedback / comments / criticism / biscuits appreciated.

Cheers
Doug
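To see what the index-time analysis in the schema above produces, here is a rough Python imitation of the keyword tokenizer, lowercase filter and front edge n-grams with minGramSize=2 and maxGramSize=15. It is a sketch of the idea only, not the Lucene filter itself, and it leaves out the stop word step:

```python
def edge_ngrams(text, min_gram=2, max_gram=15):
    """Front edge n-grams of the whole lowercased input, treated as one token."""
    token = text.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

print(edge_ngrams("Mango"))  # ['ma', 'man', 'mang', 'mango']
```

Because the query analyzer does not n-gram, a typed prefix such as man is matched as a whole against these indexed grams. With a minimum gram size of 2, a single typed character has no gram to match.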
autosuggest combination of data from documents and popular queries
hi, we already have autosuggest working using Solr, based on popular search terms. We use the following approach: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ Now we also want to use the data indexed in Solr for autosuggest, with popular search terms having higher priority. Can we just copy the field containing the doc text to an auto suggest field which does edgengram analysis? Also, we have around 100K docs in the index, so would performance be a concern? Any help is really appreciated -- View this message in context: http://lucene.472066.n3.nabble.com/autosuggest-combination-of-data-from-documents-and-popular-queries-tp3360657p3360657.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: autosuggest combination of data from documents and popular queries
Hello,

> hi we already have autosuggest working using solr based on popular search terms.

Just terms or whole queries? I assume the latter.

> we use following approach.. http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> Now we want to use data indexed in solr also for autosuggest. with popular search terms to have higher priority. can we just copy field containing doc text to a auto suggest field which does edgengram analysis?

Something doesn't feel right here. Using data from the index for suggestions makes sense - we do that on http://search-lucene.com/ for example. Popular search terms having high priority and doc text, how does that work? Oh, you mean if you have a doc with a field body whose value is "foo bar baz" then, assuming the term bar is one of those popular search terms, you would want bar to come up as a suggestion? That's doable with some coding, yes, but I don't think this would create a very good search experience. Here are some thoughts:

* instead of suggesting popular query terms, suggest popular query strings
* suggest phrases such as query strings, titles from a title field if you have it, author names from an author name field if you have it, and other fields of that nature
* ...

> also we have around 100 K docs in index so performance would be a concern?

I think that depends on the implementation. For example, suggestions you see on search-lucene.com are powered by http://sematext.com/products/autocomplete/index.html and that solution works well with millions of suggestions.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
How can I create a good autosuggest list with phrases?
I'm at the point in my Solr deployment where I want to start using it for autosuggest, but I've run into a snag. Because the fields that I want to use for autosuggest are tokenized, I can only get single terms out of it. I would like to have it find common phrases that are between two and five words long, so that if someone starts typing ang their autosuggest list will include Angelina Jolie as well as possibly Brad Pitt and Angelina Jolie. My index is already quite large, so I do not want to add shingles. I tried to use the clustering component, but that will only give you halfway decent results if you make the rows= parameter absolutely huge and therefore things run very slowly. Also, it only works against stored fields, so I can only run it against the field where we retrieve captions, not the full description. It's impractical to get results based on an entire index, much less all seven shards. I'm OK with offline analysis to generate a list of suggestions, and I'm also OK with doing that analysis against the MySQL data source rather than Solr. I just need some pointers about what software and/or techniques I can use to generate a good list, and then some idea of how to configure Solr to use that list. Can anyone help? Thanks, Shawn
Re: How can I create a good autosuggest list with phrases?
We handled similar requirement in our product kitchendaily.com by creating a list of Search terms which were frequently searched over a period of time and then building auto-suggestion index from this data. The constant updates of this will allow you to support a well formed auto-suggest feature. This is a good and faster solution if you have application logs to start with and not very high volume of data. Or you can search Solr with the user entered data, which returns all the matching results and boost the data by field which will be used in AutoSuggest box, use top 5 items in the dynamic div. Hope it Helps. -param On 8/4/11 11:42 AM, Shawn Heisey s...@elyograg.org wrote: I'm at the point in my Solr deployment where I want to start using it for autosuggest, but I've run into a snag. Because the fields that I want to use for autosuggest are tokenized, I can only get single terms out of it. I would like to have it find common phrases that are between two and five words long, so that if someone starts typing ang their autosuggest list will include Angelina Jolie as well as possibly Brad Pitt and Angelina Jolie. My index is already quite large, so I do not want to add shingles. I tried to use the clustering component, but that will only give you halfway decent results if you make the rows= parameter absolutely huge and therefore things run very slowly. Also, it only works against stored fields, so I can only run it against the field where we retrieve captions, not the full description. It's impractical to get results based on an entire index, much less all seven shards. I'm OK with offline analysis to generate a list of suggestions, and I'm also OK with doing that analysis against the MySQL data source rather than Solr. I just need some pointers about what software and/or techniques I can use to generate a good list, and then some idea of how to configure Solr to use that list. Can anyone help? Thanks, Shawn
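The log-based approach described above lends itself to a small offline job. A minimal sketch, assuming the query log is already available as a plain list of query strings (the sample queries are made up) and keeping only phrases of two to five words, as asked for in the original question:

```python
from collections import Counter

def top_phrases(queries, min_words=2, max_words=5, min_count=2):
    """Count normalised queries and keep the frequent multi-word phrases."""
    # Lowercase and collapse whitespace so trivial variants count together.
    counts = Counter(" ".join(q.lower().split()) for q in queries)
    return [
        (phrase, n)
        for phrase, n in counts.most_common()
        if min_words <= len(phrase.split()) <= max_words and n >= min_count
    ]

log = ["Angelina Jolie", "angelina  jolie", "brad pitt", "Angelina Jolie", "jolie"]
print(top_phrases(log))  # [('angelina jolie', 3)]
```

The resulting (phrase, count) pairs could then be loaded into a dedicated suggestion index, with the count used for ranking.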
Re: How can I create a good autosuggest list with phrases?
On 8/4/2011 10:04 AM, Sethi, Parampreet wrote: We handled similar requirement in our product kitchendaily.com by creating a list of Search terms which were frequently searched over a period of time and then building auto-suggestion index from this data. The constant updates of this will allow you to support a well formed auto-suggest feature. This is a good and faster solution if you have application logs to start with and not very high volume of data. I do have some separate plans to include data from our query logs, but I'd also like to get data from the index itself, more than one term at a time. Thanks, Shawn
Re: Solr Autosuggest help
Hi, One more query. Currently in the autosuggestion Solr returns words like below:

googl
googl _
googl search
googl chrome
googl map

The last letter seems to be missing in the autosuggestion. I have sent the query as ?qt=/terms&terms=true&terms.fl=mydata&terms.lower=goog&terms.prefix=goog. The following is my schema.xml for the text field:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="1" catenateWords="0" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramIfNoNgram="true"/>
  </analyzer>
</fieldType>

Could anyone tell me what could be wrong and why the last letter goes missing? It occurs for a few words only. Suggestions for other words are fine.

One more question: how will the word 'sci/tech' be indexed in Solr? If I search on sci/tech it doesn't return any results.

Thanks in advance.

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2692651.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Autosuggest help
hi, We have found that 'EnglishPorterFilterFactory' causes that issue. I believe that is used for stemming words. Once we commented that factory, it works fine. And another thing, currently I am checking about how the word 'sci/tech' will be indexed in solr. As mentioned in my previous email, if I search on sci/tech it wont send any results. But solr has the terms as sci/tech. When I search on other terms which also contain sci/tech, it returns both the words. Please let me know, if you have any idea regarding that.. If I came to know I will update this thread. thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2693601.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Autosuggest help
Rahul, Go to your Solr Admin Analysis page, enter sci/tech, check appropriate check boxes, and see how sci/tech gets analyzed. This will lead you in the right direction. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: rahul asharud...@gmail.com To: solr-user@lucene.apache.org Sent: Thu, March 17, 2011 10:12:27 AM Subject: Re: Solr Autosuggest help hi, We have found that 'EnglishPorterFilterFactory' causes that issue. I believe that is used for stemming words. Once we commented that factory, it works fine. And another thing, currently I am checking about how the word 'sci/tech' will be indexed in solr. As mentioned in my previous email, if I search on sci/tech it wont send any results. But solr has the terms as sci/tech. When I search on other terms which also contain sci/tech, it returns both the words. Please let me know, if you have any idea regarding that.. If I came to know I will update this thread. thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2693601.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Autosuggest help
> I have added the following line in both the section and in section in schema.xml.
> <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramIfNoNgram="true"/>
> And reindex my content. However, if I query solr for the multi-word search terms suggestion, it only sends the single word suggestions.
> http://localhost:8080/solr/mydata/select?qt=/terms&terms=true&terms.fl=content&terms.lower=java&terms.prefix=java&terms.lower.incl=false&indent=true
> It won't return the words like 'java final', it only returns words like javadoc, javascript.. Could any one update me how to correct this.. or what I am missing..

What happens when you add terms.limit=-1 to your search URL? Or when you use java plus one blank character in terms.prefix?

terms.prefix=java &indent=true

Can you see multi-word terms in the admin/schema.jsp page?
Re: Solr Autosuggest help
hi.. thanks for your replies.. It seems I mistakenly put ShingleFilterFactory in another field. When I put the factory in correct field it works fine now. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2645780.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Autosuggest help
Hi, I have added the following line in both the section and in section in schema.xml.

<filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true" outputUnigramIfNoNgram="true"/>

And reindexed my content. However, if I query Solr for multi-word search term suggestions, it only sends single word suggestions.

http://localhost:8080/solr/mydata/select?qt=/terms&terms=true&terms.fl=content&terms.lower=java&terms.prefix=java&terms.lower.incl=false&indent=true

It won't return words like 'java final', it only returns words like javadoc, javascript. Could anyone tell me how to correct this, or what I am missing?

thanks,

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2645316.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr Autosuggest help
Hi, I am using Solr (1.4.1) AutoSuggest feature using termsComponent. Currently, if I type 'goo' means, Solr suggest words like 'google'. But I would like to receive suggestions like 'google, google alerts, ..' . ie, suggestions with single and multiple terms. Not sure, whether I need to use edgengrams for that. for eg, indexing google like 'go', 'oo', 'og', ... . But I think I don't need this, Since I don't want partial search. Please let me know if there is any way to do multiple word suggestions . Thanks in Advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosuggest-help-tp2580944p2580944.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr Autosuggest help
> I am using Solr (1.4.1) AutoSuggest feature using termsComponent. Currently, if I type 'goo' means, Solr suggest words like 'google'. But I would like to receive suggestions like 'google, google alerts, ..' . ie, suggestions with single and multiple terms. Not sure, whether I need to use edgengrams for that. for eg, indexing google like 'go', 'oo', 'og', ... . But I think I don't need this, Since I don't want partial search. Please let me know if there is any way to do multiple word suggestions .

If you stick with TermsComponent, you need to add ShingleFilterFactory to your index analyzer chain for that. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
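Why shingles help here: they put multi-word terms into the index, so a prefix lookup can return them alongside single words. A rough Python imitation of shingling with a maximum size of 2 and unigrams kept (a sketch of the idea, not the Lucene filter; the sample text is invented):

```python
def shingles(text, max_shingle_size=2, output_unigrams=True):
    """Whitespace-tokenise text and return word n-grams (shingles)."""
    tokens = text.lower().split()
    start = 1 if output_unigrams else 2
    result = []
    for i in range(len(tokens)):
        for size in range(start, max_shingle_size + 1):
            if i + size <= len(tokens):
                result.append(" ".join(tokens[i:i + size]))
    return result

terms = shingles("Google Alerts help")
print(terms)
# A prefix lookup over these terms now finds single and two-word suggestions.
print([t for t in terms if t.startswith("goo")])  # ['google', 'google alerts']
```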
Autosuggest terms which GOOGLE uses?
How does Google select its autosuggest terms? Does Google use users' queries from log files to suggest only those terms? - Kumar Anurag -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039078.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Autosuggest terms which GOOGLE uses?
Kind of: their suggestions are based on users' queries, with some filtering. You can have a little read here: http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=106230 They perform a little filtering to remove offending content such as "hate speech, violence and pornography" (quoting the page).

You can also have a look at this slideshow: http://www.slideshare.net/sturlese/use-ofsolrattrovitclassifiedads-marcsturlese . You'll see how they build their suggest service using a dedicated Solr instance.

Hope this helps ;-)

-- Tanguy

2010/12/8 Anurag anurag.it.jo...@gmail.com:
> How Google selects the autosuggest terms? Is that Google uses Userrs Queries from Log files to suggest only those terms?
> - Kumar Anurag
> -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039078.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Autosuggest terms which GOOGLE uses?
Thanks a lot!! What if I want to index query terms from log files? Is it possible? And then I want to do autosuggest queries on all those terms using termsComponent. Till now my autosuggest options are like q.prefix= or q.suffix=, which match the terms available in the documents. - Kumar Anurag -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-terms-which-GOOGLE-uses-tp2039078p2039307.html Sent from the Solr - User mailing list archive at Nabble.com.
facet+shingle in autosuggest
Hi,

I am using a facet.prefix search with shingles in my autosuggest:

<fieldType name="shingle" class="solr.TextField" positionIncrementGap="100" stored="false" multiValued="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true" outputUnigramIfNoNgram="false"/>
  </analyzer>
</fieldType>

Now I would like to prevent stop words from appearing in the suggestions:

<lst name="autosuggest_shingle">
  <int name="member states">52</int>
  <int name="member states experiencing">6</int>
  <int name="member states in">6</int>
  <int name="member states the">5</int>
  <int name="member states to">25</int>
  <int name="member states with">7</int>
</lst>

Here I would like to filter out the last 4 suggestions really. Is there a way I can sensibly bring in a stop word filter here? Actually in theory the stop words could appear as the first or second word as well. So I guess when producing shingles I want to skip any stop word from being part of any shingle.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org
Re: facet+shingle in autosuggest
I don't know all the implications here, but can't you just insert the StopwordFilterFactory before the ShingleFilterFactory and turn it loose? Best Erick On Thu, Nov 11, 2010 at 4:02 PM, Lukas Kahwe Smith m...@pooteeweet.orgwrote: Hi, I am using a facet.prefix search with shingle's in my autosuggest: fieldType name=shingle class=solr.TextField positionIncrementGap=100 stored=false multiValued=true analyzer tokenizer class=solr.StandardTokenizerFactory / filter class=solr.LowerCaseFilterFactory / filter class=solr.RemoveDuplicatesTokenFilterFactory/ filter class=solr.ShingleFilterFactory maxShingleSize=3 outputUnigrams=true outputUnigramIfNoNgram=false / /analyzer /fieldType Now I would like to prevent stop words to appear in the suggestions: lst name=autosuggest_shingle int name=member states52/int int name=member states experiencing6/int int name=member states in6/int int name=member states the5/int int name=member states to25/int int name=member states with7/int /lst Here I would like to filter out the last 4 suggestions really. Is there a way I can sensibly bring in a stop word filter here? Actually in theory the stop words could appear as the first or second word as well. So I guess when producing shingle's I want to skip any stop word from being part of any shingle. regards, Lukas Kahwe Smith m...@pooteeweet.org
Re: facet+shingle in autosuggest
On 11.11.2010, at 17:42, Erick Erickson wrote:

> I don't know all the implications here, but can't you just insert the StopFilterFactory before the ShingleFilterFactory and turn it loose?

I haven't tried this, but I suspect that I would then get into trouble with phrases like "united states of america". It would then generate the shingle "united states america", which in turn wouldn't produce a proper phrase search string.

One option, of course, would be to restrict the shingles to 2 words; then using the stop word filter would work as expected.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org
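The concern about "united states of america" can be reproduced with a small simulation. This is a simplified sketch in Python, not the behaviour of Solr's filters: it drops stop words and then shingles whatever is left, ignoring token positions. The real filter chain may record the gap left by a removed stop word, so actual output can differ. The stop word list and function names are assumptions.

```python
# Assumed sample stop word list.
STOP_WORDS = {"of", "the", "in", "to", "with"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop stop words and close the gap (token positions are NOT tracked)."""
    return [t for t in tokens if t not in stop_words]


def shingles(tokens, min_size, max_size):
    """Return all runs of min_size..max_size adjacent tokens."""
    return [" ".join(tokens[i:i + n])
            for n in range(min_size, max_size + 1)
            for i in range(len(tokens) - n + 1)]


text = "united states of america"
tokens = remove_stop_words(text.split())
candidates = shingles(tokens, 2, 3)

# A shingle is "broken" when it is not a contiguous phrase of the original
# text, i.e. it would not match as a phrase query. A plain substring test is
# enough for this example.
broken = [s for s in candidates if s not in text]
print(candidates)
print(broken)
```

In this simplified model the three-word shingle "united states america" is indeed not a phrase of the original text. Note that a two-word shingle can also bridge the removed word here ("states america"), so whether the two-word restriction behaves as hoped probably depends on how the real filters handle the position gap.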
phrase query with autosuggest (SOLR-1316)
It seemed like SOLR-1316 was a little too long to continue the conversation.

Is there support for quotes indicating a phrase query? For example, my autosuggest query for "mike sha" ought to return "mike shaffer", "mike sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting in a collated result "mike r meyer shaw".

Cheers,
Mike
RE: phrase query with autosuggest (SOLR-1316)
My simple but effective solution to that problem was to replace the white spaces in the items you index for autosuggest with some special character; then your wildcarding will work with the whole phrase as you desire.

Index: mike_shaffer
Query: mike_sha*

-----Original Message-----
From: mike anderson [mailto:saidthero...@gmail.com]
Sent: Wednesday, October 06, 2010 7:33 AM
To: solr-user@lucene.apache.org
Subject: phrase query with autosuggest (SOLR-1316)

It seemed like SOLR-1316 was a little too long to continue the conversation.

Is there support for quotes indicating a phrase query? For example, my autosuggest query for "mike sha" ought to return "mike shaffer", "mike sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting in a collated result "mike r meyer shaw".

Cheers,
Mike
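The whitespace-replacement trick above can be sketched as follows. This is only an illustration in Python with an in-memory set standing in for the index. The helper names, the lowercasing step, and the extra sample entries ("Mike R Meyer", "Shaw") are assumptions, not part of the original suggestion.

```python
import re


def to_suggest_key(phrase, joiner="_"):
    """Collapse each run of whitespace into one joiner character so the whole
    phrase becomes a single term. Lowercasing is an added assumption."""
    return re.sub(r"\s+", joiner, phrase.strip().lower())


def to_wildcard_query(user_input, joiner="_"):
    """Build the prefix (wildcard) query string for what the user has typed."""
    return to_suggest_key(user_input, joiner) + "*"


def suggest(user_input, indexed_keys):
    """Simulate the prefix match against the indexed single-term keys."""
    prefix = to_suggest_key(user_input)
    return sorted(key for key in indexed_keys if key.startswith(prefix))


# Made-up sample data for illustration.
indexed = {to_suggest_key(name)
           for name in ["Mike Shaffer", "Mike Sharp", "Mike R Meyer", "Shaw"]}

print(to_wildcard_query("mike sha"))
print(suggest("mike sha", indexed))
```

The same normalization has to be applied at index time and at query time; otherwise the prefix will not line up with the stored terms. The joiner character should be one that cannot occur in the indexed values.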
Re: phrase query with autosuggest (SOLR-1316)
If you use Chantal's suggestion from an earlier thread, involving facets and tokenized fields but without the token handling, I think it will work. (But that solution requires only one auto-suggest value per document.)

There are a bunch of ways people have figured out to do auto-suggest without putting it in an entirely separate Solr core. They all have their issues, strengths, and weaknesses, including the weakness of sometimes being kind of confusing to implement. I don't think anyone has come up with a general-purpose, works-for-everything, isn't-confusing solution yet.

Robert Petersen wrote:

> My simple but effective solution to that problem was to replace the white spaces in the items you index for autosuggest with some special character; then your wildcarding will work with the whole phrase as you desire.
>
> Index: mike_shaffer
> Query: mike_sha*
>
> -----Original Message-----
> From: mike anderson [mailto:saidthero...@gmail.com]
> Sent: Wednesday, October 06, 2010 7:33 AM
> To: solr-user@lucene.apache.org
> Subject: phrase query with autosuggest (SOLR-1316)
>
> It seemed like SOLR-1316 was a little too long to continue the conversation.
>
> Is there support for quotes indicating a phrase query? For example, my autosuggest query for "mike sha" ought to return "mike shaffer", "mike sharp", etc. Instead I get suggestions for "mike" and for "sha", resulting in a collated result "mike r meyer shaw".
>
> Cheers,
> Mike
Re: Autosuggest with inner phrases
Or, plug, this: http://www.sematext.com/products/autocomplete/index.html , which happens to use the same "bass" examples as the original poster. :)

You can see this Autosuggest in action on http://search-lucene.com/ .

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

----- Original Message -----
From: Jason Rutherglen jason.rutherg...@gmail.com
To: solr-user@lucene.apache.org
Sent: Sat, October 2, 2010 3:40:52 PM
Subject: Re: Autosuggest with inner phrases

> This's what yer lookin' for:
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
>
> On Sat, Oct 2, 2010 at 3:14 AM, sivaprasad sivaprasa...@echidnainc.com wrote:
>
> > Hi,
> >
> > I implemented auto suggest using the terms component, but the suggestions only match from the start of the word. I want inner phrases as well. For example, if I type "bass", Auto-Complete should offer suggestions that include "bass fishing" or "bass guitar", and even "sea bass" (note how "bass" is not necessarily the first word).
> >
> > How can I achieve this using Solr's terms component?
> >
> > Regards,
> > Siva
> >
> > --
> > View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-with-inner-phrases-tp1619326p1619326.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
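The behaviour asked for in the quoted question (typing "bass" should also find "sea bass") is what per-word edge n-grams give you, which is the technique named in the title of the linked blog post. Below is a minimal in-memory sketch in Python, not a Solr configuration and not the terms component. The function names and the sample phrase "sea salt" are assumptions made for illustration.

```python
from collections import defaultdict


def edge_ngrams(word, min_len=1):
    """Return every prefix of the word: 'bass' -> ['b', 'ba', 'bas', 'bass']."""
    return [word[:n] for n in range(min_len, len(word) + 1)]


def build_index(phrases):
    """Map each edge n-gram of each word to the full phrases that contain it."""
    index = defaultdict(set)
    for phrase in phrases:
        for word in phrase.lower().split():
            for gram in edge_ngrams(word):
                index[gram].add(phrase)
    return index


def suggest(index, typed):
    """Look up the typed text as one exact key; no wildcard is needed."""
    return sorted(index.get(typed.lower(), set()))


# Sample phrases; "sea salt" is made up to show a non-matching entry.
index = build_index(["bass fishing", "bass guitar", "sea bass", "sea salt"])
print(suggest(index, "bass"))
print(suggest(index, "se"))
```

Because every word contributes its own prefixes, a match no longer has to start at the beginning of the phrase. The trade-off is a larger index, since each word is stored once per prefix length. This sketch handles a single typed word only; multi-word input would need additional handling.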
Re: Autosuggest with inner phrases
I had the same question a few days back. You can look at the solution suggested by Chantal in this link:
http://www.lucidimagination.com/search/document/9bbce5302bd3940e/autocomplete_match_words_anywhere_in_the_token#cec7133bbaf5b49c

On Sat, Oct 2, 2010 at 3:44 PM, sivaprasad sivaprasa...@echidnainc.com wrote:

> Hi,
>
> I implemented auto suggest using the terms component, but the suggestions only match from the start of the word. I want inner phrases as well. For example, if I type "bass", Auto-Complete should offer suggestions that include "bass fishing" or "bass guitar", and even "sea bass" (note how "bass" is not necessarily the first word).
>
> How can I achieve this using Solr's terms component?
>
> Regards,
> Siva

--
Arun
Re: Autosuggest with inner phrases
Hi,

This thread may be useful:
http://www.lucidimagination.com/search/document/9edc01a90a195336/enhancing_auto_complete#d1340d7715162608

Regards,
Bhavnik

On 10/3/2010 11:51 PM, Arunkumar Ayyavu wrote:

> I had the same question a few days back. You can look at the solution suggested by Chantal in this link:
> http://www.lucidimagination.com/search/document/9bbce5302bd3940e/autocomplete_match_words_anywhere_in_the_token#cec7133bbaf5b49c
>
> On Sat, Oct 2, 2010 at 3:44 PM, sivaprasad sivaprasa...@echidnainc.com wrote:
>
> > Hi,
> >
> > I implemented auto suggest using the terms component, but the suggestions only match from the start of the word. I want inner phrases as well. For example, if I type "bass", Auto-Complete should offer suggestions that include "bass fishing" or "bass guitar", and even "sea bass" (note how "bass" is not necessarily the first word).
> >
> > How can I achieve this using Solr's terms component?
> >
> > Regards,
> > Siva