Re: Solr suggest is related to second letter, not to initial letter
Yes, I did that, but it doesn't work. New example:

TSTLookup

doc 1: shoe adidas 2 hiking
doc 2: galaxy samsung s5 phone
doc 3: shakeology sample packets

http://localhost:8983/solr/solr/suggest?q=samsung+hi

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="samsung">
        <int name="numFound">2</int>
        <int name="startOffset">0</int>
        <int name="endOffset">7</int>
        <arr name="suggestion">
          <str>samsung s5</str>
          <str>samsung s5 phone</str>
        </arr>
      </lst>
      <lst name="hi">
        <int name="numFound">1</int>
        <int name="startOffset">8</int>
        <int name="endOffset">10</int>
        <arr name="suggestion">
          <str>hiking</str>
        </arr>
      </lst>
      <lst name="collation">
        <str name="collationQuery">(samsung s5) hiking</str>
        <int name="hits">0</int>
        <lst name="misspellingsAndCorrections">
          <str name="samsung">samsung s5</str>
          <str name="hi">hiking</str>
        </lst>
      </lst>
      <lst name="collation">
        <str name="collationQuery">(samsung s5 phone) hiking</str>
        <int name="hits">0</int>
        <lst name="misspellingsAndCorrections">
          <str name="samsung">samsung s5 phone</str>
          <str name="hi">hiking</str>
        </lst>
      </lst>
      <lst name="collation">
        <str name="collationQuery">samsung hiking</str>
        <int name="hits">0</int>
        <lst name="misspellingsAndCorrections">
          <str name="samsung">samsung</str>
          <str name="hi">hiking</str>
        </lst>
      </lst>
    </lst>
  </lst>
</response>

<field name="suggestions" type="suggest_term" indexed="true" multiValued="true" stored="false" omitNorms="true"/>

<fieldType name="suggest_term" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-PunctuationToSpace.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="4" outputUnigrams="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-PunctuationToSpace.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="0"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="4" outputUnigrams="true"/>
    <filter class="solr.ApostropheFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">suggestions</str> <!-- the indexed field to derive suggestions from -->
    <float name="threshold">0.1</float>
    <str name="buildOnCommit">true</str>
  </lst>
  <str name="queryAnalyzerFieldType">suggest_term</str>
</searchComponent>

<!-- auto-complete -->
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.build">false</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollations">10</str>
    <str name="spellcheck.maxCollationTries">100</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

---

FreeTextLookupFactory

doc 1: shoe adidas 2 hiking
doc 2: galaxy samsung s5 phone
doc 3: shakeology sample packets

http://localhost:8983/solr/solr/suggest?q=samsung+hi

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">3</int>
  </lst>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="samsung">
        <int name="numFound">9</int>
        <int name="startOffset">0</int>
        <int name="endOffset">7</int>
        <arr name="suggestion">
          <str>samsung s5</str>
          <str>samsung s5 phone</str>
          <str>samsung s5 phone galaxy</str>
          <str>samsung s5 phone samsung</str>
          <str>samsung s5 phone samsung samsungs5phone</str>
          <str>samsung s5 phone samsung samsungs5phone s5</str>
          <str>samsung s5
Re: Solr suggest is related to second letter, not to initial letter
On 02/17/2015 03:46 AM, Volkan Altan wrote:
> First of all thank you for your answer.

You're welcome - thanks for sending a more complete example of your problem and expected behavior.

> I don't want to use KeywordTokenizer. Because, as long as the compound words written by the user are available in any document, I am able to receive a conclusion. I just don't want "q=galaxy + samsung" to appear; because it is an inappropriate suggestion and it doesn't work.
>
> Many Thanks Ahead of Time!

Did you try the other suggestions in my earlier reply?

-Mike
Re: Solr suggest is related to second letter, not to initial letter
First of all, thank you for your answer.

Example:

doc 1 suggest_field: galaxy samsung s5 phone
doc 2 suggest_field: shoe adidas 2 hiking

http://localhost:8983/solr/solr/suggest?q=galaxy+s

The result I am expecting is the one shown below. "galaxy shoe" is not supposed to appear, but unfortunately it does appear at the moment.

<lst name="collation">
  <str name="collationQuery">galaxy samsung</str>
  <int name="hits">0</int>
  <lst name="misspellingsAndCorrections">
    <str name="galaxy">galaxy</str>
    <str name="samsung">samsung</str>
  </lst>
</lst>
<lst name="collation">
  <str name="collationQuery">galaxy s5</str>
  <int name="hits">0</int>
  <lst name="misspellingsAndCorrections">
    <str name="galaxy">galaxy</str>
    <str name="s5">s5</str>
  </lst>
</lst>

I don't want to use KeywordTokenizer, because I should be able to get a result as long as the words entered by the user occur together in some document. I just don't want "q=galaxy + samsung" to appear, because it is an inappropriate suggestion and it doesn't work.

Many thanks in advance!
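The behaviour asked for here (complete the last word only from documents that also contain the earlier words) can be modelled with a small Python sketch. This is a toy model under stated assumptions, not Solr code or a Solr API: the function name `suggest_same_document` is hypothetical, and plain whitespace splitting stands in for the real analyzer chain.

```python
def suggest_same_document(query, docs):
    """Complete the last word using only documents that contain every earlier word."""
    words = query.lower().split()
    if not words:
        return []
    *head, prefix = words
    completions = set()
    for doc in docs:
        tokens = doc.lower().split()
        # Only consider documents in which all previously typed words occur.
        if all(word in tokens for word in head):
            completions.update(
                token for token in tokens
                if token.startswith(prefix) and token not in head
            )
    return [" ".join(head + [completion]) for completion in sorted(completions)]


# The two example documents from this message.
DOCS = ["galaxy samsung s5 phone", "shoe adidas 2 hiking"]

if __name__ == "__main__":
    print(suggest_same_document("galaxy s", DOCS))
```

For the two example documents, "galaxy s" yields "galaxy s5" and "galaxy samsung", and "galaxy shoe" is not produced because "shoe" never occurs in a document together with "galaxy".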
My settings:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">suggestions</str>
    <float name="threshold">0.1</float>
    <str name="buildOnCommit">true</str>
  </lst>
  <str name="queryAnalyzerFieldType">suggest_term</str>
</searchComponent>

<!-- auto-complete -->
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.build">false</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollations">10</str>
    <str name="spellcheck.maxCollationTries">100</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

<fieldType name="suggest_term" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-PunctuationToSpace.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-PunctuationToSpace.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ApostropheFilterFactory"/>
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>
Re: Solr suggest is related to second letter, not to initial letter
StandardTokenizer splits your text into tokens, and the suggester suggests tokens independently. It sounds as if you want the suggestions to be based on the entire text (not just the current word), and that only adjacent words in the original should appear as suggestions. Assuming that's what you are after (it's a little hard to tell from your e-mail -- you might want to clarify by providing a few examples of how you *do* want it to work instead of just examples of how you *don't* want it to work), you have a couple of choices:

1) Don't use StandardTokenizer; use KeywordTokenizer instead. This will preserve the entire original text and suggest complete texts, rather than words.
2) Maybe consider using a shingle filter along with the standard tokenizer, so that your tokens include multi-word shingles.
3) Use a suggester with better support for a statistical language model, like this one: http://blog.mikemccandless.com/2014/01/finding-long-tail-suggestions-using.html. To do this you will probably need to do some Java programming, since it isn't well integrated into Solr.

-Mike
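As a rough illustration of the difference between suggesting tokens independently and suggesting from multi-word shingles (option 2), here is a toy Python model. It is only a sketch under stated assumptions, not Solr code: the function names (`suggest_per_token`, `suggest_with_shingles`) are hypothetical, the documents are the three examples posted in this thread, and plain whitespace splitting stands in for the real analyzer chain.

```python
DOCS = [
    "shoe adidas 2 hiking",
    "galaxy samsung s5 phone",
    "shakeology sample packets",
]


def unigrams(text):
    """Split on whitespace; a stand-in for a real tokenizer."""
    return text.lower().split()


def shingles(text, min_size=2, max_size=4):
    """Build word n-grams from adjacent tokens, similar in spirit to a shingle filter."""
    words = unigrams(text)
    result = []
    for size in range(min_size, max_size + 1):
        for start in range(len(words) - size + 1):
            result.append(" ".join(words[start:start + size]))
    return result


def suggest_per_token(query, docs):
    """Complete only the last word from a dictionary of all single tokens."""
    words = query.lower().split()
    if not words:
        return []
    *head, prefix = words
    vocabulary = sorted({word for doc in docs for word in unigrams(doc)})
    return [" ".join(head + [word]) for word in vocabulary if word.startswith(prefix)]


def suggest_with_shingles(query, docs):
    """Complete the whole query from shingles, so the words must be adjacent in one document."""
    prefix = " ".join(query.lower().split())
    if not prefix:
        return []
    vocabulary = sorted({shingle for doc in docs for shingle in shingles(doc)})
    return [shingle for shingle in vocabulary if shingle.startswith(prefix)]


if __name__ == "__main__":
    print(suggest_per_token("adidas s", DOCS))
    print(suggest_with_shingles("adidas s", DOCS))
    print(suggest_with_shingles("galaxy s", DOCS))
```

In this model, the per-token lookup for "adidas s" includes "adidas samsung" even though the two words never share a document, while the shingle lookup returns nothing for "adidas s" and only phrases that are adjacent in one document for "galaxy s".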
Re: Solr suggest is related to second letter, not to initial letter
Any idea?
Solr suggest is related to second letter, not to initial letter
Hello everyone,

I want the Solr suggester to return suggestions for the second word that are related to the first word that was entered. At the moment, however, the second word is suggested independently of the first word, just as the first word is.

Example:

http://localhost:8983/solr/solr/suggest?q=facet_suggest_data:"adidas+s"
(http://localhost:8983/solr/vitringez/suggest?q=facet_suggest_data:%22adidas+s%22)

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="s">
        <int name="numFound">1</int>
        <int name="startOffset">27</int>
        <int name="endOffset">28</int>
        <arr name="suggestion">
          <str>samsung</str>
        </arr>
      </lst>
      <lst name="collation">
        <str name="collationQuery">facet_suggest_data:adidas samsung</str>
        <int name="hits">0</int>
        <lst name="misspellingsAndCorrections">
          <str name="adidas">adidas</str>
          <str name="s">samsung</str>
        </lst>
      </lst>
    </lst>
  </lst>
</response>

The terms "adidas" and "samsung" occur in separate documents; there is no document that contains both of them. How can I solve this problem?

schema.xml:

<fieldType name="suggestions_type" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ApostropheFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ApostropheFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="facet_suggest_data" type="suggestions_type" indexed="true" multiValued="true" stored="false" omitNorms="true"/>

Best