Re: Switch from Sphinx to Solr - some basics please
"I have for example jobs form country A, jobs from country B and so on until 100 countries. I need to have for each country an separate index, because if someone search for jobs in country A I need to query only the index for country A. How to solve this problem?" Ah! Will the text be in different languages? You probably want a separate index for each language base. Solr/Lucene has very good facilities for text in different languages. In general, this will be a learning experience. After you do the deployment, you will discover problems in search design, deployment design, scaling architecture, and operations tools. You should plan to do two Solr deployments. On Wed, Aug 15, 2012 at 8:27 AM, Walter Underwood wrote: > These do require some Sphinx knowledge. I could answer them on StackOverflow > because I converted Chegg from Sphinx to Solr this year. > > As I said there, read about Solr cores. They are independent search > configurations and indexes within one Solr server: > http://wiki.apache.org/solr/CoreAdmin > > For your jobs example, I would use filter queries to limit the search to a > single country. Filter them to country:us or country:de or country:fr and you > will only get result from that country. > > Solr does not use the term "rotate" for indexes. You can delete with a query, > so you could delete all the jobs for one country, reindex those, then commit. > > Separate cores are best when you have different kinds of data. At Chegg, we > search books and college courses. Those are in different cores and have very > different schemas. > > wunder > > On Aug 15, 2012, at 5:11 AM, nnikolay wrote: > >> HI iorixxx, thanks for the reply. >> >> Well you don't need sphinx knowledge to answer my questions. >> >> I have write you what I want: >> >> 1. I need to have 2 seprate indexes. In Stackoverlfow I became the answer I >> need to start 2 cores for example. How many cores can I run for solr? 
I have >> for example over 100 different indexes, that they should seeing as separate >> data. This indexes should be reindexed in different times and the data of >> them should not mixed with each other. >> >> You need to understand follow situation: >> >> I have for example jobs form country A, jobs from country B and so on until >> 100 countries. I need to have for each country an separate index, because if >> someone search for jobs in country A I need to query only the index for >> country A. How to solve this problem? >> >> How to do this? Is there are good tutorial? In the wiki of solr, it is very >> bad explained. >> >> 2. When I become new data for example: Should I rotate the whole index >> again, or can I include the new rows and delete the old rows. What is your >> suggestion? >> >> Thanks >> Nik >> >> >> >> -- >> View this message in context: >> http://lucene.472066.n3.nabble.com/Switch-from-Sphinx-to-Solr-some-basics-please-tp4001234p4001379.html >> Sent from the Solr - User mailing list archive at Nabble.com. > > -- > Walter Underwood > wun...@wunderwood.org > > > -- Lance Norskog goks...@gmail.com
Re: Switch from Sphinx to Solr - some basics please
These do require some Sphinx knowledge. I could answer them on StackOverflow because I converted Chegg from Sphinx to Solr this year.

As I said there, read about Solr cores. They are independent search configurations and indexes within one Solr server: http://wiki.apache.org/solr/CoreAdmin

For your jobs example, I would use filter queries to limit the search to a single country. Filter them to country:us or country:de or country:fr and you will only get results from that country.

Solr does not use the term "rotate" for indexes. You can delete with a query, so you could delete all the jobs for one country, reindex those, then commit.

Separate cores are best when you have different kinds of data. At Chegg, we search books and college courses. Those are in different cores and have very different schemas.

wunder

On Aug 15, 2012, at 5:11 AM, nnikolay wrote:
> HI iorixxx, thanks for the reply.
>
> Well you don't need sphinx knowledge to answer my questions.
>
> I have write you what I want:
>
> 1. I need to have 2 seprate indexes. In Stackoverlfow I became the answer I
> need to start 2 cores for example. How many cores can I run for solr? I have
> for example over 100 different indexes, that they should seeing as separate
> data. This indexes should be reindexed in different times and the data of
> them should not mixed with each other.
>
> You need to understand follow situation:
>
> I have for example jobs form country A, jobs from country B and so on until
> 100 countries. I need to have for each country an separate index, because if
> someone search for jobs in country A I need to query only the index for
> country A. How to solve this problem?
>
> How to do this? Is there are good tutorial? In the wiki of solr, it is very
> bad explained.
>
> 2. When I become new data for example: Should I rotate the whole index
> again, or can I include the new rows and delete the old rows. What is your
> suggestion?
>
> Thanks
> Nik

--
Walter Underwood
wun...@wunderwood.org
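The filter-query and delete-by-query approach described in this message might be sketched roughly as follows. The Python below only builds the request URL and the XML update bodies; it does not contact a server. The base URL, the `country` field name, and the helper names are assumptions made for illustration, not part of the original advice.

```python
from urllib.parse import urlencode

# Assumed location of a single Solr index; adjust to your own deployment.
SOLR_BASE = "http://localhost:8983/solr"


def country_search_url(user_query, country_code):
    """Build a /select URL that restricts results to one country via fq."""
    params = {
        "q": user_query,
        "fq": f"country:{country_code}",  # assumes the schema has a 'country' field
        "wt": "json",
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


def reindex_messages(country_code):
    """Return the XML bodies to POST to /update before and after a reindex."""
    delete_xml = f"<delete><query>country:{country_code}</query></delete>"
    commit_xml = "<commit/>"
    # Between these two messages you would POST the fresh documents
    # for that country, so the commit makes the swap visible at once.
    return delete_xml, commit_xml


if __name__ == "__main__":
    print(country_search_url("java developer", "de"))
    print(reindex_messages("de"))
```

Because nothing becomes visible to searchers until the commit, deleting one country's jobs and re-adding them in the same uncommitted batch plays roughly the role that "rotating" an index plays in Sphinx.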
Re: Switch from Sphinx to Solr - some basics please
> 1. I need to have 2 seprate indexes. In Stackoverlfow I became the answer I
> need to start 2 cores for example. How many cores can I run for solr?

Please see: http://search-lucene.com/m/6rYti2ehFZ82

> I have for example jobs form country A, jobs from country B and so on until
> 100 countries. I need to have for each country an separate index, because if
> someone search for jobs in country A I need to query only the index for
> country A. How to solve this problem?
> How to do this? Is there are good tutorial? In the wiki of solr, it is very
> bad explained.

http://wiki.apache.org/solr/MultipleIndexes discusses the different solutions. One big index with a filter query (fq) is an option too.

> 2. When I become new data for example: Should I rotate the whole index
> again, or can I include the new rows and delete the old rows. What is your
> suggestion?

I don't understand this. What do you mean by "rotate the whole index"?
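For the multi-core layout, the main difference on the client side is that the country selects which core the request goes to. A rough sketch, assuming a local Solr on the default port and hypothetical core names of the form `jobs_<country>` (neither comes from the thread):

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed base URL


def per_core_url(user_query, country_code):
    """Multi-core layout: the country picks the core to query.

    The 'jobs_<country>' naming scheme is hypothetical.
    """
    core = f"jobs_{country_code}"
    return f"{SOLR_BASE}/{core}/select?{urlencode({'q': user_query})}"
```

With one big index the URL stays the same for every country and only the `fq` parameter changes, which is why that option avoids managing 100 separate cores.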
Re: Switch from Sphinx to Solr - some basics please
Hi iorixxx, thanks for the reply.

Well, you don't need Sphinx knowledge to answer my questions. Here is what I want:

1. I need to have 2 separate indexes. On Stack Overflow I got the answer that I need to start 2 cores, for example. How many cores can I run in Solr? I have, for example, over 100 different indexes that should be treated as separate data. These indexes should be reindexed at different times, and their data should not be mixed with each other.

You need to understand the following situation:

I have, for example, jobs from country A, jobs from country B, and so on, up to 100 countries. I need a separate index for each country, because if someone searches for jobs in country A I need to query only the index for country A. How do I solve this problem?

How do I do this? Is there a good tutorial? The Solr wiki explains it very badly.

2. When I get new data, for example: should I rotate the whole index again, or can I add the new rows and delete the old rows? What is your suggestion?

Thanks
Nik

--
View this message in context: http://lucene.472066.n3.nabble.com/Switch-from-Sphinx-to-Solr-some-basics-please-tp4001234p4001379.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Switch from Sphinx to Solr - some basics please
> Because I have set a post in Stackoverflow, I wan't, that there is dublicate
> questions. Can you please read this post:
>
> http://stackoverflow.com/questions/11956608/sphinx-user-is-switching-to-solr

Your questions require Sphinx knowledge. I suggest you read these books:

http://lucene.apache.org/solr/books.html
http://www.manning.com/hatcher3/

"I have in Sphinx: min_word_len ... How to use this in Solr?"

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/#solr.LengthFilterFactory
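To illustrate what a minimum word length means at the token level, here is a plain-Python sketch of what a length filter does: tokens shorter than `min_len` or longer than `max_len` are dropped. This is only an illustration of the behaviour, not Solr's implementation; the function name and the whitespace tokenization are assumptions. In Solr itself the filter is configured in the analyzer chain of a field type rather than written in client code.

```python
def length_filter(tokens, min_len, max_len):
    """Keep only tokens whose length lies within [min_len, max_len]."""
    return [t for t in tokens if min_len <= len(t) <= max_len]


if __name__ == "__main__":
    words = "a job in the city of Berlin".split()
    print(length_filter(words, 3, 100))
```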
Switch from Sphinx to Solr - some basics please
Hi all,

I am switching from Sphinx to Solr right now, and I am looking for some basic configuration and understanding so I can switch my project in a few weeks.

Because I have already posted on Stack Overflow, I don't want to create duplicate questions. Can you please read this post:

http://stackoverflow.com/questions/11956608/sphinx-user-is-switching-to-solr

I would be very happy if someone could answer the 6 points where I have difficulty understanding how to set things up correctly.

Thanks
Nik

--
View this message in context: http://lucene.472066.n3.nabble.com/Switch-from-Sphinx-to-Solr-some-basics-please-tp4001234.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Some basics
Frank,

Is the following what you are after? Here is a query for my last name, but misspelled:

http://search-lucene.com/?q=gospodneticc

If you look above the results, you will see this text:

Search results for "gospodnetic" : ...

and the search results are indeed for the auto-corrected query. To get this functionality we built this:

http://sematext.com/products/dym-researcher/index.html

Regarding your second question: I don't think there is anything in Solr that allows it to automatically figure out which terms are the "more specific" ones and which are the "more general" ones. Perhaps such assumptions could be based on the terms' occurrence frequency in the index, and here TermVectorsComponent can help.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
> From: Frank A
> To: solr-user@lucene.apache.org
> Sent: Wed, June 2, 2010 9:32:12 PM
> Subject: Some basics
>
> Hi, I'm new to SOLR and have some basic questions that hopefully steer me
> in the right direction.
>
> - I want my search to "auto" spell check - that is if someone types
> "restarant" I'd like the system to automatically search for restaurant.
> I've seen the SpellCheckComponent but that doesn't seem to have a simple way
> to automatically do the "near" type comparison. Is the SpellCheckComponent
> the wrong one or do I just need to manually handle the situation in my
> client code?
>
> - Also, what is the proper analyzer if I want to search a search for "thai
> food" or "thai restaurant" to actually match on Thai? I can't totally ignore
> words like food and restaurant but I want to ignore more general terms and
> look for specific first (or I should say score them higher).
>
> Any tips on what I should be reading up on will be greatly appreciated.
>
> Thanks.
Re: Some basics
: - I want my search to "auto" spell check - that is if someone types
: "restarant" I'd like the system to automatically search for restaurant.
: I've seen the SpellCheckComponent but that doesn't seem to have a simple way
: to automatically do the "near" type comparison. Is the SpellCheckComponent
: the wrong one or do I just need to manually handle the situation in my
: client code?

at the moment you need to handle this in your client -- if you get no results back (or too few results based on some expectation you have) but the spellcheck component returned a suggestion, then trigger a subsequent search using that suggestion.

: - Also, what is the proper analyzer if I want to search a search for "thai
: food" or "thai restaurant" to actually match on Thai? I can't totally
: ignore words like food and restaurant but I want to ignore more general
: terms and look for specific first (or I should say score them higher).

the issue isn't so much your analyzer as how you structure your query -- i would suggest using the dismax query parser with a very low value for the 'mm' param (ie: '1', or something like '10%' if you expect a lot of queries with many many words) and a useful "pf" param -- that way two word queries will return matches for either word, but docs that match both words will score higher, and docs that match the full phrase will score the highest.

-Hoss
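Both suggestions in this reply can be sketched together on the client side. The Python below is a minimal sketch under several assumptions: the `qf` and `pf` field names are hypothetical, the request handler is assumed to have the SpellCheckComponent configured, and `run_search` is a hypothetical caller-supplied function that sends the parameters to Solr and returns a `(hit_count, suggestion_or_None)` pair. How the suggestion is extracted from the response is left out, since it depends on the Solr version and configuration.

```python
def dismax_params(user_query, mm="1"):
    """Query parameters for the dismax parser with a low 'mm' and a 'pf' boost."""
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "name description",  # hypothetical fields to search
        "pf": "name description",  # hypothetical fields for the phrase boost
        "mm": mm,                  # e.g. "1", or "10%" for long queries
        "spellcheck": "true",      # assumes the spellcheck component is set up
    }


def search_with_fallback(user_query, run_search, min_results=1):
    """Search once; if there are too few hits and a suggestion exists, retry.

    Returns the query that was finally used and its hit count.
    """
    hits, suggestion = run_search(dismax_params(user_query))
    if hits < min_results and suggestion:
        hits, _ = run_search(dismax_params(suggestion))
        return suggestion, hits
    return user_query, hits
```

Passing the search function in as an argument keeps the retry logic independent of any particular HTTP client, so it can be exercised with a stand-in function before being wired to a real Solr instance.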
Some basics
Hi, I'm new to Solr and have some basic questions that will hopefully steer me in the right direction.

- I want my search to "auto" spell check; that is, if someone types "restarant" I'd like the system to automatically search for restaurant. I've seen the SpellCheckComponent, but it doesn't seem to have a simple way to automatically do the "near" type comparison. Is the SpellCheckComponent the wrong one, or do I just need to handle the situation manually in my client code?

- Also, what is the proper analyzer if I want a search for "thai food" or "thai restaurant" to actually match on Thai? I can't totally ignore words like food and restaurant, but I want to give less weight to the more general terms and look for the specific ones first (or I should say, score them higher).

Any tips on what I should be reading up on will be greatly appreciated.

Thanks.