Re: Switch from Sphinx to Solr - some basics please

2012-08-20 Thread Lance Norskog
"I have, for example, jobs from country A, jobs from country B, and so on, up
to 100 countries. I need a separate index for each country, because if
someone searches for jobs in country A I need to query only the index for
country A. How do I solve this problem?"

Ah! Will the text be in different languages? You probably want a
separate index for each language base. Solr/Lucene has very good
facilities for text in different languages.

In general, this will be a learning experience. After you do the
deployment, you will discover problems in search design, deployment
design, scaling architecture, and operations tools. You should plan to
do two Solr deployments.

On Wed, Aug 15, 2012 at 8:27 AM, Walter Underwood  wrote:
> These do require some Sphinx knowledge. I could answer them on StackOverflow 
> because I converted Chegg from Sphinx to Solr this year.
>
> As I said there, read about Solr cores. They are independent search 
> configurations and indexes within one Solr server: 
> http://wiki.apache.org/solr/CoreAdmin
>
> For your jobs example, I would use filter queries to limit the search to a 
> single country. Filter them to country:us or country:de or country:fr and you 
> will only get results from that country.
>
> Solr does not use the term "rotate" for indexes. You can delete with a query, 
> so you could delete all the jobs for one country, reindex those, then commit.
>
> Separate cores are best when you have different kinds of data. At Chegg, we 
> search books and college courses. Those are in different cores and have very 
> different schemas.
>
> wunder

-- 
Lance Norskog
goks...@gmail.com


Re: Switch from Sphinx to Solr - some basics please

2012-08-15 Thread Walter Underwood
These do require some Sphinx knowledge. I could answer them on StackOverflow 
because I converted Chegg from Sphinx to Solr this year.

As I said there, read about Solr cores. They are independent search 
configurations and indexes within one Solr server: 
http://wiki.apache.org/solr/CoreAdmin 

For your jobs example, I would use filter queries to limit the search to a 
single country. Filter them to country:us or country:de or country:fr and you 
will only get results from that country.
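
A minimal sketch of such a filtered request, in Python. The host, port, and select path are assumptions about a default local install, and the `country` field is the one named in the message above; `q`, `fq`, and `wt` are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_country_query(user_query, country_code,
                        base_url="http://localhost:8983/solr/select"):
    """Build a select URL that restricts results to one country.

    base_url is an assumed default; adjust it to your own server or core.
    """
    params = [
        ("q", user_query),                   # what the user typed
        ("fq", f"country:{country_code}"),   # filter query: only this country
        ("wt", "json"),                      # ask for a JSON response
    ]
    return f"{base_url}?{urlencode(params)}"


url = build_country_query("java developer", "de")
print(url)
```

Because the country restriction lives in `fq` rather than in `q`, it narrows the result set without affecting relevance scoring, and Solr can cache the filter separately from the main query.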

Solr does not use the term "rotate" for indexes. You can delete with a query, 
so you could delete all the jobs for one country, reindex those, then commit.
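
The delete, reindex, commit sequence could be sketched as follows. This only builds the XML update messages that would be POSTed, in order, to the update handler; it does not send them. The document fields (`id`, `title`, `country`) are hypothetical and would have to match your own schema.

```python
from xml.sax.saxutils import escape, quoteattr


def delete_by_query_xml(query):
    """XML update message that deletes every document matching the query."""
    return f"<delete><query>{escape(query)}</query></delete>"


def add_docs_xml(docs):
    """XML update message that adds the given documents (dicts of field -> value)."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append(f"<field name={quoteattr(name)}>{escape(str(value))}</field>")
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def reindex_country_messages(country_code, docs):
    """Messages to send in order: delete the country's jobs, add fresh ones, commit."""
    return [
        delete_by_query_xml(f"country:{country_code}"),
        add_docs_xml(docs),
        "<commit/>",
    ]


# Hypothetical sample document; field names are assumptions.
messages = reindex_country_messages(
    "de",
    [{"id": "de-1", "title": "Java & Solr developer", "country": "de"}],
)
for message in messages:
    print(message)
```

Nothing becomes visible to searchers until the commit, so queries keep seeing the old jobs for that country while the new ones are being added.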

Separate cores are best when you have different kinds of data. At Chegg, we 
search books and college courses. Those are in different cores and have very 
different schemas.

wunder

--
Walter Underwood
wun...@wunderwood.org





Re: Switch from Sphinx to Solr - some basics please

2012-08-15 Thread Ahmet Arslan

> 1. I need to have 2 separate indexes. On Stack Overflow I got the answer that
> I need to start 2 cores, for example. How many cores can I run in Solr?

Please see: http://search-lucene.com/m/6rYti2ehFZ82


> I have, for example, jobs from country A, jobs from country B, and so on, up
> to 100 countries. I need a separate index for each country, because if
> someone searches for jobs in country A I need to query only the index for
> country A. How do I solve this problem?
> How do I do this? Is there a good tutorial? The Solr wiki explains it very
> badly.

http://wiki.apache.org/solr/MultipleIndexes talks about different solutions. 
One big index with fq is an option too.

> 2. When I get new data, for example: should I rotate the whole index
> again, or can I add the new rows and delete the old rows? What is your
> suggestion?

I don't understand this. What do you mean by rotate the whole index?


Re: Switch from Sphinx to Solr - some basics please

2012-08-15 Thread nnikolay
Hi iorixxx, thanks for the reply.

Well, you don't need Sphinx knowledge to answer my questions.

I have written down what I want:

1. I need to have 2 separate indexes. On Stack Overflow I got the answer that
I need to start 2 cores, for example. How many cores can I run in Solr? I have,
for example, over 100 different indexes that should be treated as separate
data. These indexes should be reindexed at different times, and their data
should not be mixed with each other.

You need to understand the following situation:

I have, for example, jobs from country A, jobs from country B, and so on, up
to 100 countries. I need a separate index for each country, because if
someone searches for jobs in country A I need to query only the index for
country A. How do I solve this problem?

How do I do this? Is there a good tutorial? The Solr wiki explains it very
badly.

2. When I get new data, for example: should I rotate the whole index
again, or can I add the new rows and delete the old rows? What is your
suggestion?

Thanks
Nik



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Switch-from-Sphinx-to-Solr-some-basics-please-tp4001234p4001379.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Switch from Sphinx to Solr - some basics please

2012-08-15 Thread Ahmet Arslan
> Because I have posted a question on Stack Overflow, I don't want there to be
> duplicate questions. Can you please read this post:
> 
> http://stackoverflow.com/questions/11956608/sphinx-user-is-switching-to-solr

Your questions require Sphinx knowledge. I suggest you read these books:
http://lucene.apache.org/solr/books.html
http://www.manning.com/hatcher3/

"I have in Sphinx: min_word_len ... How to use this in Solr?"

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/#solr.LengthFilterFactory
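
As a rough illustration of what a length filter does to a token stream, here is a plain-Python sketch. It is not Solr code and makes no claim about Solr's implementation; it only assumes the behaviour described on the wiki page above, namely that tokens whose length falls outside a configured minimum and maximum are dropped. The threshold values are arbitrary examples.

```python
def length_filter(tokens, min_len, max_len):
    """Keep only tokens whose length is within [min_len, max_len], inclusive."""
    return [token for token in tokens if min_len <= len(token) <= max_len]


tokens = "a job in it for an engineer".split()

# Comparable in spirit to min_word_len = 3: shorter words are not indexed.
kept = length_filter(tokens, min_len=3, max_len=255)
print(kept)
```

In Solr the filter would be declared in the analyzer chain of a field type in the schema, so it applies at index time and, if configured there too, at query time.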


Switch from Sphinx to Solr - some basics please

2012-08-14 Thread nnikolay
Hi all, 

I am switching from Sphinx to Solr right now, and I am looking for some basic
configuration help and understanding so that I can switch my project over in
a few weeks.

Because I have posted a question on Stack Overflow, I don't want there to be
duplicate questions. Can you please read this post:

http://stackoverflow.com/questions/11956608/sphinx-user-is-switching-to-solr

I would be very happy if someone could help me with the 6 points where I have
difficulty understanding how to do things correctly.

Thanks
Nik



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Switch-from-Sphinx-to-Solr-some-basics-please-tp4001234.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Some basics

2010-06-16 Thread Otis Gospodnetic
Frank,

Is the following what you are after:

Here is a query for my last name, but misspelled: 
http://search-lucene.com/?q=gospodneticc

But if you look above the results, you will see this text:

  Search results for "gospodnetic" :

... and the search results are indeed for the auto-corrected query.

To get this functionality we built this:

  http://sematext.com/products/dym-researcher/index.html

Regarding your second question:
I don't think there is anything in Solr that allows it to automatically figure 
out which terms are the "more specific" ones and which are the "more general" 
ones.  Perhaps such assumptions could be based on how frequently each term 
occurs in the index, and here TermVectorsComponent can help.
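
One way to turn occurrence frequency into a "specificity" ranking is an inverse-document-frequency style weight, sketched below. The document frequencies and index size are made-up numbers for illustration; in practice they would have to come from the index itself.

```python
import math


def rank_by_specificity(doc_freq, num_docs):
    """Order terms from most specific (rare) to most general (common).

    Uses a simple IDF-style weight: log(num_docs / (1 + document frequency)).
    """
    def idf(term):
        return math.log(num_docs / (1 + doc_freq[term]))

    return sorted(doc_freq, key=idf, reverse=True)


# Hypothetical counts: number of documents containing each term.
doc_freq = {"thai": 120, "restaurant": 4000, "food": 6500}
ranked = rank_by_specificity(doc_freq, num_docs=10000)
print(ranked)
```

With these numbers the rare term sorts first, which matches the intuition that "thai" carries more information than "restaurant" or "food".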

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message -----
> From: Frank A 
> To: solr-user@lucene.apache.org
> Sent: Wed, June 2, 2010 9:32:12 PM
> Subject: Some basics
> 
> Hi,
> 
> I'm new to SOLR and have some basic questions that will hopefully steer me
> in the right direction.
> 
> - I want my search to "auto" spell check - that is if someone types
> "restarant" I'd like the system to automatically search for restaurant.
> I've seen the SpellCheckComponent but that doesn't seem to have a simple way
> to automatically do the "near" type comparison.  Is the SpellCheckComponent
> the wrong one or do I just need to manually handle the situation in my
> client code?
> 
> - Also, what is the proper analyzer if I want a search for "thai
> food" or "thai restaurant" to actually match on Thai?  I can't totally
> ignore words like food and restaurant but I want to ignore more general
> terms and look for specific first (or I should say score them higher).
> 
> Any tips on what I should be reading up on will be greatly appreciated.
> 
> Thanks.


Re: Some basics

2010-06-14 Thread Chris Hostetter

: - I want my search to "auto" spell check - that is if someone types
: "restarant" I'd like the system to automatically search for restaurant.
: I've seen the SpellCheckComponent but that doesn't seem to have a simple way
: to automatically do the "near" type comparison.  Is the SpellCheckComponent
: the wrong one or do I just need to manually handle the situation in my
: client code?

at the moment you need to handle this in your client -- if you get no 
results back (or too few results based on some expectation you have) 
but the spellcheck component returned a suggestion then trigger a 
subsequent search using that suggestion.
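
The client-side retry described above might look like the following sketch. The `search` callable and the shape of its return value (`num_found`, `suggestion`) are hypothetical simplifications, not Solr's actual response format; a real client would extract both values from the search response and its spellcheck section.

```python
def search_with_respell(search, query, min_results=1):
    """Run the query; if it returns too few hits and a spelling suggestion
    is available, run a second search with the suggestion instead.

    Returns (query_actually_used, result).
    """
    first = search(query)
    suggestion = first.get("suggestion")
    if first["num_found"] >= min_results or not suggestion:
        return query, first
    return suggestion, search(suggestion)


def fake_search(query):
    """Stand-in for a real Solr call, used only to demonstrate the flow."""
    index = {"restaurant": 42}
    suggestions = {"restarant": "restaurant"}
    return {"num_found": index.get(query, 0),
            "suggestion": suggestions.get(query)}


used_query, result = search_with_respell(fake_search, "restarant")
print(used_query, result["num_found"])
```

Returning the query that was actually used lets the client show a "Search results for ..." notice when the corrected spelling was substituted.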

: - Also, what is the proper analyzer if I want a search for "thai
: food" or "thai restaurant" to actually match on Thai?  I can't totally
: ignore words like food and restaurant but I want to ignore more general
: terms and look for specific first (or I should say score them higher).

the issue isn't so much your analyzer as how you structure your query -- i 
would suggest using the dismax query parser with a very low value for the 
'mm' param (ie: '1' or something like '10%' if you expect a lot of queries 
with many many words) and a useful "pf" param -- that way two word queries 
will return matches for either word, but docs that match both words will 
score higher, and docs that match the full phrase will score the highest.
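
A sketch of the request parameters for such a dismax query. The field names (`name`, `description`) are assumptions and need to match your schema; `defType`, `qf`, `pf`, and `mm` are the dismax parameters referred to above.

```python
from urllib.parse import urlencode


def build_dismax_params(user_query, fields, mm="1"):
    """Request parameters for a dismax query with a low minimum-match value.

    fields: names of the fields to search (hypothetical; use your own).
    mm:     minimum number (or percentage) of query words that must match.
    """
    field_list = " ".join(fields)
    return {
        "defType": "dismax",   # use the dismax query parser
        "q": user_query,
        "qf": field_list,      # fields searched for the individual words
        "pf": field_list,      # fields where a full-phrase match gets a boost
        "mm": mm,
    }


params = build_dismax_params("thai restaurant", ["name", "description"])
query_string = urlencode(params)
print(query_string)
```

With `mm=1` a document matching only "thai" is still returned, while documents matching both words, or the whole phrase in a `pf` field, score progressively higher.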




-Hoss



Some basics

2010-06-02 Thread Frank A
Hi,

I'm new to SOLR and have some basic questions that will hopefully steer me in
the right direction.

- I want my search to "auto" spell check - that is if someone types
"restarant" I'd like the system to automatically search for restaurant.
I've seen the SpellCheckComponent but that doesn't seem to have a simple way
to automatically do the "near" type comparison.  Is the SpellCheckComponent
the wrong one or do I just need to manually handle the situation in my
client code?

- Also, what is the proper analyzer if I want a search for "thai
food" or "thai restaurant" to actually match on Thai?  I can't totally
ignore words like food and restaurant but I want to ignore more general
terms and look for specific first (or I should say score them higher).

Any tips on what I should be reading up on will be greatly appreciated.

Thanks.