Re: Use Parallel Search

Em Thu, 03 Feb 2011 11:34:27 -0800

Hello Gustavo,

Well, I have not used Nutch at all, but I have some experience with Solr.


In Solr you could use a multicore setup where each core points to a
different hard drive of your server. To other Solr servers (and to other
cores as well) each core is a separate index, so to query all drives of
one server you have to send a distributed request that collects the
results from all cores (indexes).
This adds a little HTTP overhead, because six HTTP requests per server
are needed to get your results.
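
As a rough sketch of what such a distributed request could look like: Solr
accepts a `shards` parameter listing the cores to query, and the core that
receives the request merges the results. The host names, port and core names
below are only assumptions for illustration (three machines, one core per
hard drive); the helper functions are hypothetical, not part of Solr.

```python
from urllib.parse import urlencode

# Hypothetical host names and core names; adjust them to your own setup.
HOSTS = ["search1:8983", "search2:8983", "search3:8983"]
CORES = [f"core{i}" for i in range(6)]  # assumption: one core per hard drive


def shard_list(hosts, cores):
    """Return the value for Solr's `shards` parameter.

    Each entry has the form host:port/solr/corename, comma-separated.
    """
    return ",".join(f"{host}/solr/{core}" for host in hosts for core in cores)


def distributed_query_url(entry_host, entry_core, query,
                          hosts=HOSTS, cores=CORES, rows=10):
    """Build a select URL that asks one core to fan the query out to all shards."""
    params = urlencode({
        "q": query,
        "rows": rows,
        "shards": shard_list(hosts, cores),
    })
    return f"http://{entry_host}/solr/{entry_core}/select?{params}"


if __name__ == "__main__":
    # Any core can act as the entry point; it collects results from all 18 shards.
    print(distributed_query_url("search1:8983", "core0", "title:lucene"))
```

The entry core sends one sub-request per listed shard, which is where the
extra HTTP requests mentioned above come from.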

You could also set up six Solr instances per box, or three with two cores
per instance, but I do not see any reason to do so.


Could you please explain what you mean by "remote class search"? Is it
a Nutch-specific thing I have never heard of?

There is no difference between a Lucene index created by Solr and a
Lucene index created by Nutch or by Lucene itself.
Solr is just a server implementation built on the Lucene framework.

Regards

On 03.02.2011 19:06, Gustavo Maia wrote:
> Hello,
>
> Let me give a brief description of my scenario.
> Today I am using only Lucene 2.9.3. I have an index of 30 million documents
> distributed over three machines, each machine with 6 HDs (15k rpm).
> The server queries the search index using the remote class search, and each
> machine searches using the parallel search (searching simultaneously
> on 6 HDs).
> So a search uses the three machines and 18 HDs simultaneously,
> which gives me a very good response time.
>
>
> Today I am studying Solr and am interested in knowing more about
> distributed search and the use of parallel search on the same machine. What
> would be the best scenario using Solr that is better than what I am already
> using today with Lucene alone?
>   Note: Do I need to install 6 Solr instances on each machine of my
> server? One for each HD? Or is there some alternative way for me to use
> the 6 HDs without having 6 instances of the Solr server?
>
>   Another question is whether Solr has some index size limit per
> hard drive. It would be interesting to keep the index from getting too big,
> because the larger the index, the longer the search takes.
>
> Thanks for everything.
>
>
> Gustavo Maia
>
