Hi,

You only have a few pages (at most about 25,000 documents, going by your 
numbers), so Nutch can easily run locally on a very small machine. You can also 
put all those sub-domains in the same Solr index without problems and use filter 
queries to restrict searches to particular domains or even hosts.
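
As a rough sketch of the filter-query idea: the Python snippet below builds a 
Solr select URL that restricts a search to one or more hosts via the fq 
parameter. It assumes your index has a "host" field holding the hostname of 
each page (check your schema.xml), and the Solr URL and the helper name 
build_search_url are placeholders I made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(user_query, hosts=None, rows=10):
    """Build a Solr select URL, optionally restricted to the given hosts.

    Assumes the index has a 'host' field containing each page's hostname.
    """
    params = [("q", user_query), ("rows", str(rows)), ("wt", "json")]
    if hosts:
        # One filter query matching any of the requested hosts.
        clause = " OR ".join('"{}"'.format(h) for h in hosts)
        params.append(("fq", "host:({})".format(clause)))
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Search only within abc.domain.com
    print(build_search_url("annual report", hosts=["abc.domain.com"]))
    # Search across all sub-domains (no filter query)
    print(build_search_url("annual report"))
```

Solr caches filter queries separately from the main query, so repeated 
searches within the same host should stay cheap.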

Do not put both Nutch and Solr on the same machine: Nutch will degrade Solr's 
performance while it processes the CrawlDB, because of the heavy I/O and CPU 
load.

Cheers,
 
-----Original message-----
> From:Bayu Widyasanyata <[email protected]>
> Sent: Tue 15-Jan-2013 16:38
> To: [email protected]
> Subject: nutch/solr design for multi sub-domain websites
> 
> Hi,
> 
> I'm quite new to nutch/solr and just got a big challenge: to develop a single
> search engine for multiple sub-domain websites (e.g. abc.domain.com,
> def.domain.com, etc.).
> 
> The numbers and facts are as follows:
> - number of portals (with same domain): 30-50 sites
> - number of pages on each site: 300-500 pages (docs)
> - number of PDF files: about 10-20% of total pages (on each site).
> - only 1 server will be dedicated to the search engine, hence I think there
> will be no hadoop implementation.
> 
> My questions are:
> 1. Where can I find references for this kind of challenge?
> 2. Can anyone suggest the best approach or strategy?
> 3. Should we create multiple solr cores? What are the benefits of having
> multiple solr cores?
> I just think we shouldn't put all our eggs in a single basket.
> 
> Thanks, any enlightenment is very much appreciated...
> 
> -- 
> wassalam,
> [bayu]
> 
