In this setup, how many milliseconds does each search take you?
Colin McNamara
ID Analytics


-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 09, 2006 7:40 AM
To: [email protected]
Subject: Re: Single DFS or alternative architectures for performance?

You wouldn't want to use the DFS for searching.  You would want to use
the DFS/MapReduce for creating the index and slicing it up into segments
of a certain size, say 1-2 million pages.  Those individual index
segments would then need to be moved to local file systems on machines
running search servers, each one searching its own part of the index.  You
would then have the search client (usually a website) sit in front of
the search servers and use the searchservers.txt file to specify the
search servers it connects to.  The search client would aggregate the
results from the multiple index search servers and return them to
the end user.
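
The aggregation step above can be sketched roughly as follows. This is a
minimal illustration in Python, not Nutch's actual client code: the Hit
type, the merge_top_k function, and the segment names are all hypothetical,
and it assumes that scores returned by different segments are directly
comparable.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One search result as returned by a single search server (hypothetical shape)."""
    score: float
    segment: str
    url: str


def merge_top_k(per_segment_hits: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Return the k best hits across all segments, highest score first.

    Assumes scores from different index segments are comparable.
    """
    all_hits = (hit for hits in per_segment_hits for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda h: h.score)


# Example: two search servers each return their local results.
segment_0 = [Hit(0.9, "seg0", "http://example.org/a"),
             Hit(0.2, "seg0", "http://example.org/b")]
segment_1 = [Hit(0.7, "seg1", "http://example.org/c")]
top_two = merge_top_k([segment_0, segment_1], k=2)
```

Each search server only needs to return its own top k hits, since no hit
outside a segment's local top k can appear in the global top k.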

We are currently using 1 million pages per index segment, although others
on the list have stated that they have gotten up to 2 million pages
without problems.  Beyond that, queries tend to slow down because of
the time it takes to read an individual index segment.  We have
been running a separate server for each search segment but are
currently experimenting with a single search server with many
small disks (say 10 x 20G), each disk holding one index segment.  I
don't know if that will work, though.
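
As a rough sizing aid, the segment count follows directly from the total
page count and the pages-per-segment limit. The sketch below is only
back-of-the-envelope arithmetic: segments_needed is a hypothetical helper,
and the 25 million page crawl is an invented figure for illustration.

```python
def segments_needed(total_pages: int, pages_per_segment: int = 1_000_000) -> int:
    """Number of index segments (and so search servers or disks) required.

    Uses integer ceiling division so a partly filled last segment is counted.
    """
    if total_pages < 0 or pages_per_segment <= 0:
        raise ValueError("total_pages must be >= 0 and pages_per_segment > 0")
    return (total_pages + pages_per_segment - 1) // pages_per_segment


# Hypothetical crawl of 25 million pages:
at_one_million = segments_needed(25_000_000)             # 25 segments
at_two_million = segments_needed(25_000_000, 2_000_000)  # 13 segments
```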

Dennis

Murat Ali Bayir wrote:
> Hi everybody,
>
> Does a system with one DFS (crawl, parse, index, search, etc. all
> on 1 DFS) have performance problems in the search part? What if 2 DFS
> were used: one for the search part (getting summaries etc.) and the
> other for the remaining Nutch operations (fetch, parse, index, etc.)?
> Or are there any alternative architectures for systems performing
> all the Nutch functions concurrently on one DFS?


