In this setup, how many milliseconds does each search take you?

Colin McNamara
ID Analytics

-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 09, 2006 7:40 AM
To: [email protected]
Subject: Re: Single DFS or alternative architectures for performance?

You wouldn't want to use the DFS for searching. You would want to use the DFS/MapReduce for creating the index and slicing it up into segments of a certain size, say 1-2 million pages. Those individual index segments would then need to be moved to local file systems, each with a search server running that searches that specific part of the index. You would then have the search client (usually a website) sit in front of the search servers and use the searchservers.txt file to specify the search servers it connects to. The search client would aggregate the results from the multiple index search servers and return them to the end user.

We are currently using 1 million pages per index segment, although others on the list have stated that they have gotten up to 2 million pages without problems. After that, queries tend to slow down because of the time it takes to read an individual index segment.

We have been running an individual server for each index segment, but are currently playing around with having a single search server with many small disks (say 10 x 20G), with each disk holding one index segment. I don't know if that will work though.

Dennis

Murat Ali Bayir wrote:
> Hi everybody,
>
> Does a system with one DFS (crawl, parse, index, search, etc. all
> on 1 DFS) have performance problems in the search part? What if 2 DFS
> were used: one for the search part (getting summaries etc.) and the
> other for the other Nutch operations (fetch, parse, index, etc.)? Or
> are there any alternative architectures for systems performing all
> the Nutch functions concurrently on one DFS?
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
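
For illustration only, here is a minimal Python sketch of the search-client side described in Dennis's reply: read a list of search servers, then merge the per-segment results into one ranked list. It is not Nutch code. The "host port" line format assumed for searchservers.txt, the host names and port numbers, the `Hit` type, and both function names are hypothetical, and the merge assumes that scores from different index segments are directly comparable.

```python
import heapq
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """One search result as returned by a single search server (hypothetical shape)."""
    score: float
    url: str
    server: str


def parse_search_servers(text: str) -> list[tuple[str, int]]:
    """Parse server entries, assuming one whitespace-separated 'host port' per line.

    Blank lines and lines starting with '#' are skipped (an assumption, not a
    documented format).
    """
    servers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        host, port = line.split()
        servers.append((host, int(port)))
    return servers


def merge_hits(per_server_hits: Iterable[list[Hit]], top_n: int) -> list[Hit]:
    """Combine the hit lists from all servers and keep the top_n by descending score."""
    all_hits = (hit for hits in per_server_hits for hit in hits)
    return heapq.nlargest(top_n, all_hits, key=lambda h: h.score)


# Usage with made-up data: two servers, each holding one index segment.
servers = parse_search_servers("# one server per index segment\nsearch1 9999\nsearch2 9999\n")
results = merge_hits(
    [
        [Hit(0.9, "http://a.example/", "search1"), Hit(0.2, "http://b.example/", "search1")],
        [Hit(0.5, "http://c.example/", "search2")],
    ],
    top_n=2,
)
```

In a real deployment the per-server hit lists would come from network calls to each search server, ideally issued in parallel, so that overall query latency is bounded by the slowest segment rather than the sum of all of them.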
