Hi, I have only 2 million pages (for now) on 4 fast machines (2 CPUs, 4 GB RAM each).
Searching for two separate words takes up to 15 seconds, while a single word takes 0.5 to 1 second. From reading the archives, searching directly over NDFS is not recommended; it is too slow. Can anybody describe their settings, please?

What I am also trying to figure out:

1. If NDFS is too slow and all data must be copied to the local filesystem anyway, why use NDFS in the first place?
2. If you use both NDFS and local disk, don't you end up with four copies of the same data?
3. Assuming the data is 3 TB, how do you split it up for the searchers when not using NDFS?

Thank you.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
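
For context on question 3, the usual Nutch approach is to copy one shard of the index to each search node's local disk, run a search server per node, and list the servers in a `search-servers.txt` file read by the web front end. The sketch below is only an illustration of that pattern, assuming Nutch 0.7-era commands; the hostnames, port, and directory names are hypothetical, and the exact `bin/nutch` subcommand names should be checked against your version:

```
# On each of the 4 search nodes: copy that node's shard of the
# index/segments out of NDFS onto the local filesystem.
bin/nutch ndfs -get /index/shard0 /data/nutch/shard0

# Start a distributed-search server on each node
# (port 9999 is an arbitrary choice).
bin/nutch server 9999 /data/nutch/shard0

# On the front end, create search-servers.txt listing every server:
#   node1 9999
#   node2 9999
#   node3 9999
#   node4 9999
# and point searcher.dir in nutch-site.xml at the directory
# containing that file. The front end then merges results from
# all shards, so no single machine has to hold the full 3 TB.
```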
