I've been using the current distributed search, but it is a major PITA to take care of, especially when you try to delete old segments, merge in new data, and keep things "fresh".
Has there been any thought on enhancing this process so that it is somewhat centralized, managed within the Nutch framework, and extensible? How is it best to segment your indices? Just split things up and set up a huge query farm that hopefully can handle the load? Try to break things up based on the PR of the data, so that queries hit beefy high-PR servers first and scale down from there? How about sorting your data based on terms/words/data? Does anyone have any clue how Yahoo/Google or any other major search system manages the query load, indices, and updating of data while keeping a fast response time?
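To make the PR-based idea a bit more concrete, here is a rough Python sketch of what I have in mind. It is only an illustration, not how Nutch works today: the names (Tier, build_tiers, tiered_search), the score thresholds, and the substring match standing in for a real index lookup are all made up, and PR is treated as a simple per-document score.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """Hypothetical group of index servers holding one score band."""
    name: str
    docs: dict = field(default_factory=dict)  # doc_id -> (score, text)

    def search(self, term):
        # Naive substring match stands in for a real index lookup.
        term = term.lower()
        hits = [(score, doc_id)
                for doc_id, (score, text) in self.docs.items()
                if term in text.lower()]
        return sorted(hits, reverse=True)  # best score first


def build_tiers(docs, thresholds):
    """Split docs into tiers; thresholds are descending lower bounds."""
    tiers = [Tier(f"tier{i}") for i in range(len(thresholds) + 1)]
    for doc_id, (score, text) in docs.items():
        for i, bound in enumerate(thresholds):
            if score >= bound:
                tiers[i].docs[doc_id] = (score, text)
                break
        else:
            # Below every threshold: goes to the lowest tier.
            tiers[-1].docs[doc_id] = (score, text)
    return tiers


def tiered_search(tiers, term, wanted):
    """Query the top tier first; go lower only if more hits are needed."""
    results = []
    tiers_queried = 0
    for tier in tiers:
        tiers_queried += 1
        results.extend(tier.search(term))
        if len(results) >= wanted:
            break
    return [doc_id for _, doc_id in results[:wanted]], tiers_queried


if __name__ == "__main__":
    docs = {
        "a": (0.90, "nutch search"),
        "b": (0.50, "nutch crawler"),
        "c": (0.10, "nutch index"),
        "d": (0.95, "lucene"),
    }
    tiers = build_tiers(docs, thresholds=[0.8, 0.4])
    print(tiered_search(tiers, "nutch", wanted=1))
    print(tiered_search(tiers, "nutch", wanted=2))
```

In this toy version a query that is satisfied by the top tier never touches the lower ones, which is the load-shedding effect I am after. Whether that holds up with real query mixes, and how you would keep the tiers fresh as scores change, is exactly what I am unsure about.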
