I've been using the current distributed search, but it
is a major PITA to maintain, especially when you
try to delete old segments, re-merge new data, and
keep things "fresh".

Has there been any thought on enhancing this process
so it is somewhat centralized, managed within the Nutch
framework, and extensible?

What is the best way to segment your indices? Just split
things and set up a huge query farm that hopefully can
handle the load? Or break things up based on the
PR of the data, so that as queries come in you have
beefy servers for high-PR data and scale down from there?
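
To make the PR-based option concrete, here is a rough sketch of what I
mean. It is not how Nutch works today; Shard, build_tiers and
tiered_search are made-up names, the score thresholds are arbitrary, and
the "index" is just an in-memory dict standing in for a real segment.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One index partition; `docs` maps doc id -> (score, text)."""
    name: str
    docs: dict = field(default_factory=dict)

    def search(self, term):
        # Whole-word match only; hits come back best score first.
        hits = [(score, doc_id)
                for doc_id, (score, text) in self.docs.items()
                if term in text.split()]
        return sorted(hits, reverse=True)


def build_tiers(documents, thresholds):
    """Split documents into tiers by score.

    `thresholds` is descending, e.g. [0.7, 0.3]: tier 0 gets score >= 0.7,
    tier 1 gets score >= 0.3, and the last tier gets everything else.
    """
    tiers = [Shard(f"tier{i}") for i in range(len(thresholds) + 1)]
    for doc_id, (score, text) in documents.items():
        for i, cutoff in enumerate(thresholds):
            if score >= cutoff:
                tiers[i].docs[doc_id] = (score, text)
                break
        else:
            # Below every cutoff: goes to the lowest tier.
            tiers[-1].docs[doc_id] = (score, text)
    return tiers


def tiered_search(tiers, term, wanted):
    """Query tiers from highest to lowest; stop once `wanted` hits are found."""
    hits, tiers_queried = [], 0
    for shard in tiers:
        tiers_queried += 1
        hits.extend(shard.search(term))
        if len(hits) >= wanted:
            break
    return [doc_id for _, doc_id in hits[:wanted]], tiers_queried
```

The idea is that most queries would be answered from the top tier alone,
so only that tier needs the beefy hardware; the lower tiers are hit only
when the top one cannot fill the result page.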

How about sorting your data based on terms/words/data?

Does anyone have any idea how Yahoo/Google or any other
major search system manages the query load, indices, and
updating of data while keeping a fast response time?


_______________________________________________
Nutch-general mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-general
