On Tue, Aug 11, 2009 at 1:14 PM, David Pratt <[email protected]> wrote:
> Hi Jim. That is pretty cool. See, there are more than 300,000 records at
> present. Curious about how this will work when you get into much larger
> scale, since the RAM requirement to perform search goes up substantially
> with Lucene as the number of docs goes up. I have tended to look at
> sharding and parallel multisearch as means of horizontally scaling
> Lucene by breaking it into chunks. This approach is interesting, and I am
> just curious how you anticipate scale and performance with document
> growth.
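[The sharding and parallel multisearch idea David describes can be sketched roughly as below. This is a plain-Python toy, not the Lucene API: the shard contents and the term-frequency scoring are made-up illustrations. Each shard is searched concurrently, and the per-shard hit lists are then merged into one global ranking.]

```python
# Toy sketch of sharded, parallel search with merged results.
# Not Lucene: shards, docs, and the scoring function are illustrative only.
from concurrent.futures import ThreadPoolExecutor
import heapq

def search_shard(shard, term):
    """Score each doc in one shard by naive term frequency."""
    return [(doc.split().count(term), doc) for doc in shard if term in doc]

def parallel_search(shards, term, k=3):
    """Fan the query out to every shard, then merge the top-k hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard = pool.map(lambda s: search_shard(s, term), shards)
        hits = [hit for shard_hits in per_shard for hit in shard_hits]
    # Global ranking: highest score first, across all shards.
    return heapq.nlargest(k, hits)

shards = [
    ["lucene index shard one", "unrelated text"],
    ["lucene lucene scoring", "another doc"],
]
print(parallel_search(shards, "lucene", k=2))
# -> [(2, 'lucene lucene scoring'), (1, 'lucene index shard one')]
```

[The merge step is the part Jim notes "can end up taking some work": with real relevance scores, per-shard statistics differ, so a plain score sort is only an approximation.]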
We haven't had significant RAM requirements with the number of documents we
have at the moment. Nutch is a more complete search solution that supports
parallel search, and I imagine there are other good ways of doing parallel
search. Back when JXTA was still around, I used it to create parallel
distributed search across people's desktops with pretty good results.
Combining the search results can end up taking some work, though.

Jim

--
Jim McCusker
Programmer Analyst
Krauthammer Lab, Pathology Informatics
Yale School of Medicine
[email protected] | (203) 785-6330
http://krauthammerlab.med.yale.edu

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
