Hi Jim. That is pretty cool. I see there are more than 300,000 records at present. I am curious how this will work at much larger scale, since the RAM Lucene needs to perform a search goes up substantially as the number of docs goes up. I have tended to look at sharding and parallel multisearch as a means of scaling Lucene horizontally by breaking the index into chunks. Your approach is interesting, and I would like to hear how you anticipate scale and performance holding up with document growth. Many thanks.
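To make the sharding and parallel multisearch idea concrete, here is a minimal sketch in Python. It is not the Lucene API: `Shard` is a toy in-memory inverted index standing in for one per-shard Lucene index, and the names `Shard`, `ShardedIndex`, and `shard_for` are hypothetical, chosen only for illustration. The assumption is that documents are routed to a shard by a hash of their id, each shard is searched in parallel, and the per-shard top-k lists are merged.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
import heapq
import zlib


class Shard:
    """Toy inverted index standing in for one per-shard search index."""

    def __init__(self):
        # term -> {doc_id: term frequency}
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        for term in text.lower().split():
            docs = self.postings[term]
            docs[doc_id] = docs.get(doc_id, 0) + 1

    def search(self, term, k):
        hits = self.postings.get(term.lower(), {})
        # Sorted top-k for this shard: highest frequency first, ties by doc_id.
        return heapq.nsmallest(k, ((-tf, doc_id) for doc_id, tf in hits.items()))


class ShardedIndex:
    """Routes documents to shards by hash and searches all shards in parallel."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]
        self.pool = ThreadPoolExecutor(max_workers=num_shards)

    def shard_for(self, doc_id):
        # Stable hash so a given document always lands on the same shard.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)

    def add(self, doc_id, text):
        self.shards[self.shard_for(doc_id)].add(doc_id, text)

    def search(self, term, k=10):
        # Each shard returns its own sorted top-k; merge them and keep the global top-k.
        partials = self.pool.map(lambda shard: shard.search(term, k), self.shards)
        merged = heapq.merge(*partials)
        return [(doc_id, -neg_tf) for neg_tf, doc_id in islice(merged, k)]


index = ShardedIndex(num_shards=4)
index.add("img-001", "lung tissue stain lung")
index.add("img-002", "liver tissue stain")
index.add("img-003", "lung biopsy")
print(index.search("lung", k=2))  # [('img-001', 2), ('img-003', 1)]
```

In this sketch each shard only ever holds its own slice of the postings, so the working set per search node stays roughly proportional to the shard size rather than to the whole collection. A real deployment would put each shard on its own machine and rank by relevance score rather than raw term frequency.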
Regards
David

On 11-Aug-09, at 12:05 PM, Jim McCusker wrote:

> We have had good performance using Lucene as a search engine in Java
> backed by Lustre (mentioned in a previous email):
>
> http://krauthammerlab.med.yale.edu/imagefinder
>
> The images are in a hashed directory structure that provides O(1)
> access to the image file contents, and the search engine in turn
> serves as a flexible hash table that provides O(1) per search term
> access to keywords, metadata, and full text.
>
> Lucene is available at http://lucene.apache.org and is a joy to work
> with.
>
> Jim
>
> On Mon, Aug 10, 2009 at 12:11 AM, Pranas Baliuka<[email protected]> wrote:
>> Dear Lustre experts/users,
>>
>> I am looking for an optimal solution to this task:
>>
>> Internet-scale applications must be designed to process high volumes
>> of transactions.
>>
>> Describe a design for a system that must process on average 30,000
>> HTTP requests per second.
>>
>> For each request, the system must perform a lookup into a dictionary
>> of 50 million words, using a key word passed in via the URL query
>> string.
>>
>> Each response will consist of a string containing the definition of
>> the word (10 KB or less).
>>
>> My initial thought was using MySQL/Berkeley DB pointing to a SAN, but
>> a lower-level solution would probably be more affordable.
>>
>> Can I use e.g. QFS storage via Java without a DB server instead? Can
>> the SAN be avoided and local HDDs joined into a Lustre system?
>>
>> The task is hypothetical, but it would be nice to get feedback from
>> specific technology experts...
>>
>> Some ideas ;)
>>
>> I’ve sent a similar request to the QFS forum and am really not sure
>> which product would fit better. Both work as distributed file systems
>> ... and both sound like convenient storage for this particular task.
>>
>> Thanks,
>>
>> Pranas
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> [email protected]
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
> --
> Jim
> --
> Jim McCusker
> Programmer Analyst
> Krauthammer Lab, Pathology Informatics
> Yale School of Medicine
> [email protected] | (203) 785-6330
> http://krauthammerlab.med.yale.edu
