My experience so far: about 200k documents were indexed in 90 minutes (including database time), the index size is 200 MB, querying a keyword across all 30 string fields takes 0.3-1 second, and querying a keyword on a single field takes tens of milliseconds.
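
For a rough feel of what these numbers mean for the 7+ million records in the original question, here is a back-of-the-envelope sketch. It assumes that the "200k" above counts documents, that "about a day" in Charlie's message means 24 hours, and that indexing time scales linearly with document count. None of this is guaranteed, since throughput depends heavily on document size, analyzers, and hardware. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope indexing-time estimate from the figures in this thread.
# Assumption: indexing time scales linearly with the number of documents.

def docs_per_second(num_docs, seconds):
    """Average indexing throughput observed in one run."""
    return num_docs / seconds

def estimated_hours(target_docs, rate):
    """Hours needed to index target_docs at the given docs/second rate."""
    return target_docs / rate / 3600

# 200k documents in 90 minutes, including database time (figures from this reply).
MY_RATE = docs_per_second(200_000, 90 * 60)

# About 8 million documents in roughly one day (Charlie's figures; 24 h assumed).
CHARLIE_RATE = docs_per_second(8_000_000, 24 * 3600)

# "7+ million" records from the original question, taken as 7 million.
TARGET_DOCS = 7_000_000

if __name__ == "__main__":
    for label, rate in (("this reply", MY_RATE), ("Charlie", CHARLIE_RATE)):
        hours = estimated_hours(TARGET_DOCS, rate)
        print(f"{label}: {rate:.0f} docs/sec, about {hours:.1f} hours for 7 million docs")
```

Both runs were single, non-tuned setups, so treat the result as an order-of-magnitude range (very roughly one to two days) rather than a prediction.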

-----Original Message-----
From: Charlie Jackson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 26, 2007 8:53 AM
To: solr-user@lucene.apache.org
Subject: RE: dataset parameters suitable for lucene application

My experiences so far with this level of data have been good.

Number of records: Maxed out at 8.8 million
Database size: friggin huge (100+ GB)
Index size: ~24 GB

1) It took me about a day to index 8 million docs using a non-optimized program I wrote. It's non-optimized in the sense that it's not multi-threaded. It batched together groups of about 5,000 docs at a time to be indexed.

2) Search times for a basic search are almost always sub-second. If we toss in some faceting, it takes a little longer, but I've hardly ever seen it go above 1-2 seconds even with the most advanced queries.

Hope that helps.

Charlie

____________________________________________

-----Original Message-----
From: Law, John [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 26, 2007 9:28 AM
To: solr-user@lucene.apache.org
Subject: dataset parameters suitable for lucene application

I am new to the list and new to lucene and solr. I am considering Lucene for a potential new application and need to know how well it scales. Following are the parameters of the dataset.

Number of records: 7+ million
Database size: 13.3 GB
Index Size: 10.9 GB

My questions are simply:

1) Approximately how long would it take Lucene to index these documents?
2) What would the approximate retrieval time be (i.e. search response time)?

Can someone provide me with some informed guidance in this regard?

Thanks in advance,
John

______________________________________________
John Law
Director, Platform Management
ProQuest
789 Eisenhower Parkway
Ann Arbor, MI 48106
734-997-4877
[EMAIL PROTECTED]
www.proquest.com
www.csa.com

ProQuest... Start here.