My experience so far:
200k documents were indexed in 90 minutes (including database time), the
index size is 200 MB, a keyword query across all 30 string fields takes
0.3-1 sec, and a keyword query on a single field takes tens of milliseconds.
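
For a rough answer to the indexing-time question, the figures in this thread
can be turned into throughput rates and scaled to 7 million records. This is
only back-of-the-envelope arithmetic and rests on assumptions: the 200k
figure counts documents, "about a day" in the quoted reply means 24 hours,
and indexing time grows linearly with document count. The function names
are made up for illustration.

```python
# Back-of-the-envelope indexing estimates from the figures in this thread.
# Assumptions: the 200k figure counts documents, "about a day" is 24 hours,
# and indexing time scales linearly with the number of documents.

def docs_per_second(doc_count, elapsed_seconds):
    """Average indexing throughput for a finished run."""
    return doc_count / elapsed_seconds

def estimated_hours(doc_count, rate):
    """Hours needed to index doc_count documents at a given docs/sec rate."""
    return doc_count / rate / 3600

small_run = docs_per_second(200_000, 90 * 60)      # 200k docs in 90 minutes
large_run = docs_per_second(8_000_000, 24 * 3600)  # 8 million docs in a day

target = 7_000_000  # record count from the original question
for label, rate in (("200k-doc run", small_run), ("8M-doc run", large_run)):
    hours = estimated_hours(target, rate)
    print(f"{label}: {rate:.0f} docs/s, about {hours:.1f} h for {target:,} docs")
```

Under those assumptions the two runs work out to roughly 37 and 93 docs/sec,
so 7 million records would take somewhere between about 21 and 52 hours.
Document size, field count, and hardware will move that number a lot.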



-----Original Message-----
From: Charlie Jackson [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 26, 2007 8:53 AM
To: solr-user@lucene.apache.org
Subject: RE: dataset parameters suitable for lucene application

My experiences so far with this level of data have been good.

Number of records: Maxed out at 8.8 million
Database size: friggin huge (100+ GB)
Index size: ~24 GB

1) It took me about a day to index 8 million docs using a non-optimized
program I wrote. It's non-optimized in the sense that it's not
multi-threaded. It batched together groups of about 5,000 docs at a time
to be indexed.

2) Search times for a basic search are almost always sub-second. If we
toss in some faceting, it takes a little longer, but I've hardly ever
seen it go above 1-2 seconds even with the most advanced queries. 
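
For illustration only, the single-threaded batching described in 1) might be
sketched as below. This is not the actual program; `index_batch` is a
hypothetical callable standing in for whatever sends one group of documents
to the indexer, and the 5,000 default just mirrors the batch size mentioned
above.

```python
from itertools import islice

def batches(docs, batch_size=5000):
    """Yield successive lists of at most batch_size items from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def index_all(docs, index_batch, batch_size=5000):
    """Feed docs to index_batch one group at a time, single-threaded.

    index_batch is a hypothetical callable that submits one list of
    documents to the indexer. Returns the number of documents submitted.
    """
    total = 0
    for batch in batches(docs, batch_size):
        index_batch(batch)
        total += len(batch)
    return total

# Usage with a stand-in indexer that only records the batch sizes it saw.
seen_sizes = []
count = index_all(({"id": i} for i in range(12_000)),
                  lambda batch: seen_sizes.append(len(batch)))
```

Because the input is consumed lazily, the whole record set never has to sit
in memory at once; only one batch does.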

Hope that helps.


Charlie

____________________________________________

-----Original Message-----
From: Law, John [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 26, 2007 9:28 AM
To: solr-user@lucene.apache.org
Subject: dataset parameters suitable for lucene application

I am new to the list and new to lucene and solr. I am considering Lucene
for a potential new application and need to know how well it scales. 

Following are the parameters of the dataset.

Number of records: 7+ million
Database size: 13.3 GB
Index Size:  10.9 GB 

My questions are simply:

1) Approximately how long would it take Lucene to index these documents?
2) What would the approximate retrieval time be (i.e. search response
time)?

Can someone provide me with some informed guidance in this regard?

Thanks in advance,
John

______________________________________________
John Law
Director, Platform Management
ProQuest
789 Eisenhower Parkway
Ann Arbor, MI 48106
734-997-4877
[EMAIL PROTECTED]
www.proquest.com
www.csa.com

ProQuest... Start here.

