As I read your latest post, it's not *searching* that's taking too long, but *indexing*.
Well, 100,000,000 rows is a lot. It'll never be just a few minutes. But I also have to ask whether the most time is being spent actually indexing or fetching from the database? You could time this easily by just skipping the indexing step and measuring say the time to fetch the first 1,000,000 rows and comparing that against fetching *and* indexing. The whole point of a search engine is to trade some up-front time indexing for faster searching. Otherwise, have you looked at this page? http://wiki.apache.org/lucene-java/ImproveIndexingSpeed Also, 200M for 40K rows seems like a lot of disk space. Are you storing all the data in the index? And do you need to? Best Erick On Jan 16, 2008 3:24 AM, coolgeng coolgeng <[EMAIL PROTECTED]> wrote: > firstly, I submit the query like "select * from [tablename]". And in this > table, there are around 30th columns and 40,000 rows data. And I use the > standrandAnalyzer to generate the index. And as my experience, it cost > 200M > disk to store the index. > > for example, I will search the "Name" field in the index and it does work > perfectly. But when I create an index on the 100,000,000 data. It will > cost > too much time to index creation. So is there some good solution to solve > this? > thanks > > > On Jan 15, 2008 10:47 PM, Erick Erickson <[EMAIL PROTECTED]> wrote: > > > You really have to tell us more about what you're trying to do > > to get a meaningful reply. > > > > What do you mean you create the index on a table? Are you > > using some sort of embedded SQL to query the table then > > creat a lucene index? How big is the index? What search > > are you submitting? What does your search code look like? > > > > Best > > Erick > > > > On Jan 14, 2008 10:47 PM, coolgeng coolgeng <[EMAIL PROTECTED]> > wrote: > > > > > Hi guys, > > > Some problems confuse me. When I would like to index some data > from > > a > > > table in database. While I create the index on this table, the > > searching > > > job keeps going . How can I work out it? > > > By the way, the number of data is around 1 hundred million. > > > > > > -- > > > Best Regards > > > Cooper Geng > > > > > > > > > -- > Best Regards > Cooper Geng >