Hey all,
I recently moved our company's external website over to dotLucene, and so far it has been working flawlessly. Since our company is in the book industry, I maintain several indices that serve different parts of the site. Our main catalog is searchable, so there is a "Book" index that can be searched by title, description, author, etc. There is also an "Author" index that can be searched by first name, last name, bio, etc. Finally, there is a "BookAuthor" relationship index: when a book is found, BookAuthor is searched to determine whether that book's authors have written other books. The indices are:

  Book       (primary key: ISBN)     - 160,000+ documents
  Author     (primary key: AuthorID) -  60,000+ documents
  BookAuthor (contains LinkID)       - 100,000+ documents

So far things are working great. The Book index is about 500 MB and is not a big overhead on our system.

Here is where the problem lies. To keep things fresh on the site, a nightly job rebuilds the entire index and then copies the data over to the production index folder (the full rebuild takes about an hour, and the copy takes a minute or two). However, there are times when information needs to be updated almost live during normal day-to-day hours. Say, for example, a book's description has changed. What I do is delete the document and then re-add it. Unfortunately, deleting and re-adding a document takes a few minutes end to end, and this causes the information to be unavailable when someone looks for it on the site. Here is the log from the background service that rebuilds documents:

  20080211 16:59:32 [Engine] [book] Deleting isbn(1554700310). Status: 1
  20080211 16:59:32 [Engine] [book] [00:00:00:000] Getting table count
  20080211 16:59:34 [Engine] [book] [00:00:02:156] Rows loaded 1
  20080211 16:59:34 [Engine] [book] [00:00:02:156] Getting table schema
  20080211 16:59:34 [Engine] [book] [00:00:02:218] Getting data reader
  20080211 16:59:36 [Engine] [book] [16:59:36:000] Index dump started
  20080211 16:59:36 [Engine] [book] [00:00:00:078] Total indexed: 1
  20080211 16:59:36 [Engine] [book] [00:00:00:078] Optimizing index
  20080211 17:02:23 [Engine] [book] [00:02:46:917] Index finished

As you can see, from the moment the ISBN was deleted from the "book" index to the moment it was added back took only 4 seconds. But the call to Writer.Optimize() then takes almost three minutes (00:02:46 in the log above). Is optimizing the index even necessary at this point?

Any help is greatly appreciated.

-- Gautam Lad
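
P.S. For reference, here is a simplified sketch of what the live-update path does. The field names other than "isbn" and the exact field flags are illustrative, and it assumes a Lucene.Net build recent enough to have IndexWriter.DeleteDocuments(Term) (older builds delete through an IndexReader instead):

using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;

public class BookIndexUpdater
{
    public void UpdateBook(string indexPath, string isbn, string title, string description)
    {
        // Open the existing index (create = false) with the same analyzer used by the nightly rebuild.
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), false);
        try
        {
            // Remove the stale document for this ISBN...
            writer.DeleteDocuments(new Term("isbn", isbn));

            // ...then re-add it with the fresh data pulled from the database.
            Document doc = new Document();
            doc.Add(new Field("isbn", isbn, Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.Add(new Field("title", title, Field.Store.YES, Field.Index.TOKENIZED));
            doc.Add(new Field("description", description, Field.Store.YES, Field.Index.TOKENIZED));
            writer.AddDocument(doc);

            // This is the step in question: Optimize() merges all segments into one,
            // and it accounts for nearly all of the time shown in the log above.
            writer.Optimize();
        }
        finally
        {
            writer.Close();
        }
    }
}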
