Re: Indexing in multi-threaded environment

2005-05-10 Thread Doug Cutting
Chris Lamprecht wrote: I've done exactly what you describe, using N threads where N is the number of processors on the machine, plus one more thread that writes to the file system index (since that is I/O-bound anyway). Since most of the CPU time goes to tokenizing, stemming, and so on, this method works well.
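
The arrangement described above (N worker threads plus one writer thread) might look roughly like the sketch below. It is an illustration under stated assumptions, not code from the thread: the class and method names are hypothetical, plain List<String> batches stand in for per-thread RAMDirectory instances, and toLowerCase() stands in for the analysis work. With Lucene, the writer thread would presumably merge each batch through IndexWriter.addIndexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: List<String> batches stand in for per-thread RAMDirectory
// instances, and the "index" list stands in for the file system index.
class QueueIndexingSketch {
    // Sentinel batch that tells the writer thread all workers have finished.
    private static final List<String> POISON = new ArrayList<>();

    static List<String> indexAll(List<String> docs, int workers, int batchSize) {
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
        List<String> index = new ArrayList<>(); // modified only by the writer thread

        // Single writer thread: the only code that touches the index.
        Thread writer = new Thread(() -> {
            try {
                for (List<String> b = queue.take(); b != POISON; b = queue.take()) {
                    index.addAll(b); // with Lucene: merge the batch into the main index
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        // Worker threads: do the CPU-bound work and enqueue finished batches.
        List<Thread> pool = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int id = w;
            Thread t = new Thread(() -> {
                List<String> batch = new ArrayList<>();
                for (int i = id; i < docs.size(); i += workers) {
                    batch.add(docs.get(i).toLowerCase()); // stand-in for tokenizing/stemming
                    if (batch.size() == batchSize) {
                        queue.add(batch);
                        batch = new ArrayList<>();
                    }
                }
                if (!batch.isEmpty()) {
                    queue.add(batch);
                }
            });
            pool.add(t);
            t.start();
        }

        try {
            for (Thread t : pool) {
                t.join();
            }
            queue.add(POISON); // all workers are done, so let the writer drain and stop
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return index;
    }
}
```

A call such as QueueIndexingSketch.indexAll(docs, Runtime.getRuntime().availableProcessors(), 1000) would match the "one thread per processor" sizing mentioned above. The order of documents in the result depends on thread scheduling.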

Re: Indexing in multi-threaded environment

2005-05-03 Thread Chris Lamprecht
Hi Sodel, You could use a single queue, where one thread pulls things off the queue and any number of threads put things on the queue. You can index, say, 1000 documents at a time into a RAMDirectory in each of several threads, then enqueue the RAMDirectories. When the queue reaches a certain size, the single t

RE: Indexing in multi-threaded environment

2005-05-03 Thread Mufaddal Khumri
Hi, Calls to IndexWriter.addIndexes are synchronized, so your code should not have to do anything more than call it. I believe this is roughly the scenario you are looking for: - while (there is more data) - spawn a thread to handle creating documents for this data

RE: Indexing in multi-threaded environment

2005-05-03 Thread Peter Veentjer - Anchor Men
You should give only a single thread access to the IndexWriter. I have created an index updater that stores all the delete and write requests, and once in a while a thread (triggered by Quartz) processes the requests in a single batch. Another way would be synchronizing the index updater and only
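
The batching idea in this message could be sketched as follows. This is a guess at the shape of such an updater, not the poster's code: the class and method names are hypothetical, strings stand in for Lucene documents, and flush() is what a scheduler such as Quartz would call from its single thread. Applying deletes before adds is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: strings stand in for Lucene documents, and the "index" list
// stands in for the index that a single IndexWriter would own.
class BatchingIndexUpdater {
    private final List<String> pendingAdds = new ArrayList<>();
    private final List<String> pendingDeletes = new ArrayList<>();
    private final List<String> index = new ArrayList<>();

    // Any thread may record a request; the index itself is not touched here.
    synchronized void requestAdd(String doc) {
        pendingAdds.add(doc);
    }

    synchronized void requestDelete(String doc) {
        pendingDeletes.add(doc);
    }

    // Intended to be called periodically from one scheduler thread.
    // Applies all pending requests in a single batch and returns how many
    // requests were processed.
    synchronized int flush() {
        int processed = pendingAdds.size() + pendingDeletes.size();
        index.removeAll(pendingDeletes); // assumption: deletes are applied first
        index.addAll(pendingAdds);
        pendingDeletes.clear();
        pendingAdds.clear();
        return processed;
    }

    // Returns a copy of the current index contents for inspection.
    synchronized List<String> snapshot() {
        return new ArrayList<>(index);
    }
}
```

Callers only ever enqueue requests, so the index is modified from one place regardless of how many threads submit work.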