Hi,

We faced a similar problem. The solution was to give the indexer less work and let the worker threads do the expensive part. Each worker produces pre-processed, analyzed and tokenized Documents, which the writer can then index without any further processing.
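A rough sketch of that arrangement, in plain Java without Lucene so that it stays self-contained. Everything named here (`PreprocessPipeline`, `PreparedDoc`, `prepare`, `run`, the queue size) is made up for illustration and is not our actual code. The `index.add(d)` line stands in for whatever call you make on the shared index writer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PreprocessPipeline {

    /** Hypothetical stand-in for a fully analyzed document, ready to be written. */
    static final class PreparedDoc {
        final String id;
        final List<String> tokens;
        PreparedDoc(String id, List<String> tokens) {
            this.id = id;
            this.tokens = tokens;
        }
    }

    // Sentinel that tells the writer thread no more documents will arrive.
    private static final PreparedDoc POISON = new PreparedDoc("", new ArrayList<>());

    /** The expensive part (parsing, tokenizing). Runs on the worker threads. */
    static PreparedDoc prepare(String id, String rawText) {
        List<String> tokens = new ArrayList<>();
        for (String t : rawText.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return new PreparedDoc(id, tokens);
    }

    /** Prepares all documents on a worker pool; returns how many were written. */
    static int run(List<String> rawDocs, int workers) throws InterruptedException {
        // Bounded queue, so workers block instead of piling up documents in memory.
        BlockingQueue<PreparedDoc> queue = new ArrayBlockingQueue<>(1000);
        List<PreparedDoc> index = new ArrayList<>(); // only touched by the writer thread

        // Single writer thread: it only stores documents, no parsing or analysis.
        Thread writer = new Thread(() -> {
            try {
                for (PreparedDoc d = queue.take(); d != POISON; d = queue.take()) {
                    index.add(d); // stand-in for the real index writer call
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < rawDocs.size(); i++) {
            final String id = "doc-" + i;
            final String raw = rawDocs.get(i);
            pool.submit(() -> {
                try {
                    queue.put(prepare(id, raw));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS); // wait until every worker is done
        queue.put(POISON);                        // then let the writer finish
        writer.join();
        return index.size();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> docs = Arrays.asList("Hello, Lucene world", "Index me quickly");
        System.out.println(run(docs, 4) + " documents written");
    }
}
```

The bounded queue is deliberate: if the workers are faster than the writer they block on `put` rather than filling the heap. If the writer thread is mostly idle with this layout, the bottleneck is reading and parsing, not indexing.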
Wouter

> Hi
>
> the file to be indexed depends on the type of Document / data extractor
> ....
>
> My Document types are usually XML, and every time 2+ million XML files
> are indexed the time taken is less than 5 minutes.
>
> with regards
> karthik
>
> On Fri, Nov 11, 2011 at 1:17 AM, Ian Lea <ian....@gmail.com> wrote:
>
>> And how long does it take just to read and parse the files, without
>> indexing them? Often that is the problem - nothing to do with lucene.
>>
>> There is plenty of good advice in
>> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed. A good match
>> on the subject of your message!
>>
>> --
>> Ian.
>>
>> On Thu, Nov 10, 2011 at 7:22 PM, Simon Willnauer
>> <simon.willna...@googlemail.com> wrote:
>> > can you provide more information about your setup? things like how
>> > much time does it take to index your documents, how many docs do you
>> > index, what are your index writer settings, how many cores do you
>> > have, where do you read from and write to (disks). oh and what version
>> > of lucene are you using?
>> >
>> > thanks,
>> >
>> > simon
>> >
>> > On Thu, Nov 10, 2011 at 10:40 AM, antony jospeh
>> > <antony.joseph.webm...@gmail.com> wrote:
>> >> Hi all,
>> >>
>> >> I have a large number of files in a directory that need to be
>> >> indexed. All the files are in a specific format and need to be
>> >> parsed to extract information, after which I have to index them.
>> >>
>> >> A single thread processed one file at a time, so I decided to use
>> >> multiple threads: the main thread loops over the directory and
>> >> passes each file to a pool of worker threads through a queue, and
>> >> all of the workers share the same index writer. However, there is
>> >> no significant change in indexing speed.
>> >>
>> >> Any hints about what I am doing wrong, or any suggestions?
>> >>
>> >> Thanks
>> >> Antony