Marc:
  If disk I/O is the bottleneck, you might try indexing into a
RAMDirectory instead of an FSDirectory.  You can open an IndexWriter
on a RAMDirectory, do your indexing in memory, and then write the
result out to disk afterwards.
--David Goodstein
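
The "index in memory, write to disk later" idea above can be sketched
without any Lucene classes. This is only an illustration of the batching
pattern, not Lucene code: BatchingBuffer and its sink callback are
hypothetical names. In a real setup the buffer would presumably be an
IndexWriter opened on a RAMDirectory, and the sink would merge that
directory into the on-disk index (for example with
IndexWriter.addIndexes, depending on your Lucene version).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers items in memory and hands them to a sink in fixed-size batches. */
class BatchingBuffer<T> implements AutoCloseable {
    private final int batchSize;
    // Hypothetical stand-in for "write this batch to the on-disk index".
    private final Consumer<List<T>> sink;
    private List<T> pending = new ArrayList<>();

    BatchingBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Adds one item; touches the sink only when a full batch has accumulated. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hands the pending items to the sink and starts a fresh in-memory batch. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }

    /** Flushes whatever is left, so a partial final batch is not lost. */
    @Override
    public void close() {
        flush();
    }

    public static void main(String[] args) {
        // 13 items with a batch size of 5: the sink is called three times.
        try (BatchingBuffer<String> buffer = new BatchingBuffer<>(
                5, batch -> System.out.println("flushing " + batch.size() + " documents"))) {
            for (int i = 0; i < 13; i++) {
                buffer.add("doc-" + i);
            }
        }
    }
}
```

The batch size is the knob: a larger batch means fewer trips to disk at
the cost of more heap, so it has to fit comfortably in the memory you
give the JVM.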


----- Original Message -----
From: Marc Dumontier <[EMAIL PROTECTED]>
Date: Tuesday, January 7, 2003 12:50 pm
Subject: significant performance issues

> Hi all,
> 
> I just started trying to use Lucene to index approximately 13,000 XML
> documents representing biological data. Each document is approximately
> 20-30 KB.
> 
> I modified some code from Cocoon components to use SAX to parse my 
> documents and create Lucene Documents. This process is very quick.
> 
> The following code is where I started off to write the index to disk.
> 
> writer = new IndexWriter(fsd, analyzer, true);
> 
> Iterator myit = docList.iterator();
> while (myit.hasNext()) {
>     writer.addDocument((Document) myit.next());
>     System.out.println(++counter);
> }
> writer.close();
> 
> This is taking much more time than expected. I'm using the 
> StandardAnalyzer, and my XML data is about 20-30 KB per file. The 
> indexing is taking approximately 2-3 seconds per document, and as the 
> index grows it gets significantly slower. I'm running this on a 2.4 GHz 
> Linux machine with 1 GB of RAM.
> 
> I tried a few different strategies, but I end up with "too many open 
> files" exceptions.
> 
> I don't think it should progressively slow down in proportion to the 
> size of the index. Is this assumption wrong?
> 
> Am I doing something wrong? Is there a way to use memory more and the 
> filesystem less, and just dump the index to disk periodically?
> 
> Any help would be appreciated. Thanks.
> 
> Marc Dumontier    
> Intermediate Developer
> Blueprint Initiative
> Mount Sinai Hospital
> http://www.bind.ca
> 
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-user-[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
> 
> 


