Marc:
If disk I/O is the bottleneck, you might try using a RAMDirectory
instead of an FSDirectory. You can open an IndexWriter on a
RAMDirectory, do your indexing in memory, and then write the index out
to disk afterwards.
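
The general pattern is to buffer work in memory and touch the disk only
once per batch. Here is a minimal, library-free Java sketch of that
pattern. It is not Lucene code: BatchedWriter, its batchSize parameter,
and the one-record-per-line file format are all made up for
illustration. In Lucene itself, as far as I know, the buffer would be an
IndexWriter opened on a RAMDirectory, and the flush step would be a call
to addIndexes(Directory[]) on a second IndexWriter opened on the
FSDirectory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Lucene): collects records in memory
 * and appends them to a single file one batch at a time.
 */
final class BatchedWriter implements AutoCloseable {
    private final Path target;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private int flushCount = 0;

    BatchedWriter(Path target, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.target = target;
        this.batchSize = batchSize;
    }

    /** Adds a record to the in-memory buffer; flushes once the batch is full. */
    void add(String record) throws IOException {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Appends all buffered records to the target file, then clears the buffer. */
    void flush() throws IOException {
        if (buffer.isEmpty()) {
            return; // nothing to write, so do not open the file at all
        }
        try (BufferedWriter out = Files.newBufferedWriter(
                target, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (String record : buffer) {
                out.write(record);
                out.newLine();
            }
        }
        buffer.clear();
        flushCount++;
    }

    /** Number of times the file has actually been opened and written. */
    int flushCount() {
        return flushCount;
    }

    /** Writes out any records still sitting in the buffer. */
    @Override
    public void close() throws IOException {
        flush();
    }
}
```

With a batch size of 5, adding 12 records triggers two flushes from
add() and a third from close(), so the file is opened three times
instead of twelve. A larger batch trades memory for fewer disk
operations.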
--David Goodstein
----- Original Message -----
From: Marc Dumontier <[EMAIL PROTECTED]>
Date: Tuesday, January 7, 2003 12:50 pm
Subject: significant performance issues
> Hi all,
>
> I just started trying to use Lucene to index approximately 13,000 XML
> documents representing biological data. Each document is
> approximately 20-30 KB.
>
> I modified some code from cocoon components to use SAX to parse my
> documents and create Lucene Documents. This process is very quick.
>
> The following code is where I started off to write the index to disk.
>
> writer = new IndexWriter(fsd, analyzer, true);
>
> Iterator myit = docList.iterator();
> while (myit.hasNext()) {
>     writer.addDocument((Document) myit.next());
>     System.out.println(++counter);
> }
> writer.close();
>
> This is taking much more time than expected. I'm using the
> StandardAnalyzer, and my XML data is about 20-30 KB per file.
> Indexing takes approximately 2-3 seconds per document, and it gets
> significantly slower as the index grows. I'm running this on a
> 2.4 GHz Linux machine with 1 GB of RAM.
>
> I tried a few different strategies, but I end up with "too many open
> files" exceptions.
>
> I don't think it should progressively slow down in proportion to the
> size of the index. Is this assumption wrong?
>
> Am I doing something wrong? Is there a way to use memory more and the
> filesystem less, and just dump the index to disk periodically?
>
> Any help would be appreciated. Thanks.
>
> Marc Dumontier
> Intermediate Developer
> Blueprint Initiative
> Mount Sinai Hospital
> http://www.bind.ca
--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>