I have an application that reads in XML files and indexes them.  Each
XML file is 3K-6K bytes.  The application preloads a database that I will
add to "on the fly" later.  For now, though, all I want it to do is take
some existing files and create the initial index as quickly as I can.

Since I want to index "on the fly" later, I set the merge factor to 10.  I'm
assuming that I can't create the index initially with one merge factor
(e.g., 100) and then switch to a different one later (is that correct?).

What I see is that it takes 1-3 seconds per XML file to index.  That works
out to roughly 150K bytes per minute.  I also notice that CPU utilization
rarely exceeds 5% (according to Task Manager on a Windows box).  I use
Xerces (SAX interface) to read in the files.  I don't close or optimize the
index between stories, and I don't sleep anywhere.  I've looked at the page
fault numbers and they aren't changing much.  I would have expected to
pretty much peg the CPU and see much faster indexing.
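
For reference, here is a rough sketch of how the throughput figure falls out
of the numbers above, plus a way to time the SAX parse on its own so it can
be separated from the indexing step.  It assumes 1K = 1000 bytes and uses the
JDK's built-in SAX parser as a stand-in for Xerces.  The class and method
names are made up for illustration, and the indexing call itself is not
shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of the application itself.
final class IndexingThroughput {

    // Bytes processed per minute for a given file size and per-file time.
    static double bytesPerMinute(double bytesPerFile, double secondsPerFile) {
        return bytesPerFile * (60.0 / secondsPerFile);
    }

    // Times a single SAX parse of an in-memory document, in nanoseconds.
    // Parser creation is excluded; only the parse call is measured.
    static long timeSaxParseNanos(byte[] xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            long start = System.nanoTime();
            // DefaultHandler ignores every event, so this measures parsing only.
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Slowest, midpoint, and fastest cases from the figures quoted above.
        System.out.println(bytesPerMinute(3000, 3.0));
        System.out.println(bytesPerMinute(4500, 2.0));
        System.out.println(bytesPerMinute(6000, 1.0));

        // Made-up sample document; a real run would read one of the XML files.
        byte[] doc = "<story><title>t</title><body>text</body></story>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("parse ms: " + timeSaxParseNanos(doc) / 1_000_000.0);
    }
}
```

With these assumptions the range is about 60K to 360K bytes per minute, with
the midpoint (4.5K files at 2 seconds each) at 135K.  If the parse time alone
turns out to account for most of the 1-3 seconds, that would point at the XML
side rather than the indexer.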

Any ideas/suggestions? 
