I have an application that reads in XML files and indexes them. Each XML file is 3K-6K bytes. This application preloads a database that I will add to "on the fly" later. However, all I want it to do initially is take some existing files and create the initial index as quickly as I can.
Since I want to index "on the fly" later, I set the merge factor to 10. I'm assuming that I can't create the index initially with one merge factor (e.g., 100) and then change the merge factor later (true?).

What I see is that it takes 1-3 seconds per XML file to index. This means I'm indexing around 150K bytes per minute. I also notice that the CPU utilization rarely exceeds 5% (looking at Task Manager on a Windows box). I use Xerces to read in the files (SAX interface), and I don't close or optimize the index between stories, nor do I sleep anywhere. I've looked at the page fault numbers and they aren't changing much.

I guess I would have expected to pretty much peg the CPU and see much faster indexing. Any ideas/suggestions?
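The 1-3 seconds per file covers both the SAX parse and the index write, so one way to narrow this down is to time the parse stage on its own and convert the result to a throughput figure. The following is only a minimal sketch under assumptions: StageTimer, bytesPerMinute, and timeSaxParse are hypothetical names, the sample document is made up, and it uses the JDK's built-in JAXP SAX parser rather than a directly configured Xerces instance. It does not touch the index at all.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Lucene or Xerces.
final class StageTimer {

    /** Converts a byte count and an elapsed time in nanoseconds to bytes per minute. */
    static double bytesPerMinute(long totalBytes, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        return totalBytes * 60.0e9 / elapsedNanos;
    }

    /** Parses one in-memory document with a no-op SAX handler; returns elapsed nanoseconds. */
    static long timeSaxParse(byte[] xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            long start = System.nanoTime();
            // DefaultHandler ignores every event, so this measures parsing only.
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document standing in for one story file.
        byte[] doc = "<story><title>example</title><body>text</body></story>"
                .getBytes(StandardCharsets.UTF_8);
        long parseNanos = timeSaxParse(doc);
        System.out.println("parse time (ms): " + parseNanos / 1.0e6);

        // Midpoint of the figures above: a 4500-byte file taking 2 seconds.
        System.out.println("bytes/min: " + bytesPerMinute(4500, 2_000_000_000L));
    }
}
```

If the parse alone accounts for most of the per-file time, the bottleneck is before the index; if it is negligible, the same timing approach could be wrapped around the index write instead.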
