Kevin A. Burton wrote:

Killeen, Tom wrote:

I am attempting to create approx 10 different Lucene indexes. I'm trying to
create them at the same time by running multiple processes and each index is
written to a new directory. Once I create more than one process - the
performance is very, very slow.


If they are IDE disks you should NOT try to build multiple indexes at once since your disks will bottleneck.

If you are using SCSI/Fibre disks you should be able to write out multiple indexes at once but you should benchmark it to make sure that it is in fact faster.

It is a downloadable application, so I don't know what the disks are ;-) I guess a common scenario is that someone indexes local and network directories. Then the parallel execution probably doesn't hurt, it might actually help. Of course it would be smarter to index server-side, but that is not yet implemented.


For the little application we wrote the indexing speed doesn't seem too important anyway. And it allows updating the index which means that benchmarking is harder -- although it probably also means that accessing multiple locations on the same drive hurts more, since the CPU is most likely used less in comparison to the disks if a document is up-to-date. But I can live with it, so I rather keep it for now. Maybe one day we change from the threaded model we have now to a queued model, but I am not sure if our users would be happy with it, since it means they either have to wait for the just requested update until everything else is finished, or we would have to add controls for the processing order, which bloats the UI.

Peter


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to