This sounds like more of a producer/consumer problem than a Lucene.net problem. 
Here's some untested pseudo-code showing how to create a Task pool with a 
configurable size, set to 4 workers here, though 8-16 might be better on your 
hardware. Tasks are submitted to the pool quickly, then the pool works on them 
4 Tasks at a time until all Tasks complete:

https://pastebin.com/g0QKhCb1

According to 
https://lucenenet.apache.org/docs/3.0.3/class_lucene_1_1_net_1_1_index_1_1_index_writer.html#details
IndexWriter is thread-safe. Note that I reduced locking on the counter by only 
updating it at the end of each small batch, not after each Document is added. 
Reducing the batch size from 5000 to 2500 would give more frequent status 
updates.
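
For anyone who cannot open the pastebin link, the general shape of the idea 
can be sketched as follows. This is a Python illustration of the pattern, not 
the C# from the link, and FakeWriter, index_all and index_batch are made-up 
names rather than Lucene.net API. FakeWriter only stands in for the shared, 
thread-safe IndexWriter, and the threads here show the structure rather than 
real multi-core speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

WORKERS = 4        # pool size; 8-16 may suit a 32-core machine better
BATCH_SIZE = 5000  # documents per task; 2500 gives more frequent updates


class FakeWriter:
    """Hypothetical stand-in for a shared, thread-safe index writer."""

    def __init__(self):
        self._lock = threading.Lock()
        self.docs = []

    def add_document(self, doc):
        with self._lock:
            self.docs.append(doc)


def index_all(documents, writer, workers=WORKERS, batch_size=BATCH_SIZE):
    """Index documents in batches on a fixed-size pool.

    Returns the final count and the running totals recorded per batch.
    """
    done = 0
    progress = []
    counter_lock = threading.Lock()

    def index_batch(batch):
        nonlocal done
        for doc in batch:
            writer.add_document(doc)
        # Take the counter lock once per batch, not once per document.
        with counter_lock:
            done += len(batch)
            progress.append(done)

    batches = [documents[i:i + batch_size]
               for i in range(0, len(documents), batch_size)]

    # Submitting is quick; the pool runs at most `workers` batches at a time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(index_batch, b) for b in batches]
        for future in futures:
            future.result()  # re-raise any exception from a worker

    return done, progress
```

Calling index_all(list(range(12000)), FakeWriter()) splits the work into 
three batches (5000, 5000 and 2000 documents) and touches the counter lock 
three times instead of 12000 times.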

On 2023/01/09 01:27:42 BradelSablink wrote:
> I use Lucene.net 3.0.3 in an audio fingerprinting project and was wondering 
> how I could improve the indexing speed? It takes ~1 week to make indexes of 
> subfingerprints for 7+ million songs on a 32 core system with 64GB ram. I see 
> that only 1 CPU core is doing 100% of the indexing. How can I use multiple 
> cores to speed up indexing? Or maybe there's a better way to speed it up? I'm 
> a Lucene.net novice compared to all of you so thank you for any help. The 
> area in question where indexing is slow: 
> https://github.com/nelemans1971/AudioFingerprinting/blob/master/CreateInversedFingerprintIndex/Worker.cs#L237
