Hi, I think I've hit a bug in ConcurrentMergeScheduler, but I'd like those who are more familiar with the code to review it. I ran TestStressSort.testSort() and started to get AIOOB exceptions from MergeThread. The CPU spiked to 98-100% and stayed there for a couple of minutes, until I was able to regain control and kill the process (it looks like an infinite loop).
To reproduce it, all you need to do is add the following line to PQ.initialize(): size = maxSize, and then you'll get the aforementioned exceptions. I did it accidentally, but I'm sure there's a way to reproduce it consistently with a JUnit test or something similar.

When I traced the test in the debugger, I noticed that MergeThreads are just spawned forever. The reason is this: in CMS.merge(IndexWriter) there's a 'while (true)' loop which does 'while (mergeThreadCount() >= maxThreadCount)' and, if that is false, just spawns a new MergeThread. On the other hand, in MergeThread.run there's a try-finally which executes whatever it needs to execute and, in the finally block, removes this thread from the list of threads. That causes CMS to spawn a new thread, which hits another exception and removes itself from the list, and then CMS spawns yet another thread. That puts the code into an infinite loop.

That sounds like a bug to me ... I think that if MergeThread hits any exception, the merge should fail? Anyway, the exception is added to an exceptions List, which is a private member of CMS but is never checked by CMS. Perhaps merge(IndexWriter) should check whether the exceptions list is non-empty and fail the merge in that case?

Anyway, I'll fix PQ's code now to continue my work, but if you want to reproduce it, it's as easy as adding size = maxSize to initialize() and running TestStressSort. I don't mind opening an issue and fixing it (though I'm not sure what the fix should be at the moment, I'll figure it out), but it will have to wait, so if you know the code and can put a patch together quickly, don't wait up for me :)

Shai
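
To make the respawn loop described above concrete, here is a minimal, self-contained Java sketch. It is not Lucene's code. The class RespawnSketch, its fields, and the spawnCap parameter are all hypothetical stand-ins, the worker always fails on purpose, and each worker is joined right away so the run is deterministic. The checkExceptions flag illustrates the suggested idea of having merge() look at the exceptions list; it is only one possible fix, not a confirmed one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the scheduler/worker interaction; names are illustrative only.
final class RespawnSketch {
    private final List<Thread> mergeThreads = new ArrayList<>();
    private final List<Throwable> exceptions = new ArrayList<>();
    private final int maxThreadCount = 3;
    private int spawned = 0;

    // A worker that always fails, standing in for a merge that hits the AIOOB exception.
    private Thread newFailingWorker() {
        return new Thread(() -> {
            try {
                throw new ArrayIndexOutOfBoundsException("simulated merge failure");
            } catch (Throwable t) {
                synchronized (this) { exceptions.add(t); } // recorded for later inspection
            } finally {
                synchronized (this) {
                    // The worker always removes itself, which frees a slot for a new thread.
                    mergeThreads.remove(Thread.currentThread());
                    notifyAll();
                }
            }
        });
    }

    // Returns how many workers were spawned. The loop described in the report is
    // 'while (true)'; spawnCap is an assumption added so this demo terminates.
    int merge(boolean checkExceptions, int spawnCap) {
        try {
            while (spawned < spawnCap) {
                Thread worker = newFailingWorker();
                synchronized (this) {
                    if (checkExceptions && !exceptions.isEmpty()) {
                        break; // sketched fix: stop once a worker has failed
                    }
                    while (mergeThreads.size() >= maxThreadCount) {
                        wait(); // wait for a free slot
                    }
                    mergeThreads.add(worker);
                    spawned++;
                }
                worker.start();
                worker.join(); // one worker at a time keeps the demo deterministic
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return spawned;
    }

    public static void main(String[] args) {
        System.out.println("without check: " + new RespawnSketch().merge(false, 50) + " spawned");
        System.out.println("with check: " + new RespawnSketch().merge(true, 50) + " spawned");
    }
}
```

Without the check, the loop keeps spawning failing workers until the artificial cap is reached (50 here); with the check, it stops after the first failure (1 spawned). A real fix would also need to decide how the failure is reported back to the caller, which this sketch leaves out.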