Hi

I think I've hit a bug in ConcurrentMergeScheduler, but I'd like those who
are more familiar with the code to review it. I ran
TestStressSort.testSort() and started to get AIOOB exceptions from
MergeThread. The CPU spiked to 98-100% and stayed there for a couple of
minutes, until I was able to regain control and kill the process (it looks
like an infinite loop).

To reproduce it, all you need to do is add the following line to
PQ.initialize(): size = maxSize. You will then get the aforementioned
exceptions. I did it accidentally, but I'm sure there's a way to reproduce
it with a JUnit test or something so that it happens consistently.

When I traced the test in a debugger, I noticed that MergeThreads are just
spawned forever. The reason is this: in CMS.merge(IndexWriter) there's a
'while (true)' loop which contains
'while (mergeThreadCount() >= maxThreadCount)', and once that condition is
false it just spawns a new MergeThread. On the other hand, in
MergeThread.run there's a try-finally which executes whatever it needs to
execute and, in the finally block, removes this thread from the list of
threads. That causes CMS to spawn a new thread, which hits another
exception and removes itself from the list, so CMS spawns yet another
thread. That puts the code into an infinite loop.
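
To make the loop described above concrete, here is a minimal sketch of the
same pattern. It is not the real CMS code: the class RespawnLoopSketch, the
spawnCap parameter and the simulated failure are all assumptions made for
illustration, and the real 'while (true)' is replaced by a capped loop so
the demo terminates.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scheduler; NOT the actual Lucene code.
class RespawnLoopSketch {
    static final int MAX_THREAD_COUNT = 3;

    final List<Thread> mergeThreads = new ArrayList<>();
    final List<Throwable> exceptions = new ArrayList<>();
    int spawned = 0;

    // Mirrors the shape of merge(IndexWriter): wait for a free slot, then spawn.
    // spawnCap is an assumption; the loop described in the mail has no cap.
    synchronized int merge(int spawnCap) {
        while (spawned < spawnCap) {
            while (mergeThreads.size() >= MAX_THREAD_COUNT) {
                awaitChange();
            }
            Thread t = new Thread(this::runMerge);
            mergeThreads.add(t);
            spawned++;
            t.start();
        }
        // Let the remaining threads finish before returning.
        while (!mergeThreads.isEmpty()) {
            awaitChange();
        }
        return spawned;
    }

    // Mirrors the shape of MergeThread.run: the merge always fails, the
    // exception is recorded, and the finally block frees the slot.
    void runMerge() {
        try {
            throw new ArrayIndexOutOfBoundsException("simulated merge failure");
        } catch (Throwable e) {
            synchronized (this) {
                exceptions.add(e); // recorded, but nobody ever looks at it
            }
        } finally {
            synchronized (this) {
                mergeThreads.remove(Thread.currentThread());
                notifyAll(); // wakes merge(), which then spawns a replacement
            }
        }
    }

    // Must be called while holding this object's monitor.
    private void awaitChange() {
        try {
            wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        RespawnLoopSketch s = new RespawnLoopSketch();
        int n = s.merge(20);
        System.out.println("spawned=" + n + " recorded=" + s.exceptions.size());
    }
}
```

Every failed thread frees its slot, so the scheduler side never runs out of
room to spawn the next one. Without the cap, nothing in this sketch would
ever stop the loop.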

That sounds like a bug to me ... I think that if MergeThread hits any
exception, the merge should fail? Anyway, the exception is added to an
exceptions List, which is a private member of CMS but is never checked by
CMS. Perhaps merge(IndexWriter) should check whether the exceptions list is
non-empty and fail the merge in that case?
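
The idea of checking the exceptions list could look roughly like the sketch
below. This is only an illustration of the suggestion, not a patch against
CMS and not necessarily the right fix: the class FailFastSchedulerSketch,
the choice of IllegalStateException and the simulated failure are all
assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scheduler; NOT the actual Lucene code.
class FailFastSchedulerSketch {
    static final int MAX_THREAD_COUNT = 3;

    private final List<Thread> mergeThreads = new ArrayList<>();
    private final List<Throwable> exceptions = new ArrayList<>();
    private int spawned = 0;

    synchronized int spawnedCount() {
        return spawned;
    }

    // Same loop shape as described in the mail, plus one extra check:
    // if any merge thread has recorded an exception, fail instead of respawning.
    synchronized void merge() {
        while (true) {
            while (mergeThreads.size() >= MAX_THREAD_COUNT) {
                awaitChange();
            }
            if (!exceptions.isEmpty()) {
                // Exception type is an assumption made for this sketch.
                throw new IllegalStateException("background merge failed", exceptions.get(0));
            }
            Thread t = new Thread(this::runMerge);
            mergeThreads.add(t);
            spawned++;
            t.start();
        }
    }

    // Simulated merge thread: always fails, records the exception, frees its slot.
    private void runMerge() {
        try {
            throw new ArrayIndexOutOfBoundsException("simulated merge failure");
        } catch (Throwable e) {
            synchronized (this) {
                exceptions.add(e);
            }
        } finally {
            synchronized (this) {
                mergeThreads.remove(Thread.currentThread());
                notifyAll();
            }
        }
    }

    // Must be called while holding this object's monitor.
    private void awaitChange() {
        try {
            wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        FailFastSchedulerSketch s = new FailFastSchedulerSketch();
        try {
            s.merge();
        } catch (IllegalStateException e) {
            System.out.println("merge failed after " + s.spawnedCount()
                    + " threads: " + e.getCause());
        }
    }
}
```

With the check in place, the sketch spawns at most MAX_THREAD_COUNT threads
before merge() gives up and reports the first recorded failure, instead of
spinning. Whether failing the whole merge is the right behaviour for the
real scheduler is exactly the open question above.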

Anyway, I'll fix PQ's code now to continue my work, but if you want to
reproduce it, it's as easy as adding size = maxSize to initialize() and
running TestStressSort.

I don't mind opening an issue and fixing it (though I'm not sure what the
fix should be at the moment, but I'll figure it out), but it will have to
wait, so if you know the code and can put a patch together quickly, don't
wait up for me :)

Shai
