jpountz commented on issue #12916: URL: https://github.com/apache/lucene/issues/12916#issuecomment-1866215329
> Should I take your PR and check with beasting on Policeman's Ryzen to see if I can reproduce with the above command? That would be great. I have a Ryzen too, though a AMD Ryzen 9 3900X. I'm not sure what factors influence the reproducibility of this bug. The good news is that the new test on #12959 fails 100% of the time for me without the fix. > Is there an explanation why it works better when making the method synchronized as suggested by the OpenJ9 people? My understanding is that the way that the bug manifests looks like this: - 2 threads T1 and T2 doing indexing, - a `ConcurrentyApproximatePriorityQueue` with concurrency = 2, ie. 2 sub queues, initially empty And then the following sequence of events: 1 T1 tries to pull a DWPT from the queue, it's empty so `DocumentsWriterPerThreadPool` creates a new DWPT. There is now 1 DWPT. 2 T2 tries to pull a DWPT from the queue, it's empty (because the only one created so far is taken by T1) so `DocumentsWriterPerThreadPool` creates a new DWPT. There are now 2 DWPTs. 3 T2 indexes a document 4 T1 indexes a document 5 T2 puts the DWPT back into the sub queue at index 1 6 T1 puts the DWPT back into the sub queue at index 0 7 T1 starts pulling a DWPT from the queue 8 T2 pulls the DWPT from the sub queue at index 0 9 T1 checks the sub queue at index 0, it's empty (since T2 just took it) so it keeps going 10 T2 indexes a document 11 T2 puts the DWPT back into the sub queue at index 0 12 T2 pulls the DWPT from the sub queue at index 1 13 T1 checks the sub queue at index 1, it's empty (since T2 just took it) so it keeps going 14 T1 checked all sub queues and could not get a DWPT, so `DocumentsWriterPerThreadPool` creates a one. There are now 3 DWPTs even though there are only 2 indexing threads. Making the method synchronized helps by forcing T2 to wait until T1 is done pulling a DWPT from the queue before starting doing the same. The PR tries to go a bit more granular to limit contention. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org