Ilan: Please raise a new JIRA and attach any fixes there, a Git PR or patch whichever you prefer. Your analysis will get lost in the noise for SOLR-12801. And thanks for digging! We usually title them something like “harden MultiThreadedOCPTest” or “fix failing MultiThreadedOCPTest” or similar.
I haven’t really looked at the code you’re analyzing, it’s the weekend after all ;). So ignore this question if it makes no sense. Anyway, I’m always suspicious of using delays for just this kind of reason; depending on the hardware something may fail sometimes. Is there any way to make the sequencing more of a positive lock? Unfortunately that can be tricky, I’ve wound up doing things like serializing all tasks which…er…isn’t a good fix ;). But it sounds like your “overkill” section is along those lines? Up to you. Meanwhile, I’ve started beasting that test on my spare machine. If you’re curious about what that is: https://gist.github.com/markrmiller/dbdb792216dc98b018ad But don’t worry about that part. The point is I can run the test multiple times, with N running in parallel, hopefully making my machine exhibit the problem sometime over the night. If I can get it to fail before but not after your fix, it provides some reassurance that your fix is working. Not totally certain of course. Otherwise, we’ll just commit your fixes and see if Hoss’ rollups stop showing it. Thanks again! Erick > On May 30, 2020, at 1:42 PM, Ilan Ginzburg <ilans...@gmail.com> wrote: > > MultiThreadedOCPTest --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org