Re: Why MultiThreadedOCPTest (sometimes) fails

Erick Erickson Sat, 30 May 2020 14:26:04 -0700

Ilan:

Please raise a new JIRA and attach any fixes there, a Git PR or patch whichever 
you prefer. Your analysis will get lost in the noise for SOLR-12801. And thanks 
for digging! We usually title them something like “harden MultiThreadedOCPTest” 
or “fix failing MultiThreadedOCPTest” or similar.

I haven’t really looked at the code you’re analyzing, it’s the weekend after 
all ;). So ignore this question if it makes no sense. Anyway, I’m always 
suspicious of using delays for just this kind of reason; depending on the 
hardware something may fail sometimes. Is there any way to make the sequencing 
more of a positive lock? Unfortunately that can be tricky, I’ve wound up doing 
things like serializing all tasks which…er…isn’t a good fix ;). But it sounds 
like your “overkill” section is along those lines? 

Up to you.

Meanwhile, I’ve started beasting that test on my spare machine. If you’re 
curious about what that is: 
https://gist.github.com/markrmiller/dbdb792216dc98b018ad

But don’t worry about that part. The point is I can run the test multiple 
times, with N running in parallel, hopefully making my machine exhibit the 
problem sometime over the night. If I can get it to fail before but not after 
your fix, it provides some reassurance that your fix is working. Not totally 
certain of course. Otherwise, we’ll just commit your fixes and see if Hoss’ 
rollups stop showing it.

Thanks again!
Erick

> On May 30, 2020, at 1:42 PM, Ilan Ginzburg <ilans...@gmail.com> wrote:
> 
> MultiThreadedOCPTest

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Why MultiThreadedOCPTest (sometimes) fails

Reply via email to