[ https://issues.apache.org/jira/browse/SOLR-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705052#comment-14705052 ]
Yonik Seeley commented on SOLR-7836:
------------------------------------

bq. pulls out the problematic open searcher in ulog.add to a separate method.

There are a few areas with complex synchronization that should not be changed unless one is confident about understanding why all the synchronization was there in the first place. Having the tests pass isn't a high enough bar for these areas, because it is so difficult to get a test to actually expose subtle race conditions or thread safety issues.

This comes back to my original "get it back in my head" - I don't feel comfortable messing with this stuff either until I've really internalized the bigger picture again... and it doesn't last ;-)

For the specific case above, one can't just take what was one synchronized block and break it up into two. It certainly creates race conditions and breaks the invariants we try to keep. The specific invariant here is that if it's not in the tlog maps, then it is guaranteed to be in the realtime reader. Hopefully some of our tests would fail with this latest patch... but it's hard stuff to test.

I worked up a patch that passed down the IndexWriter (it needs to be passed *all* the way down to SolrCore.openSearcher to actually avoid deadlocks). That ended up changing more code than I'd like... so now I'm working up a patch to make IW locking re-entrant. That approach should be less fragile going forward (i.e. less likely to easily introduce a deadlock through seemingly unrelated changes).

> Possible deadlock when closing refcounted index writers.
> --------------------------------------------------------
>
> Key: SOLR-7836
> URL: https://issues.apache.org/jira/browse/SOLR-7836
> Project: Solr
> Issue Type: Bug
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Fix For: Trunk, 5.4
>
> Attachments: SOLR-7836-reorg.patch, SOLR-7836-synch.patch,
> SOLR-7836.patch, SOLR-7836.patch, SOLR-7836.patch, deadlock_3.res.zip,
> deadlock_5_pass_iw.res.zip, deadlock_test
>
>
> Preliminary patch for what looks like a possible race condition between
> writerFree and pauseWriter in DefaultSolrCoreState.
> Looking for comments and/or why I'm completely missing the boat.
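
For readers following the comment above, here is a minimal sketch of why the single synchronized block matters. It is a toy model and not Solr's UpdateLog: the class TlogInvariantSketch, the pending map (standing in for the tlog maps) and the readerView snapshot (standing in for the realtime reader) are all hypothetical names invented for illustration. The point it tries to show is that clearing the map and publishing the new reader happen under one lock acquisition, so a lookup can never observe a document that is in neither place.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy model of the invariant; NOT the real Solr UpdateLog. */
class TlogInvariantSketch {
    private final Object lock = new Object();

    // Stand-in for the tlog maps: updates not yet visible in the reader.
    private Map<String, String> pending = new HashMap<>();

    // Stand-in for the realtime reader: a snapshot of published documents.
    private Map<String, String> readerView = new HashMap<>();

    /** Record an update; it is findable via the pending map right away. */
    public void add(String id, String doc) {
        synchronized (lock) {
            pending.put(id, doc);
        }
    }

    /**
     * Publish pending updates into a new snapshot and clear the map.
     * Both steps share ONE critical section. If this were split into two
     * synchronized blocks (clear first, publish later), a lookup running
     * between them could find the document in neither place.
     */
    public void reopen() {
        synchronized (lock) {
            Map<String, String> next = new HashMap<>(readerView);
            next.putAll(pending);
            readerView = next;
            pending = new HashMap<>();
        }
    }

    /** Invariant: if it's not in the pending map, it is in the reader view. */
    public String lookup(String id) {
        synchronized (lock) {
            String doc = pending.get(id);
            return doc != null ? doc : readerView.get(id);
        }
    }

    public static void main(String[] args) {
        TlogInvariantSketch log = new TlogInvariantSketch();
        log.add("doc1", "v1");
        System.out.println(log.lookup("doc1")); // served from the pending map
        log.reopen();
        System.out.println(log.lookup("doc1")); // served from the reader view
    }
}
```

A single-threaded run like the main method above cannot expose the race, which matches the point in the comment that passing tests are a weak signal for this kind of code.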