[
https://issues.apache.org/jira/browse/SOLR-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erick Erickson updated SOLR-7836:
---------------------------------
Attachment: SOLR-7836-synch.patch
I modified the three refactored methods in DirectUpdateHandler2 to be:
{code}
// old code
if (ulog != null) ulog.add(cmd);

// new code
synchronized (solrCoreState.getUpdateLock()) {
  if (ulog != null) ulog.add(cmd);
}
{code}
and 200 iterations later there are no failures from TestStressReorder (I'll try
the new test code momentarily, but it'll take longer).
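For readers who want to see the pattern in isolation, here is a minimal,
self-contained sketch of guarding an update-log append with a shared lock. It
is not the Solr code: FakeUpdateLog, UpdateLockSketch, logUpdate and
runConcurrentAdds are hypothetical names, and the plain Object lock only stands
in for whatever solrCoreState.getUpdateLock() returns.

```java
import java.util.ArrayList;
import java.util.List;

public class UpdateLockSketch {
    // Hypothetical stand-in for the update log; deliberately NOT thread-safe.
    static class FakeUpdateLog {
        private final List<String> entries = new ArrayList<>();
        void add(String cmd) { entries.add(cmd); }
        int size() { return entries.size(); }
    }

    // Assumption: plays the role of solrCoreState.getUpdateLock().
    private final Object updateLock = new Object();
    private final FakeUpdateLog ulog;

    UpdateLockSketch(FakeUpdateLog ulog) { this.ulog = ulog; }

    // Same shape as the "new code" above: the append happens under the lock.
    void logUpdate(String cmd) {
        synchronized (updateLock) {
            if (ulog != null) ulog.add(cmd);
        }
    }

    // Runs several writer threads and returns how many entries were logged.
    static int runConcurrentAdds(int threads, int perThread) {
        FakeUpdateLog log = new FakeUpdateLog();
        UpdateLockSketch handler = new UpdateLockSketch(log);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    handler.logUpdate("doc-" + id + "-" + i);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join() also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentAdds(4, 1000));
    }
}
```

With the lock in place every append is serialized, so no entries are lost even
though the underlying list is not thread-safe.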
[[email protected]] [[email protected]]
Here's what I _don't_ like about this. None of these three operations (the
three new methods: addAndDelete, doNormalUpdate and allowDuplicateUpdate)
couples actually changing the index with the write to the ulog. Granted, only
the code in addAndDelete did so before, but we may have gotten away with that
because the indexWriter was grabbed at the very top of addDoc0() and
effectively locked other operations out. Maybe.
But if the ulog.add goes inside the synchronized block on the updateLock in
addAndDelete, there's certainly a deadlock, so putting it back the way it was
isn't really an option.
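The general shape of this kind of deadlock is two threads taking the same two
locks in opposite order. The sketch below illustrates only that general
pattern, not the actual Solr code paths: the class, the method names and the
"writerLock" are hypothetical, and tryLock with a timeout is used so the demo
reports the stall instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {

    // Returns how many of the two threads managed to acquire their second lock.
    static int acquiredBoth(long timeoutMillis) {
        ReentrantLock updateLock = new ReentrantLock(); // hypothetical
        ReentrantLock writerLock = new ReentrantLock(); // hypothetical
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger successes = new AtomicInteger();

        // Thread a takes updateLock then writerLock; thread b does the reverse.
        Thread a = new Thread(() -> attempt(updateLock, writerLock,
                bothHoldFirst, bothTried, successes, timeoutMillis));
        Thread b = new Thread(() -> attempt(writerLock, updateLock,
                bothHoldFirst, bothTried, successes, timeoutMillis));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return successes.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                AtomicInteger successes, long timeoutMillis) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With plain lock() this call would block forever.
            if (second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                try {
                    successes.incrementAndGet();
                } finally {
                    second.unlock();
                }
            }
            // Keep holding the first lock until the other thread has also tried.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(acquiredBoth(100));
    }
}
```

Neither thread gets its second lock, which is the situation that would be a
permanent hang with ordinary blocking acquisition.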
I'm starting to wonder if this isn't a bit backwards. Rather than going at this
piecemeal, what about synchronizing on updateLock at the top of addDoc0 _and_
in CoreContainer.reload() which drives one of the failure cases for deadlock?
Not sure I really like the idea, but I'll give it a (local) test....
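As a rough illustration of that coarser-grained idea, the sketch below takes
one lock at the top of both the add path and the reload path. Everything here
is a hypothetical stand-in (the class, the two lists, the reload check), not
DirectUpdateHandler2 or CoreContainer. It relies on Java monitors being
reentrant, so an inner synchronized on the same lock does not block its owner.

```java
import java.util.ArrayList;
import java.util.List;

public class CoarseLockSketch {
    private final Object updateLock = new Object();       // hypothetical shared lock
    private final List<String> index = new ArrayList<>(); // stand-in for the index
    private final List<String> ulog = new ArrayList<>();  // stand-in for the update log
    private int inconsistentReloads = 0;

    // Lock taken once at the top: the index change and the log write are coupled.
    void addDoc0(String doc) {
        synchronized (updateLock) {
            index.add(doc);
            logUpdate(doc);
        }
    }

    // Re-entering the same monitor from the owning thread is allowed.
    private void logUpdate(String doc) {
        synchronized (updateLock) {
            ulog.add(doc);
        }
    }

    // A reload under the same lock can never observe a half-finished update.
    void reload() {
        synchronized (updateLock) {
            if (index.size() != ulog.size()) inconsistentReloads++;
        }
    }

    // Runs one writer and one reloader; returns the number of inconsistent views seen.
    static int run(int adds, int reloads) {
        CoarseLockSketch core = new CoarseLockSketch();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < adds; i++) core.addDoc0("doc-" + i);
        });
        Thread reloader = new Thread(() -> {
            for (int i = 0; i < reloads; i++) core.reload();
        });
        writer.start();
        reloader.start();
        try {
            writer.join();
            reloader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        synchronized (core.updateLock) {
            return core.inconsistentReloads;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(5000, 5000));
    }
}
```

The trade-off is the usual one for a coarse lock: it is simpler to reason
about, but adds and reloads are fully serialized against each other.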
Meanwhile, I'll check the current fix in since it's certainly better pending
more experimentation.
> Possible deadlock when closing refcounted index writers.
> --------------------------------------------------------
>
> Key: SOLR-7836
> URL: https://issues.apache.org/jira/browse/SOLR-7836
> Project: Solr
> Issue Type: Bug
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Fix For: Trunk, 5.4
>
> Attachments: SOLR-7836-synch.patch, SOLR-7836.patch, SOLR-7836.patch,
> SOLR-7836.patch
>
>
> Preliminary patch for what looks like a possible race condition between
> writerFree and pauseWriter in DefaultSolrCoreState.
> Looking for comments and/or why I'm completely missing the boat.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)