Hoss Man created LUCENE-8692:
--------------------------------

             Summary: IndexWriter.getTragicException() may not reflect all 
corrupting exceptions (notably: NoSuchFileException)
                 Key: LUCENE-8692
                 URL: https://issues.apache.org/jira/browse/LUCENE-8692
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Hoss Man


Backstory...

Solr has a "LeaderTragicEventTest" which uses MockDirectoryWrapper's 
{{corruptFiles}} to introduce corruption into the "leader" node's index and 
then asserts that this Solr node gives up its leadership of the shard and 
another replica takes over.

This can currently fail sporadically (but usually reproducibly - see SOLR-13237) 
due to the leader not giving up its leadership even after the corruption 
causes an update/commit to fail.  Solr's leadership code makes this decision 
after encountering an exception from the IndexWriter, based on whether 
{{IndexWriter.getTragicException()}} is (non-)null.

----

While investigating this, I created an isolated Lucene-Core equivalent test 
that demonstrates the same basic situation:

* Gradually cause corruption on an index until (otherwise) valid execution of 
IW.add() + IW.commit() calls throws an exception to the IW client.
* Assert that if an exception is thrown to the IW client, 
{{getTragicException()}} is now non-null.
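
The shape of that check can be sketched roughly as follows. This is only an illustration, not the actual test: to keep it self-contained it does not use Lucene at all. {{TragicAwareWriter}}, {{TragicInvariant}} and {{findUnreportedFailure}} are hypothetical names invented here, with {{addAndCommit()}} standing in for the IW.add() + IW.commit() pair and {{corruptStep}} standing in for whatever introduces the corruption.

```java
import java.io.IOException;

/** Hypothetical stand-in for the two IndexWriter behaviours the test cares about. */
interface TragicAwareWriter {
    /** Models one IW.add() + IW.commit() round; may fail once the index is corrupt. */
    void addAndCommit() throws IOException;

    /** Models IndexWriter.getTragicException(): null means "nothing tragic recorded". */
    Throwable getTragicException();
}

final class TragicInvariant {
    private TragicInvariant() {}

    /**
     * Runs up to maxRounds rounds, applying corruptStep before each add+commit.
     * Stops at the first exception thrown to the client.
     *
     * @return the exception that reached the client WITHOUT a tragic exception
     *         being recorded (the situation described in this issue), or null
     *         if no exception was thrown or the tragic exception was non-null.
     */
    static IOException findUnreportedFailure(TragicAwareWriter writer,
                                             Runnable corruptStep,
                                             int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            corruptStep.run(); // gradually damage the (modelled) index
            try {
                writer.addAndCommit();
            } catch (IOException e) {
                // The invariant under test: a failure seen by the client
                // should be reflected by a non-null tragic exception.
                return writer.getTragicException() == null ? e : null;
            }
        }
        return null;
    }
}
```

A non-null return value corresponds to the failing case described here: the client saw an exception, but nothing was recorded as tragic.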

It's fairly easy to make my new test fail reproducibly -- in every situation 
I've seen, the underlying exception is a {{NoSuchFileException}} (i.e. the 
randomly introduced corruption was to delete some file).




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

