Julian Sedding created OAK-9515:
-----------------------------------
Summary: IndexCopier$DeleteOldDirOnClose#close() should handle
missing file
Key: OAK-9515
URL: https://issues.apache.org/jira/browse/OAK-9515
Project: Jackrabbit Oak
Issue Type: Bug
Components: lucene
Affects Versions: 1.22.4
Reporter: Julian Sedding
{{IndexCopier$DeleteOldDirOnClose#close()}} catches {{IOException}}. However,
it calls {{FileUtils#sizeOf(File)}}, which throws an unchecked
{{IllegalArgumentException}} if the file does not exist, so that exception
escapes the catch block.
{noformat}
22.07.2021 10:23:21.166 *WARN* [oak-lucene-5018] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProviderService Error occurred in asynchronous processing
java.lang.IllegalArgumentException: /path/to/repository/index/indexname-1626941798326/data does not exist
    at org.apache.commons.io.FileUtils.sizeOf(FileUtils.java:2541) [org.apache.commons.io:2.6.0]
    at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier$DeleteOldDirOnClose.close(IndexCopier.java:468) [org.apache.jackrabbit.oak-lucene:1.22.4]
    at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:304) [org.apache.jackrabbit.oak-lucene:1.22.4]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{noformat}
The [relevant block of
code|https://github.com/apache/jackrabbit-oak/blob/c6ddcc55bee3de915459af01e91edad32d538f3d/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/IndexCopier.java#L465-L475]
attempts to delete an old index directory, so skipping the deletion if the
directory does not exist seems appropriate.
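A minimal sketch of such a guard follows. It uses only JDK classes and is not
the actual Oak code; the class {{OldDirCleanup}} and the method
{{deleteIfPresent}} are hypothetical names chosen for illustration, and the
-1 return value for "already gone" is likewise an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Oak: deletes an old index directory
// and tolerates the case where the directory is already gone.
final class OldDirCleanup {

    private OldDirCleanup() {
    }

    /**
     * Deletes the given directory recursively.
     *
     * @return the number of bytes freed, or -1 if the directory did not exist
     */
    static long deleteIfPresent(File dir) throws IOException {
        // Guard: nothing to measure or delete, so skip instead of failing.
        if (!dir.exists()) {
            return -1L;
        }

        // Collect all paths, children before parents, so that each
        // directory is empty by the time it is deleted.
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir.toPath())) {
            paths = walk.sorted(Comparator.reverseOrder())
                    .collect(Collectors.toList());
        }

        long freed = 0L;
        for (Path p : paths) {
            if (Files.isRegularFile(p)) {
                freed += Files.size(p);
            }
            Files.delete(p);
        }
        return freed;
    }
}
```

If the directory disappears between the existence check and the walk, the JDK
calls used here raise {{NoSuchFileException}}, which is an {{IOException}}
and would therefore be covered by a catch block like the existing one.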
Unfortunately the log files I analysed don't reach far back into the past, so I
cannot say under which circumstances it can happen that the directory is
already removed. What I know is that a few days later an out-of-disk situation
was noticed, which was likely caused by repeated re-indexing runs. Furthermore,
I have seen a {{SegmentNotFoundException}} for one specific segment: once
during {{AsyncIndexUpdate#mergeWithConcurrencyCheck()}}, after a "successful"
reindexing run (i.e. {{IndexUpdate}} reports "Reindexing completed"), while
most other occurrences look like queries running into the missing segment.
The issue described here is surely not the root cause of the problem I am
analysing, but at the very least it adds unnecessary noise.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)