Hello,

We are facing a strange situation in our application, as described below:
*Using*:
- Python 3.8.10
- PyLucene 6.5.0
- Java 8 (1.8.0_181)
- Runs on Linux and Windows (error seen on Windows)

We suddenly get the following *error*:

    2022-02-10 09:58:09.253215: ERROR : writer | Failed to get index (D:\i\202202) writer, Exception: org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(MMapIndexInput(path="D:\i\202202\segments_fo")))

After this, no further indexing happens: trying to open the index for writing throws the above error, and the index writer does not open.

FYI, our code contains the following *settings*:

    from java.nio.file import Paths
    from org.apache.lucene.index import IndexWriter, IndexWriterConfig
    from org.apache.lucene.store import FSDirectory

    # Raw string, so that "\i" and "\202" are not read as escape sequences.
    index_path = r"D:\i\202202"
    index_directory = FSDirectory.open(Paths.get(index_path))
    iconfig = IndexWriterConfig(wrapper_analyzer)
    iconfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND)
    iconfig.setRAMBufferSizeMB(16.0)
    writer = IndexWriter(index_directory, iconfig)

*Repairing*

We tried 'repairing' the index with the following command / tool:

    java -cp lucene-core-6.5.0.jar:lucene-backward-codecs-6.5.0.jar org.apache.lucene.index.CheckIndex "D:\i\202202" -exorcise

This, however, reports "No problems found with the index."

*Workaround*

We have to manually delete the problematic segment file:

    D:\i\202202\segments_fo

after which the application starts again... until the next corruption. We can't spot a specific pattern.

*Two questions:*

1. Can we handle this situation programmatically, so that no manual intervention is needed?
2. Is there any reason why we are facing the corruption issue in the first place? Before this we were using PyLucene 4.10 and did not face this problem, and the application logic is the same. Also, while the application runs on both Linux and Windows, so far we have observed this situation only on various Windows platforms.

We would really appreciate some assistance.

Thanks in advance.

Regards,
Antony
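
For reference, the manual workaround above (removing the newest segments_N file) could be scripted roughly as follows. This is only a minimal sketch using the Python standard library, not a recommended fix: the helper names `latest_commit_file` and `quarantine_latest_commit` and the quarantine directory are hypothetical, and it moves the file aside instead of deleting it so that it can be restored. It relies on Lucene naming its commit files segments_<generation> with the generation written in base 36.

```python
import re
import shutil
from pathlib import Path
from typing import Optional

# Lucene commit files are named "segments_<generation>", with the
# generation encoded in base 36 (e.g. "segments_fo" is generation 564).
_SEGMENTS_RE = re.compile(r"^segments_([0-9a-z]+)$")


def latest_commit_file(index_dir: Path) -> Optional[Path]:
    """Return the segments_N file with the highest generation, or None."""
    best = None
    best_gen = -1
    for entry in index_dir.iterdir():
        match = _SEGMENTS_RE.match(entry.name)
        if match and entry.is_file():
            gen = int(match.group(1), 36)
            if gen > best_gen:
                best, best_gen = entry, gen
    return best


def quarantine_latest_commit(index_dir: Path, quarantine_dir: Path) -> Optional[Path]:
    """Move the newest segments_N file out of the index directory.

    Hypothetical helper: it mirrors the manual workaround, but moves the
    file instead of deleting it. Returns the new location, or None if the
    directory contains no segments_N file.
    """
    target = latest_commit_file(index_dir)
    if target is None:
        return None
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    destination = quarantine_dir / target.name
    shutil.move(str(target), str(destination))
    return destination
```

Such a helper would presumably be called from the handler that catches the CorruptIndexException, before retrying to open the IndexWriter. It does not check whether an earlier commit is still present in the directory, and it does nothing about the cause of the corruption.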