I have a set of old Lucene index files, probably 15-20 years old. Every
index directory contains four files, something like these (the sizes
differ from directory to directory):

_0.cfx          47,942 KB
_s.cfs          1,78,687 KB
segments.gen    1 KB
segments_2      1 KB

When I try to open an index using PyLucene 10.0.0 or Lucene 10.0.0 (Java):

searcher = IndexSearcher(DirectoryReader.open(directory))

DirectoryReader throws this error:

org.apache.lucene.index.IndexFormatTooOldException: Format version is not
supported (resource
BufferedChecksumIndexInput(NIOFSIndexInput(path="D:\database\segments_2"))):
-9 (needs to be between 1071082519 and 1071082519). This version of Lucene
only supports indexes created with release 9.0 and later.
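
For reference, here is a small diagnostic sketch in plain Java (no Lucene dependency) that reads the first four bytes of a segments_N file and reports what they look like. It rests on an assumption inferred from the message above: the reader takes the first big-endian int of the file, expects 1071082519, and found -9 instead, which I take to be an old-style format marker. The class and method names are made up for illustration, and the path is the one from the error message.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene.
class SegmentsHeaderCheck {
    // The only value the 10.0.0 reader accepted, per the exception message.
    static final int EXPECTED_MAGIC = 1071082519; // 0x3FD76C17

    // Interpret the first four bytes as a big-endian int.
    static int firstInt(byte[] header) {
        return ByteBuffer.wrap(header).getInt(); // ByteBuffer defaults to big-endian
    }

    // Read exactly four bytes from the start of the file.
    static int readFirstInt(Path segmentsFile) throws IOException {
        byte[] header = new byte[4];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(segmentsFile))) {
            in.readFully(header);
        }
        return firstInt(header);
    }

    // Assumption: a negative value is a legacy format marker (such as -9).
    static String describe(int value) {
        if (value == EXPECTED_MAGIC) {
            return "header value expected by the current reader";
        }
        if (value < 0) {
            return "legacy format marker " + value;
        }
        return "unrecognized header value " + value;
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get(args.length > 0 ? args[0] : "D:\\database\\segments_2");
        System.out.println(describe(readFirstInt(p)));
    }
}
```

Running it against each index directory's segments_N file should show whether all of the old indexes carry the same marker before any conversion is attempted.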

I then tried older versions of Lucene (Java, not PyLucene) one by one,
and the index finally opened with Lucene 2.9.4. The code differs only
slightly:

File oldIndexDir = new File(INDEX_PATH);
Directory directory = FSDirectory.open(oldIndexDir);
IndexSearcher searcher = new IndexSearcher(directory, true); // read-only=true

What I need is to upgrade the indexes so that they can be opened with the
latest 10.0.0 version. One solution I can think of is to export the data
to an SQLite database first (reading it with Lucene 2.9.4) and then create
a new index by reading the data back with the latest version.

I am new to Lucene and have only recently started diving in. Does the API
offer any solution for this problem?

I would prefer a solution based on PyLucene, but if that is not possible,
a Java-based solution would also be fine.

Prashant
