Hi All,
I have searched the Internet and the Jackrabbit codebase for information
on the recoverability of the Lucene index in a cluster scenario, but I'm
still not certain whether the index is really recoverable. I hope
someone can enlighten me.
To make failover of the Jackrabbit machine possible, we keep the index
files of each cluster node and the FileDataStore on a network share.
We use JNDIDatabaseJournal for clustering two nodes on the same machine.
The Jackrabbit version is 1.4 (the pom.properties inside Jackrabbit-Core
reads: Fri Jan 11 14:41:29 EET 2008, version=1.4).
As far as I understand from JCR-204
(http://issues.apache.org/jira/browse/JCR-204), which is still open,
some measures have been taken to make indexes recoverable.
Also JCR-905 (closed) and JCR-778 (closed) seem related.
In the past we have had issues where the connection to the network share
was unstable and the index became corrupted. We are trying to avoid that
by moving the index to a SAN with iSCSI. However, reindexing the entire
repository takes a lot of time, because we also index the content with
almost all text extractors (see:
http://jackrabbit.apache.org/api/1.4/org/apache/jackrabbit/extractor/package-summary.html),
so we would like to know whether Jackrabbit can completely recover from
this kind of situation. (BTW: we currently solve this by restarting the
app server Jackrabbit is running on, after which the auto-recovery kicks
in. I guess it is this one:
http://svn.apache.org/viewvc/jackrabbit/branches/1.3/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/lucene/Recovery.java?view=log&pathrev=544247#rev544247
)
If it can recover, why is JCR-204 still open? If it cannot recover, we
would have to use a local disk and would no longer be able to cluster
the machine, and (if I can find the time) I'll try to fix the issue.
Regards,
Johannes