Use Lucene SimpleFSDirectory by default?

Unico Hommes Thu, 09 Oct 2014 04:19:28 -0700

Hi all,

We have recently discovered a problem with the way Lucene index files
are accessed in Jackrabbit 2 when running with the default repository
configuration.


For reading and writing its indexes, Lucene can use different
implementations. By default, implementations based on the java.nio
package are preferred. These implementations, however, have the
following drawback:

"Accessing this class either directly or indirectly from a thread
while it's interrupted can close the underlying file descriptor
immediately if at the same time the thread is blocked on IO. The file
descriptor will remain closed and subsequent access to [this class]
will throw a ClosedChannelException."

So if the application at any point interrupts a thread while that
thread is accessing a Lucene index file, the file channel is closed
and every subsequent access throws an exception. Once that happens,
the application needs to be reinitialized before the Lucene index can
be used again.
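
The underlying java.nio behaviour can be reproduced without Lucene or
Jackrabbit. The following is only a minimal sketch using the JDK alone;
the class name InterruptDemo and the temporary file are made up for
illustration and are not part of either project. It reads from a
FileChannel with the thread's interrupt flag set, reads again after
clearing the flag, and then reads through java.io.RandomAccessFile with
the flag set.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Lucene or Jackrabbit.
class InterruptDemo {

    /** Returns a short description of the outcome of each of the three reads. */
    static String[] run() {
        String[] result = new String[3];
        try {
            Path file = Files.createTempFile("interrupt-demo", ".bin");
            try {
                Files.write(file, new byte[] {1, 2, 3, 4});
                try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
                     RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {

                    // 1. Read via the channel while the interrupt flag is set.
                    Thread.currentThread().interrupt();
                    try {
                        ch.read(ByteBuffer.allocate(4));
                        result[0] = "read ok";
                    } catch (ClosedByInterruptException e) {
                        result[0] = "ClosedByInterruptException";
                    }

                    // 2. Clear the flag and read again through the same channel.
                    Thread.interrupted();
                    try {
                        ch.read(ByteBuffer.allocate(4));
                        result[1] = "read ok";
                    } catch (ClosedChannelException e) {
                        result[1] = "ClosedChannelException";
                    }

                    // 3. Read via RandomAccessFile while the flag is set again.
                    Thread.currentThread().interrupt();
                    int firstByte = raf.read();
                    result[2] = "RandomAccessFile read byte " + firstByte;
                    Thread.interrupted(); // leave the thread in a clean state
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

The point of the second read is that clearing the interrupt flag does
not help: the channel stays closed until it is reopened.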

Refer to the discussion of this issue on the Lucene developer list at [1].

To say the least, this is very brittle behaviour.

Fortunately, the Jackrabbit SearchIndex can be configured with a
parameter, useSimpleFSDirectory, that selects a different Lucene IO
access implementation which does not suffer from this issue. With it,
Jackrabbit uses an FSDirectory that is based on access via
java.io.RandomAccessFile.
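
For reference, a sketch of how this might look in the SearchIndex
element of a workspace configuration. Only the useSimpleFSDirectory
parameter comes from the description above; the class attribute and
the path parameter are assumed from a typical Jackrabbit 2 setup and
may differ in your repository.xml.

```xml
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <!-- assumed: typical index location, adjust to your setup -->
  <param name="path" value="${wsp.home}/index"/>
  <!-- use the java.io.RandomAccessFile based FSDirectory -->
  <param name="useSimpleFSDirectory" value="true"/>
</SearchIndex>
```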

My question to the community is: would it be wise to make this the
default? Applications using Jackrabbit which run into this issue will
simply go down. This is a big price to pay for increased performance.

1. http://marc.info/?l=lucene-dev&m=126466874929360&w=4

--
Unico
