We've been testing Jackrabbit for a few months now and are starting to roll it out into production. We recently noticed that under heavy load temp files (e.g. dbRecord12345.tmp) are not getting cleaned up. This occurs when the temp filesystem fills up and we get an exception like this:
org.apache.jackrabbit.core.data.DataStoreException: Can not read identifier 240b723f5a9fc1969f8f653382daf6a5aefff5f2: No space left on device: No space left on device

At this point Jackrabbit seems to abandon the files and makes no attempt to remove them from the temp directory. Obviously, this leaves the system in an unusable state until we manually purge the temp files. I assume the files I'm seeing are intended to be transient cache files, since our data store is in a central DB and I don't see any files lingering during normal operation.

So, I have a few questions:

1 - Can I configure the temp directory used by Jackrabbit? I don't want to set the temp directory globally, just for Jackrabbit.

2 - Is there a way to recover from this more gracefully and purge the temp files? Ideally Jackrabbit would purge its other transient files from the temp space so we could just wait for space to free up on the device.

3 - Why create so many short-lived files? I notice that when the problem occurs, there are many copies of the exact same file in the temp directory. It seems more appropriate to keep a single cache file around for a given item and manage the local cache at a set maximum size using an LRU algorithm.

4 - If I switch to a filesystem-based data store instead of a DB-based one, will Jackrabbit still create these transient temp files, or will it simply stream from the existing filesystem?

Any advice would be greatly appreciated!

-- Erik
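P.S. To clarify what I mean in question 3: something along the lines of Java's own LinkedHashMap in access-order mode, one cache file per identifier, with the eldest entry (and its file) evicted once the cache exceeds a fixed size. This is just an illustrative sketch of the idea, not anything from the Jackrabbit API; the class and method names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a bounded identifier -> local-temp-file cache,
// evicting least-recently-used entries. Not Jackrabbit code.
public class LruFileCache {
    private final int maxEntries;
    private final Map<String, String> cache; // identifier -> temp file path

    public LruFileCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true: iteration order is least-recently-accessed first
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // in a real cache we would also delete eldest's file from disk here
                return size() > LruFileCache.this.maxEntries;
            }
        };
    }

    public void put(String id, String path) { cache.put(id, path); }
    public String get(String id) { return cache.get(id); }
    public int size() { return cache.size(); }

    public static void main(String[] args) {
        LruFileCache c = new LruFileCache(2);
        c.put("a", "/tmp/a.tmp");
        c.put("b", "/tmp/b.tmp");
        c.get("a");               // touch "a", so "b" becomes the eldest entry
        c.put("c", "/tmp/c.tmp"); // exceeds the limit of 2, evicting "b"
    }
}
```

With a scheme like this there would only ever be one temp file per identifier, and total temp usage would be capped, instead of unbounded duplicates piling up until the device fills.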
