I am currently implementing a MultiDataStore and would like to know
whether anyone else is interested. If so, I would provide a patch.

The background:
We store a huge amount of files in Jackrabbit (about 2 TB at the moment).
We use a DbDataStore running against a highly available Oracle cluster.
The problem is that we must keep files for up to 80 years to meet
government requirements, so the backend costs will increase every year.
The plan is now to move files, based on their age, from one DataStore to
another. The archive DataStore is mapped to a cheaper backend such as a
SATA RAID or a tape library.

This will be done by a DataStoreJanitor, a background process that moves
files to the other DataStore based on their modified date.
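
To illustrate the idea, here is a minimal sketch of one janitor pass. It is
not the planned patch and does not use the Jackrabbit API: StoredFile,
JanitorSketch and moveOlderThan are hypothetical names, and plain in-memory
maps stand in for the two DataStores. It assumes each file carries a
last-modified timestamp and that everything older than a cutoff is moved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical record type: file content plus its last-modified time.
class StoredFile {
    final byte[] data;
    final long lastModifiedMillis;

    StoredFile(byte[] data, long lastModifiedMillis) {
        this.data = data;
        this.lastModifiedMillis = lastModifiedMillis;
    }
}

// Hypothetical janitor: one pass over the primary store.
class JanitorSketch {
    // Moves every file modified before cutoffMillis from primary to archive
    // and returns the number of files moved.
    static int moveOlderThan(Map<String, StoredFile> primary,
                             Map<String, StoredFile> archive,
                             long cutoffMillis) {
        // Collect the ids first so the map is not modified while iterating.
        List<String> toMove = new ArrayList<>();
        for (Map.Entry<String, StoredFile> entry : primary.entrySet()) {
            if (entry.getValue().lastModifiedMillis < cutoffMillis) {
                toMove.add(entry.getKey());
            }
        }
        for (String id : toMove) {
            // Copy to the archive before removing from the primary, so a
            // file is never missing from both stores.
            archive.put(id, primary.get(id));
            primary.remove(id);
        }
        return toMove.size();
    }
}
```

A real janitor would run this periodically in a background thread and would
have to cope with files being read while they are moved.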

The MultiDataStore is only a DataStore wrapper containing two DataStores.
Appends go to the primary DataStore. Reads look in the primary DataStore
first and fall back to the archive DataStore if the file is not found
there. The GarbageCollector would remove files only from the archive
DataStore.
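
The read and write paths described above could be sketched roughly as
follows. This is only an illustration under simplifying assumptions:
SimpleStore, InMemoryStore and MultiStoreSketch are hypothetical names, and
SimpleStore is a reduced stand-in, not the Jackrabbit DataStore interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, reduced store interface (not Jackrabbit's DataStore).
interface SimpleStore {
    void put(String id, byte[] data);
    Optional<byte[]> get(String id);
}

// Trivial in-memory store, used here in place of a real backend.
class InMemoryStore implements SimpleStore {
    private final Map<String, byte[]> records = new HashMap<>();

    public void put(String id, byte[] data) {
        records.put(id, data);
    }

    public Optional<byte[]> get(String id) {
        return Optional.ofNullable(records.get(id));
    }
}

// Wrapper around two stores: writes go to the primary store only, reads
// try the primary store first and then fall back to the archive store.
class MultiStoreSketch implements SimpleStore {
    private final SimpleStore primary;
    private final SimpleStore archive;

    MultiStoreSketch(SimpleStore primary, SimpleStore archive) {
        this.primary = primary;
        this.archive = archive;
    }

    public void put(String id, byte[] data) {
        primary.put(id, data);
    }

    public Optional<byte[]> get(String id) {
        Optional<byte[]> hit = primary.get(id);
        return hit.isPresent() ? hit : archive.get(id);
    }
}
```

Callers only see the wrapper, so they do not need to know which of the two
stores currently holds a file.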

The configuration could look like:
<MultiDataStore class="org.apache.jackrabbit.core.data.MultiDataStore" >
  <primary>
    <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
    ...
    </DataStore>
  </primary>
  <archive>
    <DataStore class="org.apache.jackrabbit.core.data.FileDataStore"> 
    ...
    </DataStore>
  </archive>
</MultiDataStore>

Greetings,
Claus
