Dominique,

Over the past 3-4 years we have moved away from database persistence for file bodies, since a number of universities have 1 TB of data or more.

I have no problem putting the metadata in the DB, but if we put the bodies in as well the DBAs throw a fit. It is just about bearable for Oracle, although shifting backups starts to become a problem, but we have seen some interesting results when a few hundred GB go into a MySQL DB under InnoDB, not least in query times.

So the question becomes: how bad, transactionally, is having a DB-based PersistenceManager with the content (the BLOBs) on the filesystem?

I might be getting confused at this point, and confusing you with my lack of knowledge and terminology... In my Workspace definition I am using:

<PersistenceManager class="org.sakaiproject.jcr.jackrabbit.sakai.SakaiPersistanceManager">
    <param name="schema" value="${db.dialect}"/>
    <param name="schemaObjectPrefix" value="jcr_${wsp.name}_"/>
    <param name="externalBLOBs" value="${content.filesystem}"/>
</PersistenceManager>

Here SakaiPersistanceManager simply overrides the getConnection() method of the standard DB persistence manager.
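
To make the override concrete, here is a minimal sketch of the pattern, not the actual Sakai or Jackrabbit code. BasePersistenceManager, DataSourcePersistenceManager and connect() are hypothetical names invented for illustration. The only thing taken from the description above is the idea of a getConnection() hook that a subclass overrides so the connection comes from a DataSource instead of DriverManager.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical stand-in for a DB persistence manager; NOT a Jackrabbit class.
class BasePersistenceManager {
    // JDBC URL, used only by the default DriverManager path.
    private final String url;

    BasePersistenceManager(String url) {
        this.url = url;
    }

    // Overridable hook: by default, ask DriverManager for a connection.
    protected Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url);
    }

    // All internal code obtains its connection through the hook above.
    Connection connect() throws SQLException {
        return getConnection();
    }
}

// Hypothetical subclass: the same manager, but fed by a container DataSource.
class DataSourcePersistenceManager extends BasePersistenceManager {
    private final DataSource dataSource;

    DataSourcePersistenceManager(DataSource dataSource) {
        super(null); // the JDBC URL is never used on this path
        this.dataSource = dataSource;
    }

    @Override
    protected Connection getConnection() throws SQLException {
        return dataSource.getConnection();
    }
}
```

Because everything else goes through connect(), the subclass changes only how the connection is acquired. Whether the real persistence manager keeps that connection or returns it to the pool is a separate question that this sketch does not address.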

The DB is a standalone MySQL or Oracle instance.

Any pointers would be extremely helpful.

Thanks
Ian


Dominique Pfister wrote:
Hi Ian,

On 4/27/07, Ian Boston <[EMAIL PROTECTED]> wrote:
One quick question: which parts of the repository filesystem {rep.home}
should be in shared space and which in local space on the cluster node?
I'm using content on filesystem.

In a clustered environment, using content on filesystem is not
recommended: since the journal contains only the modified item's
id, but not the content itself, all nodes have to save the content in
the same location. Changes made by one node in the cluster should be
isolated from other nodes until the change is actually committed, a
condition the filesystem-based persistence managers do not fulfill.

I'd rather use a database-based persistence manager, where the
database is running standalone and not embedded. If you already use
the DatabaseJournal with a JDBC datasource, it would probably make
sense to use the same database to save your repository data.

Kind regards
Dominique



Ian

Dominique Pfister wrote:
> Hi Ian,
>
> On 4/27/07, Ian Boston <[EMAIL PROTECTED]> wrote:
>> I want to extend or reimplement DatabaseJournal (core.cluster), but there
>> is a dependency on RecordInput, which is a protected class (or at least
>> default scope).
>>
>> So you can't extend AbstractDatabaseJournal except in the same package
>> (perhaps that's the answer).
>>
>> Is there a reason for this, or was it an oversight?
>
> This is definitely an oversight. Ideally, DatabaseJournal should have
> a protected method named "getConnection", that may be overridden to
> change the way a connection is acquired. I will file a bug for this.
>
>> The reason I want to extend it is that I am embedding Jackrabbit into Sakai
>> (www.sakaiproject.org) and I would prefer to use a DataSource rather
>> than a DriverManager-delivered connection... even if I get the connection
>> and keep it.
>
> For the time being, if using a DataSource is an absolute must, there
> is nothing else I can suggest than checking out the source code from
> svn, applying the required changes directly to your local copy and
> building a new, customized version.
>
> Kind regards
> Dominique


