On 3/15/11 11:12 AM, Rakesh Vidyadharan wrote:
On 15 Mar 2011, at 03:19, Michael Wechner wrote:

On 3/15/11 12:19 AM, Rakesh Vidyadharan wrote:
On 14 Mar 2011, at 15:10, [email protected] wrote:

Hi,

Are there any organizations/companies that use Jackrabbit as their
production content management system? Can somebody name a few? And how many
files might there be in their systems?

And which approach is better, DB blob storage or file system storage, and what
are the pros and cons of each?

Thanks,
KS.
http://press.uchicago.edu/ - Built using Magnolia 4.4.2, which uses Jackrabbit
1.6.4 as the data store.

We are using the file system blob store.  The blob store tends to create a ton
of directories, which makes file system backup/restore quite slow.
Did you consider introducing a "failover environment"? We had similar problems, but by
"mirroring" the data (and the application) we don't have the problem of a slow restore in the
first place. We just switch the environment, and then do the restore for the system which
was previously the master. Hence the backup is only for the worst case, where both the master
and the mirror are down for whatever reason.

Cheers

Michael
Yes.  In fact, Magnolia uses a concept of distinct author and public instances
(with the ability to run multiple public instances per author instance).  In our
case, we have one author and two public instances, which means our data is
always replicated across three servers.  However, we still need to maintain
backup processes in case of catastrophic failures, and the recovery process is
only for such a case.

Right, and it's good if people are aware that such a restore can take quite a long time, so that
they are not surprised when such a catastrophic failure happens.
That said, the existence of mirrors does not remove the issue with so many
directories and files. Simple archiving and unarchiving (once in a while we
need to do that to update our development and QA instances) tends to take much
longer than it should because of this issue.  It is not a deal breaker, but
it could have been better.

You mean the actual implementation of the replication could be better? Besides improving the actual implementation, the question is what the alternatives are, e.g.

http://en.wikipedia.org/wiki/Replication_%28computer_science%29

?

Also, for the QA and development environments, an incremental approach could work: replicate incrementally instead of always starting from scratch, and after using the environment, just revert the changes (as you would with SVN, for example). But you are right, a "fresh checkout" will always take some time.
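
To illustrate the incremental idea, here is a minimal sketch in Python. It assumes the repository data is a plain directory tree that can be copied while nothing is writing to it; the function names and the size/mtime comparison are my own assumptions and not anything Jackrabbit or Magnolia provides. It only copies files that are missing or changed on the target, and it does not handle deletions or the "revert" step.

```python
import os
import shutil


def needs_copy(src, dst):
    """Return True if dst is missing, differs in size, or is older than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def incremental_sync(src_root, dst_root):
    """Copy only new or changed files from src_root to dst_root.

    Returns the sorted list of relative paths that were copied.
    Hypothetical helper for illustration; not part of any library.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            if needs_copy(src, dst):
                # copy2 preserves the modification time, so an unchanged
                # file is skipped on the next run.
                shutil.copy2(src, dst)
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

A second run over an unchanged tree copies nothing, so the cost is the directory walk itself rather than rewriting every file. With a very large number of directories the walk is still not free, and in practice a tool such as rsync does the same job.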

Cheers

Michael
Rakesh
