Re: Scalability concerns, Alfresco performance tests

David Nuescheler Mon, 04 Dec 2006 05:23:13 -0800

Hi Andreas,

Now, a news message [1] on TheServerSide about benchmarks provided
by Alfresco to prove the superiority

ermhh.... let's say "state" not "prove" ;)

...of their JCR implementation raises some concerns.

I guess that this may exactly have been the intention ;)

Also, the term "JCR implementation" may not be technically
accurate. Maybe someone could point me to an updated
version of this:
http://wiki.alfresco.com/w/index.php?title=JSR-170_Compliance

A post in the thread claims that Jackrabbit isn't suited for
large-scale scenarios and faces some problems in the transactional
handling of some 100,000 nodes (Kev Smith, [2]):

While Kev possibly has reasons to believe that, I don't.
(Unless he talks about some 100k nodes in a single transaction
and a given memory size.)

"From what we've seen, Alfresco is comparable to JackRabbit for small
case scenarios - but Alfresco is much more scalable [...]"
Do you agree with this statement? If yes - are these problems related
to the persistence manager abstraction? Is this a known issue, and
will it be addressed?

I do not even remotely agree with this statement.
Jackrabbit has been built to scale freely in size.

I have a hard time understanding this argument since both Jackrabbit
and Alfresco can use the same RDBMS as the persistence layer, so
at least on the persistence layer there should not be a substantial
difference. Thoughts?

"We tried to load up JackRabbit with millions of nodes but always ran
into blocker issues after about 2 million or so objects. Also when
loading up JackRabbit, the load needed to be carefully performed in
small chunks e.g. trying to load in 100,000 nodes at a time would cause
PermGenSpace errors (even with a HUGE permgenspace!) and potentially
place the repo into a non-recoverable state."
I'm not sure if this will really be an issue for our usage
scenario (except maybe for restoring backups), but I'm very
interested in your opinions.

That's true, the size of the non-binary portions of a commit is
"currently" memory constrained.
"Backup/Restore" operations in my experience usually happen on the
persistence layer, which means that a restore operation (obviously) does
not go through the normal user API. I actually would go as far as stating
that it would be close to abuse of the API to go through the transient layer
to restore an entire content repository.
We are currently working on a solution for that, but since nobody had
a pressing need, it had a relatively low priority. If this is a pressing issue
for your project feel free to file a JIRA issue.
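
To illustrate the memory constraint on commits described above, here is a
minimal sketch of loading content in chunks, so that only a bounded number
of unsaved changes is held in memory at any time. It is an illustration
under stated assumptions, not Jackrabbit code: FakeSession, importInChunks
and the chunk size are hypothetical names and values, and FakeSession only
stands in for a JCR session. With the real javax.jcr API the corresponding
calls would be Node.addNode(String) and Session.save().

```java
import java.util.ArrayList;
import java.util.List;

class ChunkedImport {

    /**
     * Hypothetical stand-in for a JCR session: added nodes are kept
     * in memory as pending changes until save() is called.
     */
    static class FakeSession {
        private final List<String> pending = new ArrayList<>();
        private int persisted = 0;
        private int saves = 0;
        private int maxPending = 0;

        void addNode(String name) {
            pending.add(name);
            // Track the largest number of unsaved changes ever held at once.
            maxPending = Math.max(maxPending, pending.size());
        }

        boolean hasPendingChanges() {
            return !pending.isEmpty();
        }

        void save() {
            // "Commit": move the pending changes to the persisted count.
            persisted += pending.size();
            pending.clear();
            saves++;
        }

        int persisted() { return persisted; }
        int saves() { return saves; }
        int maxPending() { return maxPending; }
    }

    /** Adds 'total' nodes, saving after every 'chunkSize' additions. */
    static void importInChunks(FakeSession session, int total, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        for (int i = 0; i < total; i++) {
            session.addNode("node" + i);
            if ((i + 1) % chunkSize == 0) {
                session.save();
            }
        }
        // Flush the remainder that did not fill a whole chunk.
        if (session.hasPendingChanges()) {
            session.save();
        }
    }

    public static void main(String[] args) {
        FakeSession session = new FakeSession();
        importInChunks(session, 100_000, 1_000);
        System.out.println("persisted=" + session.persisted()
                + " saves=" + session.saves()
                + " maxPending=" + session.maxPending());
    }
}
```

The point of the sketch is only that the peak number of unsaved changes is
bounded by the chunk size rather than by the total size of the import. A
suitable chunk size for a real repository depends on the node sizes and the
available heap.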

regards,
david
