Hi Jukka,

Performance
-----------
> To better prepare for solving such issues it would be good for us to
> have some standard performance benchmarks in place. They wouldn't need
> to be very complex; even something that simply populates a large
> workspace and retrieves all the stored content would be a good start,
> as long as the test is repeatable and produces usable reports.

That sounds like a great idea. The advantage is that if people have
specific performance requirements for use cases more complex than the
simple write-one-workspace / read-one-workspace scenario, the tests
could easily be extended to cover those too.
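To make the shape of such a benchmark concrete, here is a minimal sketch. It is not Jackrabbit code: the class name and a plain Map standing in for the workspace are placeholders, since a real JCR benchmark would need a running repository. The point is the structure a repeatable test needs: a timed populate phase, a timed read-back phase, and one machine-readable report line so runs can be compared over time.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleBenchmark {

    // Write `items` entries, read them all back, and return how many
    // were found, printing one report line with the timings.
    static int populateAndReadAll(Map<String, String> workspace, int items) {
        long writeStart = System.nanoTime();
        for (int i = 0; i < items; i++) {
            workspace.put("/content/node" + i, "value" + i);
        }
        long writeNanos = System.nanoTime() - writeStart;

        long readStart = System.nanoTime();
        int found = 0;
        for (int i = 0; i < items; i++) {
            if (workspace.get("/content/node" + i) != null) {
                found++;
            }
        }
        long readNanos = System.nanoTime() - readStart;

        // Repeatable, machine-readable report line.
        System.out.printf("items=%d writeMs=%.2f readMs=%.2f%n",
                found, writeNanos / 1e6, readNanos / 1e6);
        return found;
    }

    public static void main(String[] args) {
        populateAndReadAll(new HashMap<>(), 10_000);
    }
}
```

Swapping the Map for a real workspace (and the loop bodies for Node/Property operations) would turn this into the write-1-workspace / read-1-workspace test discussed above, and more complex use cases would just be additional timed phases.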

One very general issue that I see as a major performance bottleneck in
the current Jackrabbit design is the reliance on sequential operation
in many critical areas of the codebase. Perhaps the most glaring
example is the requirement to synchronize
DatabasePersistenceManager.store() even though the content being
stored is almost embarrassingly parallel. In the age of multicore
processors, found even in laptop computers, we should be looking for
every opportunity to parallelize the code.
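To illustrate what I mean, here is a sketch (not Jackrabbit's actual classes; the StripedStore name and map-backed storage are made up for the example) of replacing one coarse synchronized store() with per-item lock striping, so writers touching unrelated items no longer queue up behind a single lock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StripedStore {
    private static final int STRIPES = 16;
    private final Object[] locks = new Object[STRIPES];
    private final Map<String, String> items = new ConcurrentHashMap<>();

    public StripedStore() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new Object();
        }
    }

    // Only writers hashing to the same stripe contend with each other;
    // contrast with a single synchronized store() where every writer
    // serializes on the same monitor.
    public void store(String id, String data) {
        int stripe = Math.floorMod(id.hashCode(), STRIPES);
        synchronized (locks[stripe]) {
            items.put(id, data);
        }
    }

    public String retrieve(String id) {
        return items.get(id);
    }
}
```

With embarrassingly parallel content like independent node states, this kind of change lets store throughput scale with the number of cores instead of being pinned to one writer at a time.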

On a similar topic, I think addressing JCR-672 and dealing with the
deadlock problems in Jackrabbit once and for all :-) would be a great
idea. This has been an ongoing problem for a long time now, but each
attempt to address it seems to end with the conclusion that the
required work is a bit too big and scary to take on easily. Perhaps if
we added it to the roadmap for a specific version, we could bite the
bullet and take it on?
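For what it's worth, one standard remedy that any rework in this area could build on is enforcing a global lock-acquisition order. A minimal sketch (the OrderedLocking name and helper are hypothetical, not a proposal for the actual fix): if every session that needs two locks always takes them in canonical id order, two sessions locking the same pair of items can never end up waiting on each other.

```java
public class OrderedLocking {

    // Acquire both locks in a fixed order determined by comparing ids,
    // so no two callers can hold the locks in opposite orders.
    public static void withBoth(String idA, Object lockA,
                                String idB, Object lockB, Runnable work) {
        Object first = idA.compareTo(idB) <= 0 ? lockA : lockB;
        Object second = (first == lockA) ? lockB : lockA;
        synchronized (first) {
            synchronized (second) {
                work.run();
            }
        }
    }
}
```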

On the topic of performance, one thing I'd like to see included if
possible (perhaps this flows from some of the JSR-283 versioning
changes) is a versioning model that doesn't duplicate all data
between workspaces, but instead lets workspaces store pointers into a
central version history. This would vastly improve the performance of
cross-workspace operations, which are a major headache for us.
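To sketch the model I have in mind (class and method names invented for the example; plain Maps stand in for the workspaces): version bodies are stored once in a shared history, and each workspace keeps only a version id, so moving content between workspaces copies a small pointer rather than the full data.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedVersionHistory {
    private final Map<String, String> versions = new HashMap<>();
    private int nextId = 0;

    // Store the content once; return the id that workspaces point at.
    public String addVersion(String content) {
        String id = "v" + nextId++;
        versions.put(id, content);
        return id;
    }

    public String resolve(String versionId) {
        return versions.get(versionId);
    }

    public static void main(String[] args) {
        SharedVersionHistory history = new SharedVersionHistory();
        Map<String, String> workspaceA = new HashMap<>();
        Map<String, String> workspaceB = new HashMap<>();

        // Both workspaces reference the same stored version: a
        // cross-workspace copy moves only the id, not the content.
        String id = history.addVersion("large node state");
        workspaceA.put("/content/doc", id);
        workspaceB.put("/content/doc", workspaceA.get("/content/doc"));

        System.out.println(history.resolve(workspaceB.get("/content/doc")));
        // -> large node state
    }
}
```

The cross-workspace cost then becomes proportional to the number of pointers touched rather than the size of the versioned content itself.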

Miro
