Re: Query Performance and Optimization

Michael Neale Tue, 06 Mar 2007 23:33:26 -0800

Hi Marcel - yes, it would be interesting. I guess to get the most out of it,
the node type definitions would have to come into play to generate DDL for
the database, so that each node type definition maps to a more "tuned"
database schema. Of course some concepts may not work that way, like
hierarchies or "nt:unstructured", in which case it would need to fall back
to the current style.
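
To make the DDL idea a bit more concrete, here is a rough Python sketch (not
Jackrabbit code). It assumes a node type can be reduced to a flat map of
property names to JCR property types; the type mapping, the NODE_ID column and
the function names are all made up for illustration, and anything like
"nt:unstructured" is deliberately out of scope.

```python
# Hypothetical mapping from JCR property types to generic SQL column types.
# A real implementation would need vendor-specific variants.
SQL_TYPES = {
    "String": "VARCHAR(255)",
    "Long": "BIGINT",
    "Double": "DOUBLE PRECISION",
    "Boolean": "BOOLEAN",
    "Date": "TIMESTAMP",
}


def sql_name(jcr_name):
    """Turn a prefixed JCR name such as 'my:title' into 'MY_TITLE'."""
    return jcr_name.replace(":", "_").upper()


def ddl_for_node_type(node_type, properties):
    """Build a CREATE TABLE statement for one node type.

    `properties` maps property names to JCR type names (an assumed,
    simplified stand-in for a real node type definition).
    """
    columns = ["NODE_ID CHAR(36) PRIMARY KEY"]  # assumed identifier column
    for prop_name, jcr_type in properties.items():
        columns.append(f"{sql_name(prop_name)} {SQL_TYPES[jcr_type]}")
    body = ",\n  ".join(columns)
    return f"CREATE TABLE {sql_name(node_type)} (\n  {body}\n)"


if __name__ == "__main__":
    print(ddl_for_node_type("my:article",
                            {"my:title": "String", "my:views": "Long"}))
```

Each typed property becomes its own column, which is what would let the
database build ordinary indexes on it.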


As for fulltext - database support varies with each vendor, so I would
hazard a guess that Lucene would still need to be part of it (that is the
approach the newer versions of Hibernate have taken - take full text
out of the hands of the database).

The DDL generation kind of scares me in terms of complexity, but I think
it's necessary to let the RDBMS "do its thing", so to speak.
ORM tools can certainly help here - we could avoid programmatically generating
DDL by instead generating a meta model that the ORM tools work off - just a
thought (let the ORMs generate the DB-specific schemas).

I know RDBMSs are a proven way to scale up, but as for content I am a
novice, so I am happy to follow the lead of those in the know on how best to
help Jackrabbit scale. So far I have not been that "whelmed" by the query
performance. I am using the SQL dialect because it's familiar, but I think its
familiarity makes me want to do things that it is perhaps not optimised for;
maybe that is my problem.

I should and will join the dev list, so as not to pollute the user list with
ponderings over Jackrabbit internals ;)

Thoughts?

Michael.

On 3/6/07, Marcel Reutegger <[EMAIL PROTECTED]> wrote:

Michael Neale wrote:
> I know from previous discussions that it is a design decision of Jackrabbit
> to not exclusively work with RDBMS - if it was, I would be all in favour of
> leaning on it to do the hard work.

please note that it is possible to exclusively use an RDBMS for storing and
querying content, though you have to create your own persistence manager and
query handler. the jackrabbit core does not force you to separate the store
and the index.

but you are right that it was a design decision to allow separation if you
want to. because jackrabbit initially only had plain file based persistence
managers and because lucene provides very good fulltext indexing we decided
to go with lucene.

coming back to the RDBMS only approach. you would have to implement a
persistence manager that stores nodes and properties in a way that allows the
database to use its indexes. then create a query handler that translates an
abstract query tree into a SQL statement based on the database schema.
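
as a rough illustration of the translation step (a Python sketch, not
jackrabbit's actual query tree classes or query handler interface), the code
below walks a tiny hand-made tree of comparisons joined by AND/OR and emits a
parameterized SQL statement. the node classes, the NODE_ID column and the
table and column names are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Comparison:
    column: str   # assumed to be an already validated column name
    op: str
    value: object


@dataclass
class And:
    left: "Node"
    right: "Node"


@dataclass
class Or:
    left: "Node"
    right: "Node"


Node = Union[Comparison, And, Or]

ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">=", "LIKE"}


def to_sql(node):
    """Return (where_fragment, params) for a query tree node."""
    if isinstance(node, Comparison):
        if node.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {node.op}")
        # Values are never inlined; they travel as bind parameters.
        return f"{node.column} {node.op} ?", [node.value]
    if isinstance(node, (And, Or)):
        left_sql, left_params = to_sql(node.left)
        right_sql, right_params = to_sql(node.right)
        keyword = "AND" if isinstance(node, And) else "OR"
        return f"({left_sql} {keyword} {right_sql})", left_params + right_params
    raise TypeError(f"unknown query node: {node!r}")


def select_for(table, tree):
    """Build a SELECT over one node-type table from a query tree."""
    where, params = to_sql(tree)
    return f"SELECT NODE_ID FROM {table} WHERE {where}", params
```

for example, select_for("MY_ARTICLE", And(Comparison("MY_VIEWS", ">", 10),
Comparison("MY_TITLE", "LIKE", "J%"))) yields a WHERE clause with two
placeholders and the parameter list [10, "J%"].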

there are some obstacles you will have to overcome (or actually the
database):
1) handle node hierarchies (e.g. get all ancestors of a certain node)
2) provide fulltext indexing
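
for obstacle 1, one possible approach (a sketch only, and it assumes the
database supports recursive common table expressions, which not every vendor
does) is to store each node with a reference to its parent and let the
database walk up the chain. the example uses Python's sqlite3 module; the
"node" table and its columns are invented for the illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory adjacency-list table: /content/articles/a1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES node(id),"
        " name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
        [(1, None, "root"), (2, 1, "content"), (3, 2, "articles"), (4, 3, "a1")],
    )
    return conn


def ancestors(conn, node_id):
    """Return ancestor names of a node, nearest parent first."""
    rows = conn.execute(
        """
        WITH RECURSIVE anc(id, parent_id, name, depth) AS (
            -- start with the direct parent of the requested node
            SELECT p.id, p.parent_id, p.name, 1
            FROM node n JOIN node p ON p.id = n.parent_id
            WHERE n.id = ?
            UNION ALL
            -- then keep following parent_id until the root is reached
            SELECT p.id, p.parent_id, p.name, anc.depth + 1
            FROM anc JOIN node p ON p.id = anc.parent_id
        )
        SELECT name FROM anc ORDER BY depth
        """,
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

the whole walk runs inside the database as a single statement, instead of
issuing one lookup per level from application code.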

I think this would be a very useful extension for jackrabbit. so, if anyone
is interested in implementing this, I'm very curious how well it performs
compared to the current implementation using lucene.

regards
  marcel
