Re: Query Performance and Optimization

Michael Neale Wed, 07 Mar 2007 15:35:52 -0800

On 3/7/07, Stefan Guggisberg <[EMAIL PROTECTED]> wrote:


On 3/7/07, Michael Neale <[EMAIL PROTECTED]> wrote:
> Hi Marcel - yes it would be interesting - I guess to get the most out of it,
> the node type definitions would have to come into play to generate DDL for
> the database - so the node type definitions will map to a more "tuned"
> database schema - of course some concepts may not work that way, like

i guess by "tuned" you mean a normalized schema. why do you think that
such a normalized schema would improve performance?



Mainly to allow the RDBMS to perform the queries natively.

> hierarchies, or "nt:unstructured" in which case it would need to use the
> current style.
>
> As for fulltext - database support varies with each vendor, so I would
> hazard a guess that lucene would still need to be part of it (that is the
> approach that the newer versions of hibernate have taken - take full text
> out of the hands of the database).
>
> The DDL generation kind of scares me, in terms of complexity, but I think
> it's necessary to let RDBMS "do its thing" so to speak?

why?



Mainly for queries. If we have a node type def that has something:title,
something:size etc., and they map to columns called something_title,
something_size in a table, we can get the RDBMS to do the indexing. However,
this is turning jackrabbit into a kind of ORM itself - probably not one of
the aims ;)
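
To make the idea concrete, here is a rough sketch of what generating DDL from a
node type definition could look like. Everything in it is an assumption for
illustration: the class and method names, the node_id column, the SQL column
types and the "something:document" node type are made up, and none of it is
Jackrabbit API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch only: none of these names come from Jackrabbit.
class NodeTypeDdlSketch {

    // Turn a JCR name such as "something:title" into a column-safe identifier.
    static String toIdentifier(String jcrName) {
        return jcrName.replace(':', '_').replaceAll("[^A-Za-z0-9_]", "_");
    }

    // Build a CREATE TABLE statement plus one CREATE INDEX per property.
    // 'properties' maps JCR property names to (assumed) SQL column types.
    static String generateDdl(String nodeTypeName, Map<String, String> properties) {
        String table = toIdentifier(nodeTypeName);
        StringJoiner columns = new StringJoiner(
                ",\n  ", "CREATE TABLE " + table + " (\n  ", "\n);\n");
        // Assumed surrogate key identifying the node (e.g. its UUID).
        columns.add("node_id CHAR(36) PRIMARY KEY");
        StringBuilder indexes = new StringBuilder();
        for (Map.Entry<String, String> p : properties.entrySet()) {
            String column = toIdentifier(p.getKey());
            columns.add(column + " " + p.getValue());
            indexes.append("CREATE INDEX idx_").append(table).append('_').append(column)
                   .append(" ON ").append(table)
                   .append(" (").append(column).append(");\n");
        }
        return columns.toString() + indexes;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("something:title", "VARCHAR(255)");
        props.put("something:size", "BIGINT");
        System.out.print(generateDdl("something:document", props));
    }
}
```

Indexing every column is only there to keep the sketch short; a real mapping
would have to decide which properties are worth indexing, and would still need
the current storage style for things like "nt:unstructured".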

> ORM tools can certainly help here - can avoid programmatically generating
> DDL by instead generating a meta model that ORM tools work off - just a
> thought (let the ORMs generate DB specific schemas).
>
> I know RDBMS are a proven way to scale up - but as for content, I am a
> novice, so I am happy to follow the lead of those in the know in how best to
> help jackrabbit scale. So far I have not been that "whelmed" by the query
> performance - I am using the SQL dialect because it's familiar, but I think its
> familiarity makes me want to do things that it is perhaps not optimised for,
> maybe that is my problem.
>
> I should and will join the dev list, so as to not pollute the user list with
> ponderings over jackrabbit internals ;)
>
> Thoughts?
>
> Michael.
>
> On 3/6/07, Marcel Reutegger <[EMAIL PROTECTED]> wrote:
> >
> > Michael Neale wrote:
> > > I know from previous discussions that it is a design decision of Jackrabbit
> > > to not exclusively work with RDBMS - if it was, I would be all in favour of
> > > leaning on it to do the hard work.
> >
> > please note that it is possible to exclusively use an RDBMS for storing and
> > querying content, though you have to create your own persistence manager and
> > query handler. the jackrabbit core does not force you to separate the store and
> > the index.
> >
> > but you are right that it was a design decision to allow separation if you want
> > to. because jackrabbit initially only had plain file based persistence managers
> > and because lucene provides very good fulltext indexing we decided to go with
> > lucene.
> >
> > coming back to the RDBMS only approach. you would have to implement a
> > persistence manager that stores nodes and properties in a way that allows the
> > database to use its indexes. then create a query handler that translates an
> > abstract query tree into a SQL statement based on the database schema.
> >
> > there are some obstacles you will have to overcome (or actually the
> > database):
> > 1) handle node hierarchies (e.g. get all ancestors of a certain node)
> > 2) provide fulltext indexing
> >
> > I think this would be a very useful extension for jackrabbit. so, if anyone is
> > interested in implementing this, I'm very curious how well it performs compared
> > to the current implementation using lucene.
> >
> > regards
> >   marcel
> >
>
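
On the first obstacle Marcel lists (getting all ancestors of a node): one way
a database could cope, sketched very roughly, is to store each node's full
path in an indexed column, so that the ancestor paths can be derived from the
path string and fetched with a single IN query. This is just one option and
purely an assumption for illustration: the class, the table and the column
names (node_id, path) are hypothetical, not part of Jackrabbit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: assumes a table with an indexed "path" column.
class AncestorPathSketch {

    // Derive all ancestor paths from an absolute path, root first.
    // "/a/b/c" yields "/", "/a" and "/a/b"; the root itself has no ancestors.
    static List<String> ancestorPaths(String path) {
        List<String> result = new ArrayList<>();
        if (path.equals("/")) {
            return result;
        }
        result.add("/");
        int idx = path.indexOf('/', 1);
        while (idx != -1) {
            result.add(path.substring(0, idx));
            idx = path.indexOf('/', idx + 1);
        }
        return result;
    }

    // Build a parameterised query with one placeholder per ancestor path.
    // Ancestor paths are prefixes of one another, so ORDER BY path also
    // orders them from the root downwards.
    static String ancestorQuery(String table, int ancestorCount) {
        if (ancestorCount < 1) {
            throw new IllegalArgumentException("the root node has no ancestors to query");
        }
        String placeholders = String.join(", ", Collections.nCopies(ancestorCount, "?"));
        return "SELECT node_id, path FROM " + table
                + " WHERE path IN (" + placeholders + ") ORDER BY path";
    }

    public static void main(String[] args) {
        List<String> ancestors = ancestorPaths("/content/articles/2007");
        System.out.println(ancestors);
        System.out.println(ancestorQuery("nodes", ancestors.size()));
    }
}
```

The catch with stored paths is that moving or renaming a node means rewriting
the path of every descendant, so this trades cheap reads for expensive moves.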
