On 8/26/07, Dan Langille <[EMAIL PROTECTED]> wrote:
> On 26 Aug 2007 at 0:52, demerphq wrote:
[...]
> > > Can you explain the pros and cons over something like a parent-child
> > > relationship?  That each, each file entry has a link to the parent
> > > directory.  This would be a self referential table, with some entries
> > > being files, some being directories.  This is what I use for the
> > > FreshPorts database.
> > >
> > > See http://news.freshports.org/2007/08/03/freshports-database-primer/
> >
> > If lookup time is the objective then parent pointers are not the ideal
> > way to go, especially in SQL. Look up Celkos algorithms on storing a
> > nested heirarchy that allows containment queries using indexes.
> > (Nested set representation).
>
> I know Joe... did some minor stuff with him years ago.  If you say a
> simple self-referencing table won't do it, OK.  It works for
> FreshPorts.

I didnt say "wont do it", I said "not ideal" in reference to lookup times.

> I don't know enough about what you are trying to get into the result
> set to comment.  I don't have the time to review the email archive.
> I could comment on a design document which details the result set
> requirements.
>
>
> > The idea is fairly straight forward, each record has a Left and Right
> > value, any children will have left and right values such that P.Left
> > <= C.left and C.right <=P.right. An index on both means that full
> > heirarchical queries can be done efficiently. The trade off of course
> > is that insert/updates are more expensive, however if I understand it
> > right the scenario we are discussing is fetch heavy, in which case
> > using the nested set representation will probably be a net win.
>
> Sounds like a tree to me... B-tree almost...

It is a tree representation. Just as parent pointers (aka adjacency
lists) and adjacency matrixes can be used to represent trees. I think
nested sets are restricted to DAG's however, unlike parent pointers.
It provides efficient ways to do:

Select all and only children of a single node in a single query:

select C.* from tree P, tree C where P.left<= C.right and C.left<=
P.right and P.id=?

Select all and only parents of a single node in a single query:

select P.* from tree P, tree C where P.left<= C.right and C.left<=
P.right and C.id=?

[In the article I reference below Celko uses two between clauses for
the equivalent lookup instead of using the overlapping clause that I
have here,  you'd have benchmark them to see which performs better
although id guess the overlapping clause as it only does two
comparisons rather than the four required for two between clauses.]

BTW, It is often coupled with parent pointer representation as well as
the latter makes it easy to say "whos my parent" or to construct
client side tree from a returned result set in code.

The negative side of the nested set representation is that it requires
a fair amount of work on inserts, but fetches are fast. It also allows
efficient operations like "find all people with N people underneath
them" as well as "find all leaf nodes" etc. Both are tasks that aren't
easy with parent pointer representation.

There is an article by Celko online (if not more than one), a quick
google found:

http://www.dbmsmag.com/9603d06.html

Although im more familiar with it via the book "Joe Celko's Trees and
Hierarchies in SQL for Smarties"

http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/1558609202

Anyway, hope this is useful. :-)

Cheers,
yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel

Reply via email to