On 8/26/07, Dan Langille <[EMAIL PROTECTED]> wrote: > On 26 Aug 2007 at 0:52, demerphq wrote: [...] > > > Can you explain the pros and cons over something like a parent-child > > > relationship? That each, each file entry has a link to the parent > > > directory. This would be a self referential table, with some entries > > > being files, some being directories. This is what I use for the > > > FreshPorts database. > > > > > > See http://news.freshports.org/2007/08/03/freshports-database-primer/ > > > > If lookup time is the objective then parent pointers are not the ideal > > way to go, especially in SQL. Look up Celkos algorithms on storing a > > nested heirarchy that allows containment queries using indexes. > > (Nested set representation). > > I know Joe... did some minor stuff with him years ago. If you say a > simple self-referencing table won't do it, OK. It works for > FreshPorts.
I didnt say "wont do it", I said "not ideal" in reference to lookup times. > I don't know enough about what you are trying to get into the result > set to comment. I don't have the time to review the email archive. > I could comment on a design document which details the result set > requirements. > > > > The idea is fairly straight forward, each record has a Left and Right > > value, any children will have left and right values such that P.Left > > <= C.left and C.right <=P.right. An index on both means that full > > heirarchical queries can be done efficiently. The trade off of course > > is that insert/updates are more expensive, however if I understand it > > right the scenario we are discussing is fetch heavy, in which case > > using the nested set representation will probably be a net win. > > Sounds like a tree to me... B-tree almost... It is a tree representation. Just as parent pointers (aka adjacency lists) and adjacency matrixes can be used to represent trees. I think nested sets are restricted to DAG's however, unlike parent pointers. It provides efficient ways to do: Select all and only children of a single node in a single query: select C.* from tree P, tree C where P.left<= C.right and C.left<= P.right and P.id=? Select all and only parents of a single node in a single query: select P.* from tree P, tree C where P.left<= C.right and C.left<= P.right and C.id=? [In the article I reference below Celko uses two between clauses for the equivalent lookup instead of using the overlapping clause that I have here, you'd have benchmark them to see which performs better although id guess the overlapping clause as it only does two comparisons rather than the four required for two between clauses.] BTW, It is often coupled with parent pointer representation as well as the latter makes it easy to say "whos my parent" or to construct client side tree from a returned result set in code. The negative side of the nested set representation is that it requires a fair amount of work on inserts, but fetches are fast. It also allows efficient operations like "find all people with N people underneath them" as well as "find all leaf nodes" etc. Both are tasks that aren't easy with parent pointer representation. There is an article by Celko online (if not more than one), a quick google found: http://www.dbmsmag.com/9603d06.html Although im more familiar with it via the book "Joe Celko's Trees and Hierarchies in SQL for Smarties" http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/1558609202 Anyway, hope this is useful. :-) Cheers, yves -- perl -Mre=debug -e "/just|another|perl|hacker/" ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ Bacula-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/bacula-devel
