On 8/25/07, Dan Langille <[EMAIL PROTECTED]> wrote:
> On 25 Aug 2007 at 10:51, Eric Bollengier wrote:
>
> > On Saturday 25 August 2007 10:31:04 you wrote:
> > > Hello Marc,
> > >
> > > Thanks for the information.  I think this is something that is probably
> > > well worth the effort to put in the core Bacula code.  I'm going to study
> > > it a bit to see how it might be organized into a project.  Code like this
> > > might also help a lot for the simple restores that Bacula currently does 
> > > by
> > > speeding up the retrieval.
> > >
> > > Perhaps Eric could send me the SQL that creates the tables so that I can
> > > see more clearly all the fields.
> >
> > Sql definitions :
> >
> >     CREATE TABLE brestore_knownjobid
> >     (
> >      JobId int4 NOT NULL,
> >      CONSTRAINT brestore_knownjobid_pkey PRIMARY KEY (JobId)
> >     );
> >
> >    CREATE TABLE brestore_pathhierarchy
> >    (
> >      PathId int4 NOT NULL,
> >      PPathId int4 NOT NULL,
> >      CONSTRAINT brestore_pathhierarchy_pkey PRIMARY KEY (PathId)
> >    );
> >
> >    CREATE INDEX brestore_pathhierarchy_ppathid
> >                           ON brestore_pathhierarchy (PPathId);
> >
> >     CREATE TABLE brestore_pathvisibility
> >     (
> >       PathId int4 NOT NULL,
> >       JobId int4 NOT NULL,
> >       Size int8 DEFAULT 0,                -- used for statistics
> >       Files int4 DEFAULT 0,               -- used for statistics
> >       CONSTRAINT brestore_pathvisibility_pkey PRIMARY KEY (JobId, PathId)
> >     );
> >
> >     CREATE INDEX brestore_pathvisibility_jobid
> >                           ON brestore_pathvisibility (JobId);
> >
> >
> > For example, this is how we list files in a directory :
> >
> > SELECT File.FilenameId, listfiles.id, listfiles.Name, File.LStat, File.JobId
> >  FROM
> >       (SELECT Filename.Name, max(File.FileId) as id
> >        FROM File, Filename
> >        WHERE File.FilenameId = Filename.FilenameId
> >          AND Filename.Name != ''
> >          AND File.PathId IN (     10,20,20202,20202        )
> >          AND File.JobId IN (        1,2,3,4           )
> >        GROUP BY Filename.Name
> >        ORDER BY Filename.Name) AS listfiles,
> > File
> > WHERE File.FileId = listfiles.id
> >
> > I think, i will wrote a new class (something like Bvfs) which will
> > do basic operations like in a real filesystem.
> >
> > ch_dir(pathid)
> >       Change current directory to pathid
> > ls_dirs()
> >       List all directories in the current directory (pathid and jobid)
> > ls_files()
> >       List all files in a the current directory (pathid and jobid)
> > up_dir()
> >       Change to parent directory
> > pwd()
> >       Get the current pathid
> > get_pathid(path)
> >       Return pathid from a given path
> > get_root()
> >       Get root pathid
> > get_all_file_versions(...)
> >       Get all versions for a file
>
> Can you explain the pros and cons over something like a parent-child
> relationship?  That each, each file entry has a link to the parent
> directory.  This would be a self referential table, with some entries
> being files, some being directories.  This is what I use for the
> FreshPorts database.
>
> See http://news.freshports.org/2007/08/03/freshports-database-primer/

If lookup time is the objective then parent pointers are not the ideal
way to go, especially in SQL. Look up Celkos algorithms on storing a
nested heirarchy that allows containment queries using indexes.
(Nested set representation).

The idea is fairly straight forward, each record has a Left and Right
value, any children will have left and right values such that P.Left
<= C.left and C.right <=P.right. An index on both means that full
heirarchical queries can be done efficiently. The trade off of course
is that insert/updates are more expensive, however if I understand it
right the scenario we are discussing is fetch heavy, in which case
using the nested set representation will probably be a net win.

Just a thought.

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel

Reply via email to