Re: [Bacula-devel] Query changes in the catalog browser and indexes

Dan Langille Sat, 25 Aug 2007 16:04:30 -0700

On 26 Aug 2007 at 0:52, demerphq wrote:

> On 8/25/07, Dan Langille <[EMAIL PROTECTED]> wrote:
> > On 25 Aug 2007 at 10:51, Eric Bollengier wrote:
> >
> > > On Saturday 25 August 2007 10:31:04 you wrote:
> > > > Hello Marc,
> > > >
> > > > Thanks for the information.  I think this is something that is probably
> > > > well worth the effort to put in the core Bacula code.  I'm going to 
> > > > study
> > > > it a bit to see how it might be organized into a project.  Code like 
> > > > this
> > > > might also help a lot for the simple restores that Bacula currently 
> > > > does by
> > > > speeding up the retrieval.
> > > >
> > > > Perhaps Eric could send me the SQL that creates the tables so that I can
> > > > see more clearly all the fields.
> > >
> > > Sql definitions :
> > >
> > >     CREATE TABLE brestore_knownjobid
> > >     (
> > >      JobId int4 NOT NULL,
> > >      CONSTRAINT brestore_knownjobid_pkey PRIMARY KEY (JobId)
> > >     );
> > >
> > >    CREATE TABLE brestore_pathhierarchy
> > >    (
> > >      PathId int4 NOT NULL,
> > >      PPathId int4 NOT NULL,
> > >      CONSTRAINT brestore_pathhierarchy_pkey PRIMARY KEY (PathId)
> > >    );
> > >
> > >    CREATE INDEX brestore_pathhierarchy_ppathid
> > >                           ON brestore_pathhierarchy (PPathId);
> > >
> > >     CREATE TABLE brestore_pathvisibility
> > >     (
> > >       PathId int4 NOT NULL,
> > >       JobId int4 NOT NULL,
> > >       Size int8 DEFAULT 0,                -- used for statistics
> > >       Files int4 DEFAULT 0,               -- used for statistics
> > >       CONSTRAINT brestore_pathvisibility_pkey PRIMARY KEY (JobId, PathId)
> > >     );
> > >
> > >     CREATE INDEX brestore_pathvisibility_jobid
> > >                           ON brestore_pathvisibility (JobId);
> > >
> > >
> > > For example, this is how we list files in a directory :
> > >
> > > SELECT File.FilenameId, listfiles.id, listfiles.Name, File.LStat, 
> > > File.JobId
> > >  FROM
> > >       (SELECT Filename.Name, max(File.FileId) as id
> > >        FROM File, Filename
> > >        WHERE File.FilenameId = Filename.FilenameId
> > >          AND Filename.Name != ''
> > >          AND File.PathId IN (     10,20,20202,20202        )
> > >          AND File.JobId IN (        1,2,3,4           )
> > >        GROUP BY Filename.Name
> > >        ORDER BY Filename.Name) AS listfiles,
> > > File
> > > WHERE File.FileId = listfiles.id
> > >
> > > I think, i will wrote a new class (something like Bvfs) which will
> > > do basic operations like in a real filesystem.
> > >
> > > ch_dir(pathid)
> > >       Change current directory to pathid
> > > ls_dirs()
> > >       List all directories in the current directory (pathid and jobid)
> > > ls_files()
> > >       List all files in a the current directory (pathid and jobid)
> > > up_dir()
> > >       Change to parent directory
> > > pwd()
> > >       Get the current pathid
> > > get_pathid(path)
> > >       Return pathid from a given path
> > > get_root()
> > >       Get root pathid
> > > get_all_file_versions(...)
> > >       Get all versions for a file
> >
> > Can you explain the pros and cons over something like a parent-child
> > relationship?  That each, each file entry has a link to the parent
> > directory.  This would be a self referential table, with some entries
> > being files, some being directories.  This is what I use for the
> > FreshPorts database.
> >
> > See http://news.freshports.org/2007/08/03/freshports-database-primer/
> 
> If lookup time is the objective then parent pointers are not the ideal
> way to go, especially in SQL. Look up Celkos algorithms on storing a
> nested heirarchy that allows containment queries using indexes.
> (Nested set representation).


I know Joe... did some minor stuff with him years ago.  If you say a 
simple self-referencing table won't do it, OK.  It works for 
FreshPorts.

I don't know enough about what you are trying to get into the result 
set to comment.  I don't have the time to review the email archive.  
I could comment on a design document which details the result set 
requirements.

> The idea is fairly straight forward, each record has a Left and Right
> value, any children will have left and right values such that P.Left
> <= C.left and C.right <=P.right. An index on both means that full
> heirarchical queries can be done efficiently. The trade off of course
> is that insert/updates are more expensive, however if I understand it
> right the scenario we are discussing is fetch heavy, in which case
> using the nested set representation will probably be a net win.

Sounds like a tree to me... B-tree almost...

-- 
Dan Langille - http://www.langille.org/
Available for hire: http://www.freebsddiary.org/dan_langille.php



-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel

Re: [Bacula-devel] Query changes in the catalog browser and indexes

Reply via email to