[Bacula-devel] [Fwd: Re: Bat version browser]

Dirk Bartley Tue, 16 Jun 2009 19:43:29 -0700

Greetings

Below is a previous message on a topic I would like to revive.

I'm wondering if it's a week after stabilizing the release yet.  I've
installed bweb and just started taking a look at it:
The use of it.
The updating process adding new data to the database.
The queries it does of the database.

Dirk

-------- Forwarded Message --------
From: Dirk Bartley
To: Eric Bollengier
Cc: Kern Sibbald
Subject: Re: Bat version browser
Date: Wed, 25 Mar 2009 12:30:09 -0400

Well, I don't have questions as much as statements.  I think I have a
relatively clear outline of what Eric is succeeding at but surely not
the specifics.  He has a procedure that is executed after his last
nightly job is complete.  This procedure runs and populates a table
establishing parent and child relationships of directories.  If I
remember right, it only has to be run for recently added or removed
jobs??

This would be quite useful when writing any GUI!!!!!!!  Tree
structure or otherwise.  I could put a few more exclamation points here.
I'd probably modify my interface to populate only the roots of a backup,
then use mouse clicks to populate one directory at a time.  What takes a
while now is querying for the entire list of directories.  With an
indexed parent-child relationship, WOW, that would be nice.

The question I would ask would be better asked a week after stabilizing an
initial release, to discuss what could be changed for future releases.  I
am not crazy about having to run a script before doing a restore.
Not that I don't appreciate the work Eric has done, I just think it
might be a difficult task to explain to someone that they can't do a good
restore until a script runs.  I guess I could ask how long it takes to
run.  So how long does it take to run?  I imagine it probably does
not matter if it takes an hour while you're sleeping.

I think a while ago there was a very long thread about what we
could do to the schema to get information like that in the database.

The biggest problem I had was in being able to get a good index on the
very large file table.  The file table has no column for job.  So
therefore, there is no way to index and select all files for a given
list of jobs.  There was a limit to the size of the list of jobs before
postgres would just choose to do a sequential search of the file table,
and boy would that take a while.  Go get a cup of coffee.  It seems to
be working well for me now, might be the result of a postgres update.

If I had my choice, bacula's director would maintain parent child
relationships between directories like Eric does.  It would also have in
the file table a column for parent directory.  This may bring up the
option of having a separate table for directories.  It would also allow
for an index on the selection of files in a directory and directories in
a directory.
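
To make the idea concrete, here is a minimal sketch of such a layout using
Python's sqlite3 module.  The table names, the column names and the job_id
column on the file table are all assumptions made for illustration; this is
not Bacula's actual catalog schema.

```python
import sqlite3

# Hypothetical schema: a separate directory table with a parent column,
# and a file table that points at its directory. Names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE directory (
        dir_id    INTEGER PRIMARY KEY,
        path      TEXT NOT NULL UNIQUE,
        parent_id INTEGER REFERENCES directory(dir_id)  -- NULL for a root
    );
    CREATE TABLE file (
        file_id INTEGER PRIMARY KEY,
        job_id  INTEGER NOT NULL,  -- assumed column, for filtering by job
        dir_id  INTEGER NOT NULL REFERENCES directory(dir_id),
        name    TEXT NOT NULL
    );
    -- Indexes for "directories in a directory" and "files in a directory".
    CREATE INDEX directory_parent_idx ON directory (parent_id);
    CREATE INDEX file_dir_job_idx ON file (dir_id, job_id);
""")

conn.executemany(
    "INSERT INTO directory (dir_id, path, parent_id) VALUES (?, ?, ?)",
    [(1, "/", None), (2, "/etc/", 1), (3, "/usr/", 1), (4, "/usr/local/", 3)],
)
conn.executemany(
    "INSERT INTO file (job_id, dir_id, name) VALUES (?, ?, ?)",
    [(1, 2, "passwd"), (1, 2, "hosts")],
)


def subdirs(conn, dir_id):
    """Return the paths of the directories directly under dir_id."""
    rows = conn.execute(
        "SELECT path FROM directory WHERE parent_id = ? ORDER BY path",
        (dir_id,),
    )
    return [r[0] for r in rows]


def files_in(conn, dir_id, job_id):
    """Return the names of the files stored in dir_id by job_id."""
    rows = conn.execute(
        "SELECT name FROM file WHERE dir_id = ? AND job_id = ? ORDER BY name",
        (dir_id, job_id),
    )
    return [r[0] for r in rows]
```

With something like this, a tree view could call subdirs() for one node per
mouse click instead of loading the entire list of directories up front.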

I think this conversation may be best for a week after stabilizing a new
release.

The other point is that I'm not sure whether that would even solve the
problem at hand.  The issue which prompted this is that right now, when
a file is the root of a backup, no parent directory is added to the file
table.  To solve this, I think there should always be a parent
directory, even if a file is the root of the fileset.  Also useful would
be a table of roots of a backup.

Dirk

On Wed, 2009-03-25 at 10:34 +0100, Eric Bollengier wrote:
> Hello,
> 
> On Wednesday 25 March 2009 08:03:40 Kern Sibbald, you wrote:
> > Hello Eric,
> >
> > Dirk is adding help file content to bat, and in seeing the diffs, it
> > reminded me of one problem bat has that I think you and Marc have resolved
> > in bweb, and that is knowing what the root(s) of the backup tree is.  Here
> > are Dirk's comments:
> >
> > +<h2>A Version Browser Limitation</h2>
> > +
> > +There is an important limitation of the version browser.  If a fileset
> > +specifically points to a file instead of a directory that contains files, it
> > +will not be seen in the version browser.  This is due to the way that the
> > +version browser searches for directories first, then for files contained in
> > +those directories.  A common example is with the catalog job.
> > +It by default points directly to one file created in a database dump.
> >
> > Could you help Dirk resolve this problem by explaining how you and Marc
> > corrected the problem, assuming it is the same problem you solved, and
> > point him to the necessary SQL?
> 
> Well, this is not just a SQL procedure, it takes 150 lines of perl (that will
> give between 1500 and 3000 lines of C++, maybe less with Qt). I will try to
> explain how it works.
>
> First, you need to have a way to query the directory structure quickly. With
> a PathId, you should be able to get the parent Path or all possible subpaths.
> We need to be able to use the JOIN operator with the result.
>
> At this time, you can do it yourself, but you have to use multiple SELECTs and
> this is very slow. You can also use the LIKE operator, but it's also a bad
> idea.
> 
> I have defined a global table for that
> brestore_pathhierarchy (PPathId, PathId)
> 
> For all systems, /usr/local/ parent is always /usr/, so this table is not too 
> big and is quite stable.
> 
> It is filled by "hand"; bresto.pl shows the code for that. You will find:
>  - a function parent_dir() that returns the parent path (it splits off the
>    last part)
>  - a function build_path_hierarchy() that checks that a path is present in
>    the table and inserts it if not
>  - a function return_pathid_from_path() that checks and inserts a missing
>    path in the Path table (for example File=/usr/local/bin will create
>    Path=/usr/local/bin/ but not /usr/ and /usr/local/)
>
> => Warning, this is not compatible with the dbcheck option that deletes
> "unused" PathIds: if the user deletes "/", the pathhierarchy table will be
> corrupted, and you will have to rebuild it.
> 
> The perl code uses a cache to reduce database work.
> 
> 
> 
> 
> Next, you need to have the list of all paths included in a backup. At this
> time, if you backup "File=/etc/passwd", the File table will contain
>
> FileId  Filename  Path   FileIndex
> 1       passwd    /etc/  1111
>
> You won't find a record for "/etc/" itself (nor for "/"), i.e. something like
> the first row here:
>
> FileId  Filename  Path   FileIndex
> 1       ''        /etc/  2121       <---------
> 2       passwd    /etc/  1111
>
> At this time, in bweb, I have a special table for that, which contains all
> paths from a backup. I use this table because I don't want to insert "fake"
> records in the File table.
> 
> bweb_pathvisibility (PathId, JobId)
> 
> With these two tables and a simple JOIN, we will be able to list all
> subdirectories of a specific Path.
>
> First, this table is filled from the File table (cf bresto.pl:187).
> After that, we need to check that all parent paths are in this table too. We
> start by updating the brestore_pathhierarchy table by selecting missing
> records (cf bresto.pl:200) and we call build_path_hierarchy() on them.
>
> After that we can insert the missing paths in brestore_pathvisibility (cf
> bresto.pl:228).
>
> Now, with these two tables, you can query the backup content very quickly
> (with a single SELECT). (see ls_files() bresto.pl:348 and ls_dirs()
> bresto.pl:387)
>
> You can do ls_dirs('/'), or ls_files('/etc/')
>
> This "cache" can be updated with a batch or triggered when the user wants to
> restore a file.
> 
> Now, time for questions...
> 
> Bye
> 
> > Many thanks,
> >
> > Kern
> 
> 
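
As a rough illustration of the scheme described in the quoted message, the
following Python/sqlite3 sketch builds a hierarchy table and a per-job
visibility table, then lists subdirectories with a single JOIN.  It is not the
bresto.pl code: the simplified Path table, the column types, the helper names
path_id() and make_visible(), and the extra job_id argument to ls_dirs() are
assumptions made for the example, and only Unix-style paths are handled.

```python
import sqlite3


def parent_dir(path):
    """Return the parent of a '/'-terminated path, or None for the root."""
    trimmed = path.rstrip("/")
    parent = trimmed[: trimmed.rfind("/") + 1]
    return parent or None


def path_id(conn, path):
    """Return the PathId for path, inserting a Path row if it is missing."""
    row = conn.execute(
        "SELECT PathId FROM Path WHERE Path = ?", (path,)
    ).fetchone()
    if row:
        return row[0]
    return conn.execute("INSERT INTO Path (Path) VALUES (?)", (path,)).lastrowid


def build_path_hierarchy(conn, path):
    """Link path to its ancestors, stopping at the first link already known."""
    parent = parent_dir(path)
    while parent is not None:
        child = path_id(conn, path)
        known = conn.execute(
            "SELECT 1 FROM brestore_pathhierarchy WHERE PathId = ?", (child,)
        ).fetchone()
        if known:
            return
        conn.execute(
            "INSERT INTO brestore_pathhierarchy (PPathId, PathId) VALUES (?, ?)",
            (path_id(conn, parent), child),
        )
        path, parent = parent, parent_dir(parent)


def make_visible(conn, job_id, path):
    """Record path and every ancestor as part of the backup job_id."""
    build_path_hierarchy(conn, path)
    while path is not None:
        conn.execute(
            "INSERT OR IGNORE INTO brestore_pathvisibility (PathId, JobId) "
            "VALUES (?, ?)",
            (path_id(conn, path), job_id),
        )
        path = parent_dir(path)


def ls_dirs(conn, job_id, path):
    """List the direct subdirectories of path that job_id backed up."""
    rows = conn.execute(
        """
        SELECT p.Path
        FROM brestore_pathhierarchy h
        JOIN brestore_pathvisibility v ON v.PathId = h.PathId
        JOIN Path p ON p.PathId = h.PathId
        WHERE h.PPathId = (SELECT PathId FROM Path WHERE Path = ?)
          AND v.JobId = ?
        ORDER BY p.Path
        """,
        (path, job_id),
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Path (PathId INTEGER PRIMARY KEY, Path TEXT NOT NULL UNIQUE);
    CREATE TABLE brestore_pathhierarchy (
        PPathId INTEGER NOT NULL,
        PathId  INTEGER PRIMARY KEY
    );
    CREATE INDEX brestore_pathhierarchy_ppathid
        ON brestore_pathhierarchy (PPathId);
    CREATE TABLE brestore_pathvisibility (
        PathId INTEGER NOT NULL,
        JobId  INTEGER NOT NULL,
        PRIMARY KEY (JobId, PathId)
    );
""")

# Job 1 backed up a file under /etc/ and one under /usr/local/bin/;
# job 2 backed up something under /home/.
make_visible(conn, 1, "/etc/")
make_visible(conn, 1, "/usr/local/bin/")
make_visible(conn, 2, "/home/")
```

Here ls_dirs(conn, 1, "/") lists "/etc/" and "/usr/" even though only files
deeper in the tree were backed up, because make_visible() also records every
parent path for the job.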
