Greetings, below is a previous message on a topic I would like to revive.
I'm wondering if it's a week after stabilizing the release yet.

I've installed bweb and just started taking a look at it:
The use of it.
The updating process that adds new data to the database.
The queries it runs against the database.

Dirk

-------- Forwarded Message --------
From: Dirk Bartley <[email protected]>
To: Eric Bollengier <[email protected]>
Cc: Kern Sibbald <[email protected]>
Subject: Re: Bat version browser
Date: Wed, 25 Mar 2009 12:30:09 -0400

Well, I don't have questions as much as statements.

I think I have a relatively clear outline of what Eric is succeeding at,
but surely not the specifics. He has a procedure that is executed after
his last nightly job is complete. This procedure runs and populates a
table establishing parent and child relationships of directories. If I
remember right, it only has to be run for recently added or removed
jobs??

This would be quite useful when writing any gui interface!!!!!!! Tree
structure or otherwise. I could put a few more exclamation points here.
I'd probably modify my interface to populate only the roots of a backup,
then use mouse clicks to populate one directory at a time. What takes a
while now is querying for the entire list of directories. With an
indexed parent child relationship, WOW, that would be nice.

The question I would ask would be better raised a week after
stabilizing an initial release, to discuss what could be changed for
future releases.

I am not crazy about having to run a script before doing a restore. Not
that I don't appreciate the work Eric has done, I just think it might be
a difficult task to explain to someone that they can't do a good restore
until a script runs. I guess I could ask how long it takes to run. So
how long does it take to run????? I imagine it probably does not matter
if it takes an hour while you're sleeping.

I think a while ago there was quite a long thread about what we could
do to the schema to get information like that in the database.
The biggest problem I had was in being able to get a good index on the
very large file table. The file table has no column for job. So
therefore, there is no way to index and select all files for a given
list of jobs. There was a limit to the size of the list of jobs before
postgres would just choose to do a sequential search of the file table,
and boy would that take a while. Go get a cup of coffee. It seems to be
working well for me now, which might be the result of a postgres update.

If I had my choice, bacula's director would maintain parent child
relationships between directories like Eric does. It would also have in
the file table a column for parent directory. This may bring up the
option of having a separate table for directories. It would also allow
for an index on the selection of files in a directory and directories in
a directory. I think this conversation may be best for a week after
stabilizing a new release.

The other point is that I'm not sure whether that would even solve the
problem at hand. The issue which prompted this is that right now, when a
file is the root of a backup, no parent directory is added to the file
table. To solve this, I think there should always be a parent directory,
even if a file is the root of the fileset. Also useful would be a table
of roots of a backup.

Dirk

On Wed, 2009-03-25 at 10:34 +0100, Eric Bollengier wrote:
> Hello,
>
> On Wednesday 25 March 2009 08:03:40, Kern Sibbald wrote:
> > Hello Eric,
> >
> > Dirk is adding help file content to bat, and in seeing the diffs, it
> > reminded me of one problem bat has that I think you and Marc have resolved
> > in bweb, and that is knowing what the root(s) of the backup tree is. Here
> > are Dirk's comments:
> >
> > +<h2>A Version Browser Limitation</h2>
> > +
> > +There is an important limitation of the version browser. If a fileset
> > +specifically points to a file instead of a directory that contains files,
> > +it will not be seen in the version browser.
> > +This is due to the way that the
> > +version browser searches for directories first, then for files contained in
> > +those directories. A common example is with the catalog job.
> > +It by default points directly to one file created in a database dump.
> >
> > Could you help Dirk resolve this problem by explaining how you and Marc
> > corrected the problem, assuming it is the same problem you solved, and
> > point him to the necessary SQL?
>
> Well, this is not just a SQL procedure, it takes 150 lines of perl (that will
> give between 1500 and 3000 lines of C++, maybe less with Qt). I will try to
> explain how it works.
>
> First, you need to have a way to query the directory structure quickly. With
> a PathId, you should be able to get the parent Path or all possible subpaths.
> We need to be able to use the JOIN operator with the result.
>
> At this time, you can do it yourself, but you have to use multiple SELECTs and
> this is very slow. You can also use the LIKE operator, but it's also a bad
> idea.
>
> I have defined a global table for that:
> brestore_pathhierarchy (PPathId, PathId)
>
> For all systems, the parent of /usr/local/ is always /usr/, so this table is
> not too big and is quite stable.
>
> It is filled by "hand"; bresto.pl shows the code for that. You will find:
> - a function parent_dir() that returns the parent path (it splits off the
>   last part)
> - a function build_path_hierarchy() that checks that a path is present in
>   the table and inserts it if not
> - a function return_pathid_from_path() that checks and inserts a missing
>   path in the Path table (for example File=/usr/local/bin will create
>   Path=/usr/local/bin/ but not /usr/ and /usr/local/)
>
> => Warning, this is not compatible with the dbcheck option that deletes "non
> used" PathIds. If the user deletes "/", the pathhierarchy table will be
> corrupted, and you will have to rebuild it.
>
> The perl code uses a cache to reduce database work.
>
> After that, you need to have the list of all paths included in a backup. At
> this time, if you back up "File=/etc/passwd", the File table will contain
>
> FileId   Filename   Path    FileIndex
> 1        passwd     /etc/   1111
>
> You won't find a record for "/etc/" itself (nor for "/"), i.e. the marked
> row below:
>
> FileId   Filename   Path    FileIndex
> 1        ''         /etc/   2121   <---------
> 2        passwd     /etc/   1111
>
> At this time, in bweb, I have a special table for that, which contains all
> paths from a backup. I use this table because I don't want to insert "fake"
> records in the File table.
>
> bweb_pathvisibility (PathId, JobId)
>
> With these two tables and a simple JOIN, we will be able to list all
> subdirectories of a specific Path.
>
> First, this table is filled from the File table (cf bresto.pl:187).
> After that, we need to check that all parent paths are in this table too. We
> start by updating the brestore_pathhierarchy table by selecting missing
> records (cf bresto.pl:200) and we call build_path_hierarchy() on them.
>
> After that we can insert missing paths in brestore_pathvisibility() (cf
> bresto.pl:228).
>
> Now, with these two tables, you can query the backup content very quickly
> (with a single SELECT). (see ls_files() bresto.pl:348 and ls_dirs()
> bresto.pl:387)
>
> You can do ls_dirs('/'), or ls_files('/etc/')
>
> This "cache" can be updated with a batch or triggered when the user wants to
> restore a file.
>
> Now, time for questions...
>
> Bye
>
> > Many thanks,
> >
> > Kern

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
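
To make the two-table scheme from Eric's message easier to picture, here is a
minimal sketch in Python with an in-memory SQLite database. It is not the
bresto.pl code and not the real Bacula schema: the simplified Path and File
tables, the shortened table names pathhierarchy and pathvisibility, the helper
path_id() and the function update_visibility() are all assumptions made for
illustration. Only the function names parent_dir(), build_path_hierarchy(),
ls_dirs() and ls_files() come from the message, and their bodies here are
guesses at the behaviour described.

```python
import sqlite3

# Simplified stand-in schema (an assumption, not the Bacula catalog).
SCHEMA = """
CREATE TABLE Path (PathId INTEGER PRIMARY KEY, Path TEXT UNIQUE);
CREATE TABLE File (FileId INTEGER PRIMARY KEY, JobId INTEGER,
                   PathId INTEGER, Name TEXT);
CREATE TABLE pathhierarchy (PathId INTEGER PRIMARY KEY, PPathId INTEGER);
CREATE TABLE pathvisibility (PathId INTEGER, JobId INTEGER,
                             PRIMARY KEY (PathId, JobId));
"""

def parent_dir(path):
    """Return the parent of a directory path ending in '/', or None at a root."""
    trimmed = path.rstrip("/")
    if "/" not in trimmed:          # "/" itself, or a root such as "c:/"
        return None
    return trimmed.rsplit("/", 1)[0] + "/"

def path_id(db, path):
    """Return the PathId for path, inserting a Path row if it is missing."""
    row = db.execute("SELECT PathId FROM Path WHERE Path = ?", (path,)).fetchone()
    if row:
        return row[0]
    return db.execute("INSERT INTO Path (Path) VALUES (?)", (path,)).lastrowid

def build_path_hierarchy(db, path):
    """Walk up from path, recording each missing (PathId, PPathId) link."""
    while path is not None:
        pid = path_id(db, path)
        known = db.execute("SELECT 1 FROM pathhierarchy WHERE PathId = ?",
                           (pid,)).fetchone()
        if known:
            break                   # ancestors were recorded with this row
        parent = parent_dir(path)
        ppid = path_id(db, parent) if parent is not None else None
        db.execute("INSERT INTO pathhierarchy (PathId, PPathId) VALUES (?, ?)",
                   (pid, ppid))
        path = parent

def update_visibility(db, job_id):
    """Mark every path of a job, and every ancestor of it, as visible."""
    db.execute("INSERT OR IGNORE INTO pathvisibility (PathId, JobId) "
               "SELECT DISTINCT PathId, ? FROM File WHERE JobId = ?",
               (job_id, job_id))
    rows = db.execute("SELECT p.Path FROM pathvisibility v "
                      "JOIN Path p ON p.PathId = v.PathId WHERE v.JobId = ?",
                      (job_id,)).fetchall()
    for (path,) in rows:
        build_path_hierarchy(db, path)
        parent = parent_dir(path)
        while parent is not None:
            db.execute("INSERT OR IGNORE INTO pathvisibility VALUES (?, ?)",
                       (path_id(db, parent), job_id))
            parent = parent_dir(parent)

def ls_dirs(db, path, job_id):
    """List the direct subdirectories of path in one job with a single SELECT."""
    rows = db.execute(
        "SELECT p.Path FROM pathhierarchy h "
        "JOIN pathvisibility v ON v.PathId = h.PathId "
        "JOIN Path p ON p.PathId = h.PathId "
        "WHERE h.PPathId = (SELECT PathId FROM Path WHERE Path = ?) "
        "AND v.JobId = ? ORDER BY p.Path", (path, job_id))
    return [r[0] for r in rows]

def ls_files(db, path, job_id):
    """List the file names stored directly under path in one job."""
    rows = db.execute(
        "SELECT f.Name FROM File f JOIN Path p ON p.PathId = f.PathId "
        "WHERE p.Path = ? AND f.JobId = ? ORDER BY f.Name", (path, job_id))
    return [r[0] for r in rows]

# Job 1 backs up only File=/etc/passwd, the case from the thread.
db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO File (JobId, PathId, Name) VALUES (1, ?, 'passwd')",
           (path_id(db, "/etc/"),))
update_visibility(db, 1)
print(ls_dirs(db, "/", 1))       # directories directly under "/"
print(ls_files(db, "/etc/", 1))  # files directly under "/etc/"
```

In this sketch the hierarchy table is shared by all jobs while the visibility
table is per job, which is why a browser could start from the root and expand
one directory per click without scanning the whole file table.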
