OK, I've got a better handle on this now. First, a side issue that anyone dealing with PostgreSQL (PG) needs to know about.

PG uses MVCC. This allows many transactions to be in flight at once, and readers and writers do not block each other, so it has great concurrency.

But there is a drawback. Edit a row, and we get another version of that row. Any indexes on that table have to be updated, even if the field you are updating is not indexed (!!!!!!!!!)

PG 8.3 has an optimisation (Heap Only Tuples), but that only helps when the number of updates is much smaller than the number of rows. Fortunately Bacula doesn't do bulk updates like that, but you might, while developing.

I proposed two additional tables for general attributes. However, when I think of use cases, the only fields of interest are size, ctime and mtime, and we can add these to table 'file':

    ALTER TABLE file add column size bigint default 0;
    ALTER TABLE file add column ctime timestamp without time zone;
    ALTER TABLE file add column mtime timestamp without time zone;

There is no need to remove lstat, so compatibility is assured.

Speed impact on database inserts is too small to notice.
Speed impact on retrieving the fields from the fd is trivial, since the files are already stat'd.
Space impact is minor.

Benefits. It's now possible to estimate a restore and possible to find versions of files based on mtime. Ad hoc queries are now more useful.

Proposal. That this DB change be done with the next major release of Bacula. Code updates to use it can then be done at leisure without user-visible changes.

--John

On Mon, 22 Sep 2008 08:43:48 John Huttley wrote:
> I've been running some tests and I have some postgresql performance
> problems.
> On the file table with 2.7M records in it, updating a non-indexed text
> field is taking 570sec with a huge IO load. I can do a complete db load in
> only 94sec!
>
> It may take me a while to figure this out. I'll report back when i have
> some believable figures.
>
> --john
>
> John Huttley wrote:
> > Kern, I think you are on the wrong foot here. You mention 100G as if
> > it has some significance. It doesn't.
> > You can be sure that if the DB is size N then they will have 5xN
> > storage available. The % change and speed impact is likely
> > to be the same for any N where N > the system ram.
> >
> > We don't actually know what the space change will be or the speed
> > impact on inserts.
> >
> > I'll run some tests tonight and see if I can get some numbers.
> >
> > --John
> >
> > Kern Sibbald wrote:
> >> On Saturday 20 September 2008 13:40:29 Yuri Timofeev wrote:
> >>
> >> <Snip>
> >>
> >>> But for me it is obvious that we should remove LStat and make multiple
> >>> fields (FileSize, atime, mtime, etc) instead Lstat.
> >>
> >> That might be nice for people who want to access the database directly,
> >> but for someone who has a 100GB Bacula database, it would be rather
> >> catastrophic.
> >>
> >> Concerning the possibility of normalizing some of the subfields of the
> >> LStat record -- that is a possibility, but someone other than myself
> >> would need to do some careful testing on the size (should shrink the DB)
> >> and the performance of the DB particularly on multi-GB database.

_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
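
To illustrate the kind of ad hoc query that separate size and mtime columns would allow (estimating a restore, finding versions of a file by mtime), here is a minimal sketch. It is not Bacula code. It uses Python with an in-memory SQLite database as a stand-in for the PostgreSQL catalog, and everything except the table name 'file' and the columns lstat, size, ctime and mtime is an assumption made for the example: the jobid and name columns, the sample rows, and both helper functions are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL catalog in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE file (
        fileid INTEGER PRIMARY KEY,
        jobid  INTEGER NOT NULL,   -- assumed: links a file record to a backup job
        name   TEXT NOT NULL,      -- hypothetical: stands in for the real path/filename references
        lstat  TEXT NOT NULL,      -- existing field, left untouched for compatibility
        size   INTEGER DEFAULT 0,  -- proposed column
        ctime  TEXT,               -- proposed column (stored as ISO text in this sketch)
        mtime  TEXT                -- proposed column (stored as ISO text in this sketch)
    )
    """
)

# Made-up sample data: two jobs, with one file backed up in both.
rows = [
    (1, "/etc/hosts", "opaque", 120, "2008-09-01 10:00:00", "2008-09-01 10:00:00"),
    (1, "/etc/passwd", "opaque", 2048, "2008-09-01 10:00:00", "2008-08-20 08:00:00"),
    (2, "/etc/hosts", "opaque", 150, "2008-09-15 09:30:00", "2008-09-15 09:30:00"),
]
conn.executemany(
    "INSERT INTO file (jobid, name, lstat, size, ctime, mtime) VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def estimate_restore_bytes(conn, jobid):
    """Sum the size column for one job; returns 0 if the job has no files."""
    row = conn.execute(
        "SELECT COALESCE(SUM(size), 0) FROM file WHERE jobid = ?", (jobid,)
    ).fetchone()
    return row[0]


def versions_modified_since(conn, name, since):
    """List (jobid, mtime, size) for versions of a file with mtime >= since."""
    cur = conn.execute(
        "SELECT jobid, mtime, size FROM file "
        "WHERE name = ? AND mtime >= ? ORDER BY mtime",
        (name, since),
    )
    return cur.fetchall()


if __name__ == "__main__":
    print(estimate_restore_bytes(conn, 1))
    print(versions_modified_since(conn, "/etc/hosts", "2008-09-10 00:00:00"))
```

Neither query has to decode lstat, which is the point of the proposal: both are plain aggregates and range filters on ordinary columns.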