On Fri, 2006-09-01 at 22:41 +0200, Arno Lehmann wrote:
> Hello,
> 
> On 9/1/2006 7:33 PM, Peter Sjoberg wrote:
> > I wonder if there's anything I can do to improve the performance when
> > inserting the attributes.
> > 
> > I'm backing up several servers, but one specifically has about 86 GB of data
> > and 1.6 million files. The data backup finishes after a few hours (<6), but
> > then it (I assume) updates the database with all the attributes, and that
> > takes >18h!
> > 
> > Sysinfo:
> > The server is a Dual PIII 800MHz with 2G memory
> > The db is around 500 MB
> > The spooled attributes file (I'm spooling attributes and data) is about
> > 480 MB.
> > mysql version: 4.0.18
> > 
> > During the db update the mysqld process is using 100% of one cpu.
> > I did change /etc/my.cnf to the "large" template and made sure it has
> > thread_concurrency=4 but I guess the inserts are single threaded so it
> > still ends up running on one cpu.
> > 
> > I looked a little at the variables etc. and it seems like the pounding
> > comes from SELECT and INSERT statements.
> 
> Apart from what Kern said about the technical side of modifying Bacula 
I would love to, but I lack the time and skill, so I'm about 99% sure I will
NOT do it.
(But I will try to find more in the archives)

> I 
> do think there must be different approaches. For example, I have a 
> backup system that stores 84 GB in about 42000 files with one job.
> 
> (by the way, does anyone know how to count the number of files on a 
> ReiserFS filesystem, short of doing something like "ls -R /home | egrep 
> -v '^(\.+|)$' | wc -l"? It has no inode count for df -i)
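
On counting files without an inode count: one possible approach is to walk
the tree and count entries directly. The following is only a sketch in
Python; the function name count_files is made up for illustration, and it
counts everything os.walk reports as a non-directory (regular files, symlinks
to files, device nodes), which may differ slightly from what "ls -R" shows.
Directories that cannot be read are silently skipped.

```python
import os


def count_files(root):
    """Count non-directory entries below root without following symlinks."""
    total = 0
    # os.walk yields (dirpath, dirnames, filenames) for every directory it visits
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total
```

Calling count_files("/home") would give the number for the filesystem
mentioned above; on a large tree this still has to read every directory, so
it is not fast.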
> 
> This is definitely less than what you do, but this runs on much slower 
> hardware. I do have the Bacula server and the catalog database on 
> different machines, though. The catalog server is an Athlon 500 with 512 
> MB RAM, usually rather loaded (loadavg usually close to 1). Despooling 
> the attributes takes some time, but not that long.
> 
> > Short of getting faster CPU or be more patient, what can I look
> > at/change to get it finished a little faster?
> 
> You could try more MySQL tuning, you could put the database onto its own 
> disk or a different machine.
This is a little more down the line of what I was looking for.

I did some more research and I think I found something on this side. 
One select that it seems to be working hard on is
  SELECT PathId from Path WHERE Path='/some/very/long/path';

I copied the bacula db to a different spot on the same system and did
some testing.
I did "EXPLAIN" on it and it told me that it spent 0.5 seconds scanning
more than 5000 rows to return 1 row. Checking further, I see that the index
on the Path column covers only the first 50 characters of a BLOB.
Besides there being lots of files on this system, the top part of the path
is very often the same, at least for the first 50 characters.
I dropped that index and recreated it with 255 characters (as it is in
the latest table creation scripts) and now it spends 0.0 seconds scanning
one row to return one.
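
To illustrate why the short prefix hurts, here is a toy model in Python (not
MySQL itself): it groups paths by their first N characters, the way a prefix
index would, and reports the largest group, which is roughly how many rows
must be compared for one lookup. The paths and function names are invented
for the example.

```python
from collections import Counter


def prefix_buckets(paths, prefix_len):
    """Group paths by their first prefix_len characters, like a prefix index."""
    return Counter(p[:prefix_len] for p in paths)


def worst_case_rows(paths, prefix_len):
    """Largest number of paths sharing one prefix, i.e. rows to check per lookup."""
    return max(prefix_buckets(paths, prefix_len).values())


# Hypothetical paths whose common top part is longer than 50 characters
base = "/srv/data/projects/customer-archive/2006/reports/monthly/"
paths = [f"{base}dir{i:05d}/" for i in range(6000)]

print(worst_case_rows(paths, 50))   # every path falls into the same bucket
print(worst_case_rows(paths, 255))  # every path gets its own bucket
```

With a 50-character prefix all 6000 example paths look identical to the
index, so each lookup has to compare the full value row by row; with 255
characters each path is distinguished by the index alone.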
When the backups finish sometime this weekend I will recreate the
index on the production db, and then I hope the next full backup will go
a little faster.
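
For reference, the statements for recreating the index might look like what
this small Python helper prints. This is a sketch under assumptions: the
helper name is made up, and the index name "Path" is a guess (it is what
MySQL would pick by default for an unnamed index on that column); check the
real name with SHOW INDEX FROM Path before running anything, and try it on a
copy of the db first.

```python
def rebuild_prefix_index_sql(table="Path", column="Path",
                             index_name="Path", prefix_len=255):
    """Return MySQL statements that drop an index and recreate it with a longer prefix."""
    return [
        # Assumed index name; verify with SHOW INDEX before dropping
        f"ALTER TABLE {table} DROP INDEX {index_name};",
        # column(N) indexes only the first N bytes of the BLOB column
        f"ALTER TABLE {table} ADD INDEX {index_name} ({column}({prefix_len}));",
    ]


for stmt in rebuild_prefix_index_sql():
    print(stmt)
```

Rebuilding the index has to read the whole table, so it is best done while
no backup job is writing to the catalog.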

My guess is that when I created this db the index size was
set to 50, and that it has changed since then (I have upgraded a few times),
but of course the existing db wasn't changed.

> 
> In fact, I do wonder if I ever saw MySQL eating CPU time that much - 
> here, I/O is usually the bottleneck when handling large amounts of data. 
> Even with 512 MB, it's rather typical that more than 300 MB are used for 
> buffers.
> 
> In short - try tweaking MySQL itself, or the tables Bacula uses - 
> indexes might eat lots of time. You could try to run a backup after 
> removing some of the indexes on the File table, for example. The 
> database gurus surely know more :-)
> 
> Arno
> 


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
