On Thu, 2007-09-27 at 09:19 +0200, Arno Lehmann wrote:
> Hi,
> 
> 27.09.2007 01:17, Ross Boylan wrote:
> > I've been having really slow backups (13 hours) when I back up a large
> > mail spool.  I've attached a run report.  There are about 1.4M files
> > with a compressed size of 4G.  I get much better throughput (e.g.,
> > 2,000KB/s vs 86KB/s for this job!) with other jobs.
> 
> 2MB/s is still not especially fast for a backup to disk, I think. So 
> your storage disk might also be a factor here.
> 
> > First, does it sound as if something is wrong?  I suspect the number of
> > files is the key thing, and the mail spool has lots of little files
> > (it's used by Cyrus).  Is this just life when you have lots of little
> > files?
> > 
> > Second, how can I figure out what the problem is?  I do have some
> > suspicions, but first some basics:
> > ------------------------------------------------
> > everything is running on the same box
> > 3GHz P4 with one SATA drive as the main drive and 4 older drives, one of
> > which is the backup target.
> > No noticeable CPU load or disk activity during the backup.  I was
> > compressing, but that doesn't show up noticeably for CPU use.
> 
> How much memory, and how is the memory usage during backups?
2G of RAM.  I'll have to watch it to determine how much is in use.
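To watch it, something like the following rough sketch should do. It assumes the usual Linux /proc/meminfo layout (MemTotal, MemFree, Buffers, Cached); the function names are my own, not part of any tool.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def used_excluding_cache_kb(info):
    """Memory held by programs: total minus free, buffers and page cache."""
    return (info["MemTotal"] - info["MemFree"]
            - info.get("Buffers", 0) - info.get("Cached", 0))


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling used_excluding_cache_kb(read_meminfo()) every few seconds during the job would show whether the box is actually short of memory or just using it as cache.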
> 
> > Debian GNU/Linux 2.6.18 with postgresql 8.1, bacula 2.2.13.
> > Disk is managed by evms, using LVM.
> > The partition being backed up is ext3, and the backup is going to disk (a
> > different physical disk, IDE) using Reiser.
> 
> That's definitely a good thing.
> 
> > I am not using snapshotting because that feature is broken right now
> > (nothing to do with bacula).  I shut down the cyrus server during the
> > backup (despite some errors in the log around my attempted shutdown, it
> > seemed to have worked).
> > 
> > My suspicion is that the TCP/IP transactions are all getting delayed
> > (maybe to batch for sending) in a way that usually isn't noticeable, but
> > is noticeable when doing lots of quick exchanges locally.
> 
> I don't know anything about issues with TCP delays, and I know Bacula 
> installations running smoothly on all sorts of hardware and different 
> OSes.
> 
> I rather suspect the catalog to be the bottleneck.
> 
> Verifying this might be as easy as running vmstat while the job is 
> running and seeing if there is lots of iowait happening - this does 
> not necessarily show as hard disk activity.
Would TCP-induced delays also show up as iowait?
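For measuring it over the whole job rather than eyeballing vmstat, a minimal sketch along these lines might work. It assumes the aggregate "cpu" line of /proc/stat with the fields in the order user, nice, system, idle, iowait, irq, softirq, steal; the helper names are made up for illustration.

```python
def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(v) for v in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Only the first eight fields (user .. steal) are summed; later fields
    # on newer kernels (guest time) are already counted inside 'user'.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_counters(path="/proc/stat"):
    """Sample the live counters (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.read())
```

Taking one read_cpu_counters() sample when the job starts and another when it ends, then passing both to iowait_percent, gives the average over the run.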
> 
> Are your database and the mail spool on the same disk? This might 
> explain the slowness you encounter.
Yes.
> 
> In this case, I'd suggest upgrading to Bacula 2.2.4. For two reasons, 
> actually: There is a serious bug that will hit you one day, and which 
> is fixed in the current version. Second, the new batch inserts feature 
> would gain lots of speed if the database throughput really is the 
> bottleneck for you.
I see 2.2.4 is in Debian unstable, so I should be able to pull it in.
That would be great if it speeds things up.
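For scale, here is the back-of-the-envelope arithmetic on the figures from the run report. It is only a sketch: it assumes "4G" means 4e9 bytes of compressed data and that the job took exactly 13 hours.

```python
def backup_rates(n_files, size_bytes, hours):
    """Derive per-second and per-file rates from the job totals."""
    seconds = hours * 3600
    return {
        "files_per_s": n_files / seconds,
        "kb_per_s": size_bytes / 1000 / seconds,
        "ms_per_file": 1000 * seconds / n_files,
        "avg_file_kb": size_bytes / 1000 / n_files,
    }


# Totals from the run report: about 1.4M files, 4G compressed, 13 hours.
rates = backup_rates(1_400_000, 4e9, 13)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

That comes to roughly 30 files per second, or about 33 ms per file, with an average compressed file size under 3 KB. So whatever the bottleneck is, it looks like a per-file cost rather than a raw data-rate limit.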
> 
> >  Not only are
> > my bacula components using TCP (I think), but I'm communicating with
> > postgres by TCP (I couldn't get authentication working properly with
> > unix domain sockets).
> > 
> > While populating the cyrus server I also encountered very slow
> > transaction speeds.  I think the TCP problem was the cause, though I
> > don't have definite confirmation.  I ran multiple jobs in parallel to
> > populate the cyrus server to get around the slowness of the individual
> > parts (I think that at least rules out things like db contention or disk
> > contention as culprits in that case).
> 
> As I don't know about the TCP delay I can't comment on this...
> 
> > Unfortunately, AFAIK the tcp delay is not tuneable on Linux; it is with
> > BSD.
> > 
> > Here are some relevant parts of bacula-dir.conf:
> > 
> > JobDefs {
> >   Name = "CornDefaults"
> >   Type = Backup
> >   Level = Incremental
> >   Client = corn-fd
> >   Storage = File2Storage
> >   Messages = Standard
> >   Pool = Default
> >   Full Backup Pool = Full
> >   Differential Backup Pool = Differential
> >   Incremental Backup Pool = Incremental
> >   Priority = 10
> >   Write Bootstrap = "/usr/local/var/spool/bacula/%n.bsr"
> > }
> > 
> > ######## Cyrus
> > ## really this needs more care: use snapshot, dump db to ascii
> 
> As far as I know, it's sufficient to dump cyrus' database. Given that 
> dump and a backup of your mail files, a correct cyrus database can be 
> easily regenerated. Snapshots would be a good thing, perhaps, but 
> you'd still have to explicitly dump the database as there is no 
> guarantee that the disk files of the database are always in a 
> consistent state.
cyrus recommends the ascii dump to guard against version changes that
would render the binary unusable.
http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/Backup has more.
You're right: snapshots alone will not assure integrity.
.....
> 
> I'm really unsure about TCP problems, but the situation more or less 
> looks like the catalog backend would be your problem. Could you try to 
> have the catalog db on another machine?
I've only got the one for now.

Thanks.
Ross

