Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-19 Thread Mikael Fridh
On Thu, Nov 18, 2010 at 10:27 PM, Alan Brown a...@mssl.ucl.ac.uk wrote:
 On 13/11/10 04:46, Gary R. Schmidt wrote:
 You mean looks increasingly *unlikely* don't you?  As InnoDB is the
 default in MySQL 5.5...

 Yes it is, but take a look at what Oracle's been doing to the other
 opensource projects it inherited.

 It says a lot when core mysql developers fork a new project.

 It says a lot more when this happens across a number of projects
 including the entire Open Office developer team.

That's quite an exaggeration, although I am definitely not feeling
good about the amount of bad will Oracle managed to inspire in me in
this short amount of time.

Frankly, the only thing going for them in my book right now IS MySQL 5.5.

   I suspect there to be at least one person involved in this discussion
 who has *religion* in relation to database engines...

 Nothing to do with religion - and FWIW, stating that postgresql requires
 a DBA is a clear case of FUD.

 My point of view comes from running both engines on the same hardware
 and observing the loads involved. Personally I'd prefer to be running
 mysql but it was clear postgres ran faster and had lower memory
 footprints for our use than innodb. Others have reported the same thing
 over the years.

The FUD stops here; this is pointless in the case of (where this
discussion started) restore performance on a MySQL back-end. The SQL
queries are not at all written with performance for MySQL in mind. And
frankly, the file selection process shouldn't even pull the entire
file list from the database at once; it should be done kind of like the
.Bvfs API is built, with the hierarchy in mind. Pulling 5 million
files out in one flat list is as stupid as (or rather, in this case, a
simplification of) storing 5 million files in an unhashed
directory structure.
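Just to illustrate the idea (a rough sketch against the standard catalog
schema, not the actual .Bvfs implementation): browsing one directory at a
time only ever needs the rows for a single PathId, e.g.

SELECT Filename.Name, File.FileIndex, File.JobId, File.LStat
FROM File
JOIN Filename ON (Filename.FilenameId = File.FilenameId)
WHERE File.JobId IN (38,39)
  AND File.PathId = 1234      -- hypothetical PathId of the directory being browsed
  AND File.FileIndex > 0;

and that stays cheap no matter how many millions of files the jobs contain.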

As soon as you see subqueries like these run against a MySQL server, it
is obvious the query was not designed with MySQL and/or performance in mind.

Frankly I don't know at this point how to make it better without
restructuring the database and actually avoiding pulling out millions
upon millions of records at once.
Hierarchies are definitely the way to go in this case, as it was a
question of restore selection. Remember, this was about file
selection: you don't actually need the full list of million(s) of files
if you're only going to choose a subset. But you do need the ability to
traverse.

Here's the query which is the foundation of this entire thread, tidied
up a bit. Anyone who's ever dealt with MySQL can see that this is not
going to look good in EXPLAIN:

SELECT
Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5
FROM
(
SELECT
FileId, Job.JobId AS JobId, FileIndex, File.PathId AS
PathId, File.FilenameId AS FilenameId, LStat, MD5
FROM
Job,
File,
(
SELECT
MAX(JobTDate) AS JobTDate, PathId, FilenameId
FROM
(
SELECT JobTDate, PathId, FilenameId
FROM File
JOIN Job
USING (JobId)
WHERE File.JobId IN (38,39)
UNION ALL
SELECT JobTDate, PathId, FilenameId
FROM BaseFiles
JOIN File USING (FileId)
JOIN Job ON (BaseJobId = Job.JobId)
WHERE BaseFiles.JobId IN (38,39)
) AS tmp
GROUP BY
PathId, FilenameId
) AS T1
WHERE
(Job.JobId IN
(
SELECT DISTINCT BaseJobId
FROM BaseFiles
WHERE JobId IN (38,39)
)
OR
Job.JobId IN (38,39)
)
AND T1.JobTDate = Job.JobTDate
AND Job.JobId = File.JobId
AND T1.PathId = File.PathId
AND T1.FilenameId = File.FilenameId
) AS Temp
JOIN Filename ON (Filename.FilenameId = Temp.FilenameId)
JOIN Path ON (Path.PathId = Temp.PathId)
WHERE FileIndex > 0 ORDER BY Temp.JobId, FileIndex ASC;

--
Mikael



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-19 Thread Alan Brown
On Fri, 19 Nov 2010, Mikael Fridh wrote:

 The FUD stops here, this is pointless in the case of (where this
 discussion started) restore performance on a MySQL back-end.

In terms of restore performance, you're right. Better optimised queries
would speed things up, but probably not by much (see below: Bacula-dir is
the major factor on large restores).

 Pulling 5 million files out in one flat list is equally stupid to (or,
 rather in this case, a simplification) storing 5 million files in an
 unhashed directory structure.

I _have_ users who do that. (arrgh!)

 As soon as you see subqueries like these run against a MySQL server it
 is obvious it was not designed for MySQL and/or performance.

In both cases (mysql or postgres) the actual query is relatively fast for
1-2 million file backups, but then bacula-dir itself grinds on the results
for quite some time and it's memory-intensive while doing it.

I've pointed Kern at alternatives to red/black trees which will probably
speed that side up, but optimised queries are always a good idea.

 Frankly I don't know at this point how to make it better without
 restructuring the database and actually avoiding pulling out millions
 of millions of records at once.

If you can make a better mousetrap, Kern (and a lot of other people) will
probably thank you - even if it takes a major revision release to change
the database format.

FWIW, the EXPLAIN is just as ugly on postgres.

AB






Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-18 Thread Alan Brown
On 13/11/10 04:46, Gary R. Schmidt wrote:
 You mean looks increasingly *unlikely* don't you?  As InnoDB is the
 default in MySQL 5.5...

Yes it is, but take a look at what Oracle's been doing to the other 
opensource projects it inherited.

It says a lot when core mysql developers fork a new project.

It says a lot more when this happens across a number of projects 
including the entire Open Office developer team.

  I suspect there to be at least one person involved in this discussion
 who has *religion* in relation to database engines...


Nothing to do with religion - and FWIW, stating that postgresql requires 
a DBA is a clear case of FUD.

My point of view comes from running both engines on the same hardware 
and observing the loads involved. Personally I'd prefer to be running 
mysql but it was clear postgres ran faster and had lower memory 
footprints for our use than innodb. Others have reported the same thing 
over the years.

(Longer term I'm concerned about what Oracle may do with MySQL, as we 
have a number of databases installed on various machines doing 
various things for various groups, and space scientists are difficult to 
deal with at the best of times, let alone if they have to change tools.)

 Frankly, I'd rather there were reliable connectors and queries available
 for Oracle and DB2, rather than this childish prattle over MySQL and
 PostGRES.

It'd be nice, but it's not going to happen unless someone who wants 
them writes them (or pays for them).







Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-13 Thread Gavin McCullagh
Hi,

On Fri, 12 Nov 2010, Bob Hetzel wrote:

 I'm starting to think the issue might be linked to some kernels or linux 
 distros.  I have two bacula servers here.  One system is a year and a half 
 old (12 GB RAM), has with a File table having approx 40 million File 
 records.  That system has had the slowness issue (building the directory 
 tree on restores took about an hour) running first Ubuntu 9.04 or 9.10 and 
 now RedHat 6 beta.  The kernel currently is at 2.6.32-44.1.el6.x86_64.  I 
 haven't tried downgrading, instead I tweaked the source code to use the old 
 3.0.3 query and recompiled--I don't use Base jobs or Accurate backups so 
 that's safe for me.
 
 The other system is 4 yrs or so old, with less memory (8GB), slower cpus, 
 slower hard drives, etc., and in fairness only 35 million File records. 
 This one builds the directory tree in approx 10 seconds, but is running 
 Centos 5.5.  The kernel currently is at 2.6.18-194.11.3.el5.

That's an interesting thought.  It would be worth making an exact
comparison, something like:

 - run a restore with the slow query log on and capture the query text (see the sketch just after this list)
 - run the query manually in mysql
 - dump the mysql database
 - restore the mysql database on the older server
 - run the query there
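
For the slow query log part, something along these lines should do on MySQL
5.1 (a rough sketch; adjust the threshold and log destination to taste):

 SET GLOBAL slow_query_log = 1;
 SET GLOBAL long_query_time = 10;  -- seconds; the restore query takes far longer than this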

It sounds like your database is quite large so this might be too
awkward in practice?  Strictly speaking the freshly sequentially written
database might have a slight unfair advantage, but if the results are
radically different then that would be useful to know.

Gavin




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-13 Thread Dan Langille
On 11/12/2010 11:46 PM, Gary R. Schmidt wrote:

 Frankly, I'd rather there were reliable connectors and queries available
 for Oracle and DB2

My usual conclusion when something does not exist is that nobody [with 
the ability to create it] wants it.

  rather than this childish prattle over MySQL and PostGRES.

Anyone can stop it any time.

-- 
Dan Langille - http://langille.org/



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-12 Thread Mikael Fridh
On Thu, Nov 11, 2010 at 3:47 PM, Gavin McCullagh gavin.mccull...@gcd.ie wrote:
 On Mon, 08 Nov 2010, Gavin McCullagh wrote:

 We seem to have the correct indexes on the file table.  I've run optimize 
 table
 and it still takes 14 minutes to build the tree on one of our bigger clients.
 We have 51 million entries in the file table.

 I thought I should give some more concrete information:

 I don't suppose this is news to anyone but here's the mysql slow query log to
 correspond:

 # Time: 10 14:24:49
 # u...@host: bacula[bacula] @ localhost []
 # Query_time: 1139.657646  Lock_time: 0.000471 Rows_sent: 4263403  
 Rows_examined: 50351037
 SET timestamp=1289485489;
 SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM 
 ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, 
 File.FilenameId AS FilenameId, LStat, MD5 FROM Job, File, ( SELECT 
 MAX(JobTDate) AS JobTDate, PathId, FilenameId FROM ( SELECT JobTDate, PathId, 
 FilenameId FROM File JOIN Job USING (JobId) WHERE File.JobId IN 
 (9944,9950,9973,9996) UNION ALL SELECT JobTDate, PathId, FilenameId FROM 
 BaseFiles JOIN File USING (FileId) JOIN Job  ON    (BaseJobId = Job.JobId) 
 WHERE BaseFiles.JobId IN (9944,9950,9973,9996) ) AS tmp GROUP BY PathId, 
 FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM 
 BaseFiles WHERE JobId IN (9944,9950,9973,9996)) OR Job.JobId IN 
 (9944,9950,9973,9996)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = 
 File.JobId AND T1.PathId = File.PathId AND T1.FilenameId = File.FilenameId ) 
 AS Temp JOIN Filename ON (Filename.FilenameId = Temp.FilenameId) JOIN Path ON 
 (Path.PathId = Temp.PathId) WHERE FileIndex > 0 ORDER BY Temp.JobId, 
 FileIndex ASC;

Could you please do an EXPLAIN on this query?
I know it's going to look awful but I'm curious anyway.
Subqueries like these and SELECT DISTINCT are usually a recipe for
disastrous query times in MySQL.

 I've spent some time with the mysqltuner.pl script but to no avail thus far.
 There's 6GB RAM so it suggests a key buffer size of 4GB which I've set at
 4.1GB.

Tuning's not going to make any of those 50 million traversed rows
disappear. Only a differently optimized query plan will.

 This is an Ubuntu Linux server running MySQL v5.1.41.  The mysql data is on an
 MD software RAID 1 array on 7200rpm SATA disks.  The tables are MyISAM (which 
 I
 had understood to be quicker than innodb in low concurrency situations?).  The
 tuner script is suggesting I should disable innodb as we're not using it which
 I will do though I wouldn't guess that will make a massive difference.

No, it will not help.

--
Mikael



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-12 Thread Gavin McCullagh
Hi,

On Fri, 12 Nov 2010, Mikael Fridh wrote:

 On Thu, Nov 11, 2010 at 3:47 PM, Gavin McCullagh gavin.mccull...@gcd.ie 
 wrote:

  # Time: 10 14:24:49
  # u...@host: bacula[bacula] @ localhost []
  # Query_time: 1139.657646  Lock_time: 0.000471 Rows_sent: 4263403  
  Rows_examined: 50351037
  SET timestamp=1289485489;
  SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 
  FROM ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, 
  File.FilenameId AS FilenameId, LStat, MD5 FROM Job, File, ( SELECT 
  MAX(JobTDate) AS JobTDate, PathId, FilenameId FROM ( SELECT JobTDate, 
  PathId, FilenameId FROM File JOIN Job USING (JobId) WHERE File.JobId IN 
  (9944,9950,9973,9996) UNION ALL SELECT JobTDate, PathId, FilenameId FROM 
  BaseFiles JOIN File USING (FileId) JOIN Job  ON    (BaseJobId = Job.JobId) 
  WHERE BaseFiles.JobId IN (9944,9950,9973,9996) ) AS tmp GROUP BY PathId, 
  FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM 
  BaseFiles WHERE JobId IN (9944,9950,9973,9996)) OR Job.JobId IN 
  (9944,9950,9973,9996)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = 
  File.JobId AND T1.PathId = File.PathId AND T1.FilenameId = File.FilenameId 
  ) AS Temp JOIN Filename ON (Filename.FilenameId = Temp.FilenameId) JOIN 
  Path ON (Path.PathId = Temp.PathId) WHERE FileIndex > 0 ORDER BY 
  Temp.JobId, FileIndex 
 ASC;
 
 Could you please do an EXPLAIN on this query?

I prefixed the query by the word EXPLAIN and ran it:

mysql> source bacularestorequery.sql
+------+--------------------+------------+--------+-------------------------------------+------------+---------+-------------------------+---------+---------------------------------+
| id   | select_type        | table      | type   | possible_keys                       | key        | key_len | ref                     | rows    | Extra                           |
+------+--------------------+------------+--------+-------------------------------------+------------+---------+-------------------------+---------+---------------------------------+
|    1 | PRIMARY            | <derived2> | ALL    | NULL                                | NULL       | NULL    | NULL                    | 4277605 | Using where; Using filesort     |
|    1 | PRIMARY            | Filename   | eq_ref | PRIMARY                             | PRIMARY    | 4       | Temp.FilenameId         |       1 |                                 |
|    1 | PRIMARY            | Path       | eq_ref | PRIMARY                             | PRIMARY    | 4       | Temp.PathId             |       1 |                                 |
|    2 | DERIVED            | <derived3> | ALL    | NULL                                | NULL       | NULL    | NULL                    | 4277605 |                                 |
|    2 | DERIVED            | File       | ref    | PathId,FilenameId,JobId,jobid_index | FilenameId | 8       | T1.FilenameId,T1.PathId |       4 | Using where                     |
|    2 | DERIVED            | Job        | eq_ref | PRIMARY                             | PRIMARY    | 4       | bacula.File.JobId       |       1 | Using where                     |
|    6 | DEPENDENT SUBQUERY | NULL       | NULL   | NULL                                | NULL       | NULL    | NULL                    |    NULL | no matching row in const table  |
|    3 | DERIVED            | <derived4> | ALL    | NULL                                | NULL       | NULL    | NULL                    | 4302683 | Using temporary; Using filesort |
|    4 | DERIVED            | Job        | range  | PRIMARY                             | PRIMARY    | 4       | NULL                    |       4 | Using where                     |
|    4 | DERIVED            | File       | ref    | JobId,jobid_index                   | JobId      | 4       | bacula.Job.JobId        |   41816 | Using index                     |
|    5 | UNION              | NULL       | NULL   | NULL                                | NULL       | NULL    | NULL                    |    NULL | no matching row in const table  |
| NULL | UNION RESULT       | <union4,5> | ALL    | NULL                                | NULL       | NULL    | NULL                    |    NULL |                                 |
+------+--------------------+------------+--------+-------------------------------------+------------+---------+-------------------------+---------+---------------------------------+
12 rows in set (16 min 15.79 sec)

I presume that's what you're looking for?

 Tuning's not going to make any of those 50 million traversed rows
 disappear. Only a differently optimized query plan will.

Well, if the above helps and/or if you'd like me to run an alternative proposed
query I'm happy to.  I must confess it would take me quite a few hours to
actually understand that query.

Gavin




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-12 Thread Alan Brown
Mikael Fridh wrote:

 Tuning's not going to make any of those 50 million traversed rows
 disappear. Only a differently optimized query plan will.

This applies across both mysql and postgresql...

 This is an Ubuntu Linux server running MySQL v5.1.41.  The mysql data is on 
 an
 MD software RAID 1 array on 7200rpm SATA disks.  The tables are MyISAM 
 (which I
 had understood to be quicker than innodb in low concurrency situations?).  
 The
 tuner script is suggesting I should disable innodb as we're not using it 
 which
 I will do though I wouldn't guess that will make a massive difference.
 
 No, it will not help.

Disabling innodb won't help right now, but switching to innodb would be a 
good idea in the near future as MyISAM runs into problems around the 50 
million entry mark (assuming Oracle don't remove innodb from future 
versions of MySQL, as looks increasingly likely...)
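
The switch itself is just one (long-running) statement per table, roughly
along these lines and assuming enough free disk space for the table copy:

ALTER TABLE File ENGINE=InnoDB;   -- repeat for the other large catalog tables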







Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-12 Thread Henrik Johansen
'Alan Brown' wrote:
Mikael Fridh wrote:

 Tuning's not going to make any of those 50 million traversed rows
 disappear. Only a differently optimized query plan will.

This applies across both mysql and postgresql...

 This is an Ubuntu Linux server running MySQL v5.1.41.  The mysql data is on 
 an
 MD software RAID 1 array on 7200rpm SATA disks.  The tables are MyISAM 
 (which I
 had understood to be quicker than innodb in low concurrency situations?).  
 The
 tuner script is suggesting I should disable innodb as we're not using it 
 which
 I will do though I wouldn't guess that will make a massive difference.

 No, it will not help.

Disabling innodb won't help right now, but switching to innodb would be a
good idea in the near future as MyISAM runs into problems around the 50
million entry mark (assuming Oracle don't remove innodb from future
versions of MySQL, as looks increasingly likely...)

InnoDB is the default storage engine for MySQL 5.5





-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet 



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-12 Thread Bob Hetzel


 From: Gavin McCullagh gavin.mccull...@gcd.ie
 Subject: Re: [Bacula-users] Tuning for large (millions of files)
   backups?
 To: bacula-users@lists.sourceforge.net
 Message-ID: 2010144733.gz20...@gcd.ie
 Content-Type: text/plain; charset=us-ascii

 On Mon, 08 Nov 2010, Gavin McCullagh wrote:

  We seem to have the correct indexes on the file table.  I've run optimize 
  table
  and it still takes 14 minutes to build the tree on one of our bigger 
  clients.
  We have 51 million entries in the file table.
 I thought I should give some more concrete information:

 I don't suppose this is news to anyone but here's the mysql slow query log to
 correspond:

 # Time: 10 14:24:49
 # u...@host: bacula[bacula] @ localhost []
 # Query_time: 1139.657646  Lock_time: 0.000471 Rows_sent: 4263403  
 Rows_examined: 50351037
 SET timestamp=1289485489;
 SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM 
 ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, 
 File.FilenameId AS FilenameId, LStat, MD5 FROM Job, File, ( SELECT 
 MAX(JobTDate) AS JobTDate, PathId, FilenameId FROM ( SELECT JobTDate, PathId, 
 FilenameId FROM File JOIN Job USING (JobId) WHERE File.JobId IN 
 (9944,9950,9973,9996) UNION ALL SELECT JobTDate, PathId, FilenameId FROM 
 BaseFiles JOIN File USING (FileId) JOIN Job  ON(BaseJobId = Job.JobId) 
 WHERE BaseFiles.JobId IN (9944,9950,9973,9996) ) AS tmp GROUP BY PathId, 
 FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM 
 BaseFiles WHERE JobId IN (9944,9950,9973,9996)) OR Job.JobId IN 
 (9944,9950,9973,9996)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = 
 File.JobId AND T1.PathId = File.PathId AND T1.FilenameId = File.FilenameId ) 
 AS Temp JOIN Filename ON (Filename.FilenameId = Temp.FilenameId) JOIN Path ON 
 (Path.PathId = Temp.PathId) WHERE FileIndex > 0 ORDER BY Temp.JobId, FileIndex ASC;


 I've spent some time with the mysqltuner.pl script but to no avail thus far.
 There's 6GB RAM so it suggests a key buffer size of 4GB which I've set at
 4.1GB.

 This is an Ubuntu Linux server running MySQL v5.1.41.  The mysql data is on an
 MD software RAID 1 array on 7200rpm SATA disks.  The tables are MyISAM (which 
 I
 had understood to be quicker than innodb in low concurrency situations?).  The
 tuner script is suggesting I should disable innodb as we're not using it which
 I will do though I wouldn't guess that will make a massive difference.

 There are no fragmented tables currently.

 Gavin


I'm starting to think the issue might be linked to some kernels or linux 
distros.  I have two bacula servers here.  One system is a year and a half 
old (12 GB RAM), has a File table with approx 40 million File 
records.  That system has had the slowness issue (building the directory 
tree on restores took about an hour) running first Ubuntu 9.04 or 9.10 and 
now RedHat 6 beta.  The kernel currently is at 2.6.32-44.1.el6.x86_64.  I 
haven't tried downgrading, instead I tweaked the source code to use the old 
3.0.3 query and recompiled--I don't use Base jobs or Accurate backups so 
that's safe for me.

The other system is 4 yrs or so old, with less memory (8GB), slower cpus, 
slower hard drives, etc., and in fairness only 35 million File records. 
This one builds the directory tree in approx 10 seconds, but is running 
Centos 5.5.  The kernel currently is at 2.6.18-194.11.3.el5.

I'm still convinced that this one slow MySQL query could be changed to 
allow MySQL to better optimize it.  I started with the same my.cnf file 
settings and then tried tweaking them because the newer computer has more 
RAM, but that didn't help.

Is anybody up to the task of rewriting that query?



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 Thread Alan Brown
Gavin McCullagh wrote:
 On Tue, 09 Nov 2010, Alan Brown wrote:
 
 and it still takes 14 minutes to build the tree on one of our bigger 
 clients.
 We have 51 million entries in the file table.

 Add individual indexes for Fileid,  Jobid  and Pathid

 Postgres will work with the combined index for individual table queries,
 but mysql won't.
 
 The following are the indexes on the file table:
 
 mysql> SHOW INDEXES FROM File;
 +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
 | Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
 +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
 | File  |          0 | PRIMARY      |            1 | FileId      | A         |    55861148 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | PathId       |            1 | PathId      | A         |      735015 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | FilenameId   |            1 | FilenameId  | A         |     2539143 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | FilenameId   |            2 | PathId      | A         |    13965287 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | JobId        |            1 | JobId       | A         |        1324 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | JobId        |            2 | PathId      | A         |     2940060 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | JobId        |            3 | FilenameId  | A         |    55861148 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | jobid_index  |            1 | JobId       | A         |        1324 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | pathid_index |            1 | PathId      | A         |      735015 |     NULL | NULL   |      | BTREE      |         |
 +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
 
 I added the last two per your instructions.  Building the tree took about 14
 minutes without these indexes and takes about 17-18 minutes having added
 them.  

What tuning (if any) have you performed on your my.cnf and how much 
memory do you have?

 Have I done something wrong?  As FileId is a primary key, it doesn't seem
 like I should need an extra index on that one -- is that wrong?

It doesn't need an extra index.

You've also got a duplicate pathid index which can be deleted.
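Either of the two single-column PathId indexes will do; dropping the one you
just added is probably simplest, e.g.:

DROP INDEX pathid_index ON File;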

This kind of thing is why it makes more sense to switch to postgres when 
  mysql databases get large.






Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 Thread Henrik Johansen
'Alan Brown' wrote:
Gavin McCullagh wrote:
 On Tue, 09 Nov 2010, Alan Brown wrote:

 and it still takes 14 minutes to build the tree on one of our bigger 
 clients.
 We have 51 million entries in the file table.

 Add individual indexes for Fileid,  Jobid  and Pathid

 Postgres will work with the combined index for individual table queries,
 but mysql won't.

 The following are the indexes on the file table:

 mysql> SHOW INDEXES FROM File;
 +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
 | Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
 +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
 | File  |          0 | PRIMARY      |            1 | FileId      | A         |    55861148 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | PathId       |            1 | PathId      | A         |      735015 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | FilenameId   |            1 | FilenameId  | A         |     2539143 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | FilenameId   |            2 | PathId      | A         |    13965287 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | JobId        |            1 | JobId       | A         |        1324 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | JobId        |            2 | PathId      | A         |     2940060 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | JobId        |            3 | FilenameId  | A         |    55861148 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | jobid_index  |            1 | JobId       | A         |        1324 |     NULL | NULL   |      | BTREE      |         |
 | File  |          1 | pathid_index |            1 | PathId      | A         |      735015 |     NULL | NULL   |      | BTREE      |         |
 +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

 I added the last two per your instructions.  Building the tree took about 14
 minutes without these indexes and takes about 17-18 minutes having added
 them.

What tuning (if any) have you performed on your my.cnf and how much
memory do you have?

 Have I done something wrong?  As FileId is a primary key, it doesn't seem
 like I should need an extra index on that one -- is that wrong?

It doesn't need an extra index.

 You've also got a duplicate pathid index which can be deleted.

This kind of thing is why it makes more sense to switch to postgres when
  mysql databases get large.

I have had about as much of this as I can take now so please, stop spreading
FUD about MySQL.

When it comes to Bacula there is only one valid concern - Postgres has
certain statement constructs which allow certain queries to be performed
faster - that's about it.

I am not buying the postulation that postgres is largely self-tuning,
especially not when dealing with large datasets. 

If you prefer postgres, that's totally fine but please stop telling
people that MySQL is unusable for large DB deployments because this
simply is untrue.




-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet 



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 Thread Gavin McCullagh
Hi,

On Thu, 11 Nov 2010, Alan Brown wrote:

 What tuning (if any) have you performed on your my.cnf and how much
 memory do you have?

Thus far I haven't spent much time on this and haven't tuned MySQL.  The
slow build is an annoyance, but not a killer, so I've not really got around
to it.  The server has 6GB RAM (running an x86_64 kernel).

 Have I done something wrong?  As FileId is a primary key, it doesn't seem
 like I should need an extra index on that one -- is that wrong?
 
 It doesn't need an extra index.

Grand.

 You've also got a duplicate pathid index which can be deleted.

Ah, I didn't spot that, thanks.

 This kind of thing is why it makes more sense to switch to postgres
 when  mysql databases get large.

I see.  Well, as long as I'm not missing some simple tweak to make MySQL
run quicker I guess I'll plan to do that.

Gavin

-- 
Gavin McCullagh
Senior System Administrator
IT Services
Griffith College 
South Circular Road
Dublin 8
Ireland
Tel: +353 1 4163365
http://www.gcd.ie
http://www.gcd.ie/brochure.pdf
http://www.gcd.ie/opendays

This E-mail is from Griffith College.
The E-mail and any files transmitted with it are confidential and may be
privileged and are intended solely for the use of the individual or entity
to whom they are addressed. If you are not the addressee you are prohibited
from disclosing its content, copying it or distributing it otherwise than to
the addressee. If you have received this e-mail in error, please immediately
notify the sender by replying to this e-mail and delete the e-mail from your
computer.

Bellerophon Ltd, trades as Griffith College (registered in Ireland No.
60469) with its registered address as Griffith College Campus, South
Circular Road, Dublin 8, Ireland.




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 Thread Alan Brown
Henrik Johansen wrote:

 I have had about as much of this as I can take now so please, stop spreading
 FUD about MySQL.

Have you used Mysql with datasets in excess of 100-200 million objects?

I have. Our current database holds about 400 million File table entries.

MySQL requires significant tuning and kernel tweakery, plus uses a lot 
more memory than postgres does for the same dataset.

For Bacula users, it's a lot _easier_ to use Postgres on a large 
installation than it is to use MySQL.

I held off switching to Postgres for a long time because I was 
unfamiliar with it, however having done so I'm glad that I did - it's 
required virtually zero tweaking since it was set up and runs 
approximately twice as fast as MySQL did, with a ram footprint about 
half the size of MySQL's.

Small datasets are fine with MySQL and will probably work better. Ours 
was brilliant up to about 50 million entries and then required tuning.

This discussion is about appropriate tools for the job.

If you wish to usefully contribute to the thread then provide some 
assistance to the OP regarding tuning his MySQL for optimum performance.







Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 Thread Henrik Johansen
'Alan Brown' wrote:
Henrik Johansen wrote:

 I have had about as much of this as I can take now so please, stop spreading
 FUD about MySQL.

Have you used Mysql with datasets in excess of 100-200 million objects?

Sure - our current Bacula deployment consists of 3 catalog servers with
the smallest DB having ~380 million rows. We have other MySQL DB's in
production that are considerably larger and so do Facebook, Twitter,
Flickr, YouTube, Wikipedia and so on ...

I have. Our current database holds about 400 million File table entries.

MySQL requires significant tuning and kernel tweakery, plus uses a lot
more memory than postgres does for the same dataset.

Almost all large MySQL servers we have run Solaris - absolutely no kernel
tweaking required.

For Bacula users, it's a lot _easier_ to use Postgres on a large
installation than it is to use MySQL.

Large installations usually have DBAs? Personally I find it a *lot*
easier to apply a few configuration tweaks to a product that I have 8+
years of production experience with than throwing in the towel and
starting from scratch with an entirely different product ...

I held off switching to Postgres for a long time because I was
unfamiliar with it, however having done so I'm glad that I did - it's
required virtually zero tweaking since it was set up and runs
approximately twice as fast as MySQL did, with a ram footprint about
half the size of MySQL's.

MySQL, or more specifically InnoDB, needs a bit of love before performing
well, I'll admit to that. The upcoming MySQL 5.5 will change much of
this however. 

Small datasets are fine with MySQL and will probably work better. Ours
was brilliant up to about 50 million entries and then required tuning.

This discussion is about appropriate tools for the job.

Yes - and I still consider MySQL to be a highly appropriate tool for the
job. Perhaps the MySQL force is particularly strong in me, who knows.

If you wish to usefully contribute to the thread then provide some
assistance to the OP regarding tuning his MySQL for optimum performance.

Re-read the thread - I believe that I already have done so.





-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet 



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 Thread Gavin McCullagh
On Mon, 08 Nov 2010, Gavin McCullagh wrote:

 We seem to have the correct indexes on the file table.  I've run optimize 
 table
 and it still takes 14 minutes to build the tree on one of our bigger clients.
 We have 51 million entries in the file table.

I thought I should give some more concrete information:

I don't suppose this is news to anyone but here's the mysql slow query log to
correspond:

# Time: 10 14:24:49
# u...@host: bacula[bacula] @ localhost []
# Query_time: 1139.657646  Lock_time: 0.000471 Rows_sent: 4263403  
Rows_examined: 50351037
SET timestamp=1289485489;
SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( 
SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, 
File.FilenameId AS FilenameId, LStat, MD5 FROM Job, File, ( SELECT 
MAX(JobTDate) AS JobTDate, PathId, FilenameId FROM ( SELECT JobTDate, PathId, 
FilenameId FROM File JOIN Job USING (JobId) WHERE File.JobId IN 
(9944,9950,9973,9996) UNION ALL SELECT JobTDate, PathId, FilenameId FROM 
BaseFiles JOIN File USING (FileId) JOIN Job  ON(BaseJobId = Job.JobId) 
WHERE BaseFiles.JobId IN (9944,9950,9973,9996) ) AS tmp GROUP BY PathId, 
FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM 
BaseFiles WHERE JobId IN (9944,9950,9973,9996)) OR Job.JobId IN 
(9944,9950,9973,9996)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = 
File.JobId AND T1.PathId = File.PathId AND T1.FilenameId = File.FilenameId ) AS 
Temp JOIN Filename ON (Filename.FilenameId = Temp.FilenameId) JOIN Path ON 
(Path.PathId = Temp.PathId) WHERE FileIndex > 0 ORDER BY Temp.JobId, FileIndex 
ASC;


I've spent some time with the mysqltuner.pl script but to no avail thus far.
There's 6GB RAM so it suggests a key buffer size of 4GB which I've set at
4.1GB.
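
For what it's worth, the value can be checked and changed at runtime before
persisting it in my.cnf, along these lines:

SHOW VARIABLES LIKE 'key_buffer_size';
SET GLOBAL key_buffer_size = 4294967296;  -- roughly 4GB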

This is an Ubuntu Linux server running MySQL v5.1.41.  The mysql data is on an
MD software RAID 1 array on 7200rpm SATA disks.  The tables are MyISAM (which I
had understood to be quicker than innodb in low concurrency situations?).  The
tuner script is suggesting I should disable innodb as we're not using it which
I will do though I wouldn't guess that will make a massive difference.

There are no fragmented tables currently.

Gavin




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-09 Thread Alan Brown
On 08/11/10 22:21, Gavin McCullagh wrote:

 Right you are

 http://wiki.bacula.org/doku.php?id=faq#restore_takes_a_long_time_to_retrieve_sql_results_from_mysql_catalog

 There is still an element of "move to postgresql" though


With good reason. I did resist moving to pgsql for quite a while but it 
does work better.


  We seem to have the correct indexes on the file table. I've run 
optimize table

 and it still takes 14 minutes to build the tree on one of our bigger clients.
 We have 51 million entries in the file table.


Add individual indexes for FileId, JobId and PathId.

Postgres will work with the combined index for individual table queries, but 
mysql won't.
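
In MySQL terms that's something along these lines (FileId is already covered
by the primary key, so it shouldn't need a separate index):

CREATE INDEX jobid_index ON File (JobId);
CREATE INDEX pathid_index ON File (PathId);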

AB







Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-09 Thread Gavin McCullagh
On Tue, 09 Nov 2010, Alan Brown wrote:

 and it still takes 14 minutes to build the tree on one of our bigger clients.
 We have 51 million entries in the file table.
 
 
 Add individual indexes for Fileid,  Jobid  and Pathid
 
 Postgres will work with the combined index for individual table queries,
 but mysql won't.

The following are the indexes on the file table:

mysql> SHOW INDEXES FROM File;
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| File  |          0 | PRIMARY      |            1 | FileId      | A         |    55861148 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | PathId       |            1 | PathId      | A         |      735015 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | FilenameId   |            1 | FilenameId  | A         |     2539143 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | FilenameId   |            2 | PathId      | A         |    13965287 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId        |            1 | JobId       | A         |        1324 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId        |            2 | PathId      | A         |     2940060 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId        |            3 | FilenameId  | A         |    55861148 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | jobid_index  |            1 | JobId       | A         |        1324 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | pathid_index |            1 | PathId      | A         |      735015 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

I added the last two per your instructions.  Building the tree took about 14
minutes without these indexes and takes about 17-18 minutes having added
them.  

Have I done something wrong?  As FileId is a primary key, it doesn't seem
like I should need an extra index on that one -- is that wrong?

Thanks
Gavin





Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-08 Thread Alan Brown
Ondrej PLANKA (Ignum profile) wrote:

 We have several 10+ million file jobs - all run without problem (backup
 and restore).
 
 I am aware of the fact that a lot of Bacula users run PG  ( Bacula
 Systems also does recommend PG for larger setups ) but nevertheless
 MySQL has served us very well so far.

Mysql works well - if tuned, but tuning is a major undertaking when 
things get large/busy and may take several iterations.

The main advantage I've found to Postgres is that when you have a very 
large database (100+ million entries) it has a far smaller memory/CPU 
footprint (half the memory and about 1/4 the CPU load).

It's also largely self-tuning - which lets me get on with the business 
of running backups instead of monitoring database performance...








Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-08 Thread Gavin McCullagh
On Mon, 08 Nov 2010, Alan Brown wrote:

 Mysql works well - if tuned, but tuning is a major undertaking when 
 things get large/busy and may take several iterations.

Some time back there was an issue with Bacula (v5?) which seemed to come
down to a particular query associated (I think) with restores taking a very
long time with large datasets on MySQL, but taking a reasonable time on
Postgresql.

From what I read of the conversation, there didn't seem to be any tuning
solution on MySQL.  The answer from several people was "switch to
Postgres".  We're using MySQL for Bacula right now but for this reason I've
had it in mind to move.  

Is this still the case, or is there a solution now to those MySQL issues?

When we do restores, building the tree takes a considerable time now.  I
haven't had a lot of time to look at it, but suspected it might be down to
this issue.

Gavin




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-08 Thread Alan Brown
Gavin McCullagh wrote:
 On Mon, 08 Nov 2010, Alan Brown wrote:
 
 Mysql works well - if tuned, but tuning is a major undertaking when 
 things get large/busy and may take several iterations.

 When we do restores, building the tree takes a considerable time now.  I
 haven't had a lot of time to look at it, but suspected it might be down to
 this issue.

That's a classic symptom of not having the right indexes on the File table.

This is in the FAQ somewhere and it's the issue you mentioned as 
affecting mysql but not postgresql (Their index handling is slightly 
different).
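
A quick way to compare what's actually defined against what the FAQ
recommends:

SHOW INDEXES FROM File;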

The other big contributor to long tree builds is insufficient ram. Keep 
an eye on swap and the bacula-dir process whilst building the tree. If 
it starts thrashing, you need more memory.






Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-08 Thread Gavin McCullagh
Hi Alan,

On Mon, 08 Nov 2010, Alan Brown wrote:

 When we do restores, building the tree takes a considerable time now.  I
 haven't had a lot of time to look at it, but suspected it might be down to
 this issue.
 
 That's a classic symptom of not having the right indexes on the File table.
 
 This is in the FAQ somewhere and it's the issue you mentioned as
 affecting mysql but not postgresql (Their index handling is slightly
 different).

Right you are

http://wiki.bacula.org/doku.php?id=faq#restore_takes_a_long_time_to_retrieve_sql_results_from_mysql_catalog

There is still an element of "move to postgresql" though:

  Moving from MySQL to PostgreSQL should make it work much better due to
   different (more optimized) queries and different SQL engine. 
   
http://bugs.bacula.org/view.php?id=1472

We seem to have the correct indexes on the file table.  I've run optimize table
and it still takes 14 minutes to build the tree on one of our bigger clients.
We have 51 million entries in the file table.

 The other big contributor to long tree builds is insufficient ram.
 Keep an eye on swap and the bacula-dir process whilst building the
 tree. If it starts thrashing, you need more memory.

I think we should be okay on that score but it's something to watch out
for.

Thanks,
Gavin




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-01 Thread Henrik Johansen
'Ondrej PLANKA (Ignum profile)' wrote:
Thanks :)
Which type of MySQL storage engine are you using? MyISAM or InnoDB for
large Bacula system?
Can you please copy/paste your MySQL configuration? I mean my.cnf file

Please re-read this thread and you should find what you are looking for.

Thanks, Ondrej.


Henrik Johansen wrote:
 'Ondrej PLANKA (Ignum profile)' wrote:

 Hello Henrik,

 what are you using? MySQL?


 Yes - all our catalog servers run MySQL.

 I forgot to mention this in my last post - we are Bacula Systems
 customers and they have proved to be very supportive and competent.

 If you are thinking about doing large scale backups with Bacula I can
 only encourage you to get a support subscription - it is worth every
 penny.



 Thanks, Ondrej.

 'Mingus Dew' wrote:

 Henrik,
 Have you had any problems with slow queries during backup or restore
 jobs? I'm thinking about http://bugs.bacula.org/view.php?id=1472
 specifically, and considering that the bacula.File table already has 73
 million rows in it and I haven't even successfully run the big job
 yet.

 Not really.

 We have several 10+ million file jobs - all run without problem (backup
 and restore).

 I am aware of the fact that a lot of Bacula users run PG  ( Bacula
 Systems also does recommend PG for larger setups ) but nevertheless
 MySQL has served us very well so far.


 Just curious as a fellow Solaris deployer...

 Thanks,
 Shon

 On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen
 hen...@scannet.dk wrote:
 'Mingus Dew' wrote:
 All,
 I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible version of MySQL 5, but migrating to PostgreSQL isn't an
 option at this time.

 I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of
 files is incredibly large.
 The first of the jobs had 27 million files and initially failed because
 the batch table became Full. I changed myisam_data_pointer_size
 to a value of 6 in the config.

 This job was then able to run successfully and did not take too long.

 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've
 not been able to successfully run the job, as it seems to hang for
 over 30 hours in a Dir inserting attributes status. This causes
 other jobs to backup in the queue and once canceled I have to restart
 Bacula.

 I'm looking for way to boost performance of MySQL or Bacula (or both)
 to get this job completed.

 You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
 way in hell that MySQL 4 + MyISAM is going to perform decent in your
 situation.
 Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
 always available from http://www.mysql.com in the native pkg format so 
 there really
 is no excuse.

 We run our Bacula Catalog MySQL servers on Solaris (OpenSolaris) so
 perhaps I can give you some pointers.

 Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

 Since you are using Solaris 10 I assume that you are going to run MySQL
 off ZFS - in that case you need to adjust the ZFS recordsize for the
 filesystem that is going to hold your InnoDB datafiles to match the
 InnoDB block size.

 If you are using ZFS you should also consider getting yourself a fast
 SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
 writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
 terms of write / transaction speed.

 If you have enough CPU power to spare you should try turning on
 compression for the ZFS filesystem holding the datafiles - it also can
 accelerate DB writes / reads but YMMV.

 Lastly, our InnoDB related configuration from my.cnf :

 # InnoDB options
 skip-innodb_doublewrite
 innodb_data_home_dir = /tank/db/
 innodb_log_group_home_dir = /tank/logs/
 innodb_support_xa = false
 innodb_file_per_table = true
 innodb_buffer_pool_size = 20G
 innodb_flush_log_at_trx_commit = 2
 innodb_log_buffer_size = 128M
 innodb_log_file_size = 512M
 innodb_log_files_in_group = 2
 innodb_max_dirty_pages_pct = 90



 Thanks,
 Shon



 --
 Med 

Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-01 Thread Thomas Mueller
On Mon, 01 Nov 2010 06:15:18 +0100, Ondrej PLANKA (Ignum profile) wrote:

 Thanks :)
 Which type of MySQL storage engine are you using? MyISAM or InnoDB for
 large Bacula system?
 Can you please copy/paste your MySQL configuration? I mean my.cnf file
 
 Thanks, Ondrej.

I would use InnoDB. 

A good starting point for optimizing MySQL is http://mysqltuner.pl

- Thomas




Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-31 Thread Ondrej PLANKA (Ignum profile)
Hello Henrik,

what are you using? MySQL?

Thanks, Ondrej.

'Mingus Dew' wrote:
Henrik,
Have you had any problems with slow queries during backup or restore
jobs? I'm thinking about http://bugs.bacula.org/view.php?id=1472
specifically, and considering that the bacula.File table already has 73
million rows in it and I haven't even successfully run the big job
yet.

Not really.

We have several 10+ million file jobs - all run without problem (backup
and restore).

I am aware of the fact that a lot of Bacula users run PG  ( Bacula
Systems also does recommend PG for larger setups ) but nevertheless
MySQL has served us very well so far.


Just curious as a fellow Solaris deployer...

Thanks,
Shon

On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen
hen...@scannet.dk wrote:
'Mingus Dew' wrote:
All,
I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
MySQL 4.1.22 for the database server. I do plan on upgrading to a
compatible version of MySQL 5, but migrating to PostgreSQL isn't an
option at this time.

I am trying to backup to tape a very large number of files for a
client. While the data size is manageable at around 2TB, the number of
files is incredibly large.
The first of the jobs had 27 million files and initially failed because
the batch table became Full. I changed the myisam_data_pointer size
to a value of 6 in the config.

This job was then able to run successfully and did not take too long.

I have another job which has 42 million files. I'm not sure what that
equates to in rows that need to be inserted, but I can say that I've
not been able to successfully run the job, as it seems to hang for
over 30 hours in a Dir inserting attributes status. This causes
other jobs to backup in the queue and once canceled I have to restart
Bacula.

I'm looking for way to boost performance of MySQL or Bacula (or both)
to get this job completed.

You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
way in hell that MySQL 4 + MyISAM is going to perform decent in your
situation.
Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
always available from http://www.mysql.com in the native pkg format so there 
really
is no excuse.

We run our Bacula Catalog MySQl servers on Solaris (OpenSolaris) so
perhaps I can give you some pointers.

Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

Since you are using Solaris 10 I assume that you are going to run MySQL
off ZFS - in that case you need to adjust the ZFS recordsize for the
filesystem that is going to hold your InnoDB datafiles to match the
InnoDB block size.

If you are using ZFS you should also consider getting yourself a fast
SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
terms of write / transaction speed.

If you have enough CPU power to spare you should try turning on
compression for the ZFS filesystem holding the datafiles - it also can
accelerate DB writes / reads but YMMV.

Lastly, our InnoDB related configuration from my.cnf :

# InnoDB options
skip-innodb_doublewrite
innodb_data_home_dir = /tank/db/
innodb_log_group_home_dir = /tank/logs/
innodb_support_xa = false
innodb_file_per_table = true
innodb_buffer_pool_size = 20G
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 128M
innodb_log_file_size = 512M
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 90



Thanks,
Shon



--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-31 Thread Henrik Johansen
'Ondrej PLANKA (Ignum profile)' wrote:
Hello Henrik,

what are you using? MySQL?

Yes - all our catalog servers run MySQL.

I forgot to mention this in my last post - we are Bacula Systems
customers and they have proved to be very supportive and competent.

If you are thinking about doing large scale backups with Bacula I can
only encourage you to get a support subscription - it is worth every
penny.


Thanks, Ondrej.

'Mingus Dew' wrote:
Henrik,
Have you had any problems with slow queries during backup or restore
jobs? I'm thinking about http://bugs.bacula.org/view.php?id=1472
specifically, and considering that the bacula.File table already has 73
million rows in it and I haven't even successfully ran the big job
yet.

Not really.

We have several 10+ million file jobs - all run without problem (backup
and restore).

I am aware of the fact that a lot of Bacula users run PG  ( Bacula
Systems also does recommend PG for larger setups ) but nevertheless
MySQL has served us very well so far.


Just curious as a fellow Solaris deployer...

Thanks,
Shon

On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen
hen...@scannet.dk wrote:
'Mingus Dew' wrote:
All,
I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
MySQL 4.1.22 for the database server. I do plan on upgrading to a
compatible version of MySQL 5, but migrating to PostgreSQL isn't an
option at this time.

I am trying to backup to tape a very large number of files for a
client. While the data size is manageable at around 2TB, the number of
files is incredibly large.
The first of the jobs had 27 million files and initially failed because
the batch table became Full. I changed the myisam_data_pointer size
to a value of 6 in the config.

This job was then able to run successfully and did not take too long.

I have another job which has 42 million files. I'm not sure what that
equates to in rows that need to be inserted, but I can say that I've
not been able to successfully run the job, as it seems to hang for
over 30 hours in a Dir inserting attributes status. This causes
other jobs to backup in the queue and once canceled I have to restart
Bacula.

I'm looking for way to boost performance of MySQL or Bacula (or both)
to get this job completed.

You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
way in hell that MySQL 4 + MyISAM is going to perform decent in your
situation.
Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
always available from http://www.mysql.com in the native pkg format so there 
really
is no excuse.

We run our Bacula Catalog MySQl servers on Solaris (OpenSolaris) so
perhaps I can give you some pointers.

Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

Since you are using Solaris 10 I assume that you are going to run MySQL
off ZFS - in that case you need to adjust the ZFS recordsize for the
filesystem that is going to hold your InnoDB datafiles to match the
InnoDB block size.

If you are using ZFS you should also consider getting yourself a fast
SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
terms of write / transaction speed.

If you have enough CPU power to spare you should try turning on
compression for the ZFS filesystem holding the datafiles - it also can
accelerate DB writes / reads but YMMV.

Lastly, our InnoDB related configuration from my.cnf :

# InnoDB options
skip-innodb_doublewrite
innodb_data_home_dir = /tank/db/
innodb_log_group_home_dir = /tank/logs/
innodb_support_xa = false
innodb_file_per_table = true
innodb_buffer_pool_size = 20G
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 128M
innodb_log_file_size = 512M
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 90



Thanks,
Shon



--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-31 Thread Ondrej PLANKA (Ignum profile)
Thanks :)
Which type of MySQL storage engine are you using? MyISAM or InnoDB for 
large Bacula system?
Can you please copy/paste your MySQL configuration? I mean my.cnf file

Thanks, Ondrej.


Henrik Johansen wrote:
 'Ondrej PLANKA (Ignum profile)' wrote:
   
 Hello Henrik,

 what are you using? MySQL?
 

 Yes - all our catalog servers run MySQL.

 I forgot to mention this in my last post - we are Bacula Systems
 customers and they have proved to be very supportive and competent.

 If you are thinking about doing large scale backups with Bacula I can
 only encourage you to get a support subscription - it is worth every
 penny.


   
 Thanks, Ondrej.

 'Mingus Dew' wrote:
 
 Henrik,
 Have you had any problems with slow queries during backup or restore
 jobs? I'm thinking about http://bugs.bacula.org/view.php?id=1472
 specifically, and considering that the bacula.File table already has 73
 million rows in it and I haven't even successfully ran the big job
 yet.
   
 Not really.

 We have several 10+ million file jobs - all run without problem (backup
 and restore).

 I am aware of the fact that a lot of Bacula users run PG  ( Bacula
 Systems also does recommend PG for larger setups ) but nevertheless
 MySQL has served us very well so far.

 
 Just curious as a fellow Solaris deployer...

 Thanks,
 Shon

 On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen
 hen...@scannet.dk wrote:
 'Mingus Dew' wrote:
 All,
 I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible version of MySQL 5, but migrating to PostgreSQL isn't an
 option at this time.

 I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of
 files is incredibly large.
 The first of the jobs had 27 million files and initially failed because
 the batch table became Full. I changed the myisam_data_pointer size
 to a value of 6 in the config.

 This job was then able to run successfully and did not take too long.

 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've
 not been able to successfully run the job, as it seems to hang for
 over 30 hours in a Dir inserting attributes status. This causes
 other jobs to backup in the queue and once canceled I have to restart
 Bacula.

 I'm looking for way to boost performance of MySQL or Bacula (or both)
 to get this job completed.

 You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
 way in hell that MySQL 4 + MyISAM is going to perform decent in your
 situation.
 Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
 always available from http://www.mysql.com in the native pkg format so 
 there really
 is no excuse.

 We run our Bacula Catalog MySQl servers on Solaris (OpenSolaris) so
 perhaps I can give you some pointers.

 Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

 Since you are using Solaris 10 I assume that you are going to run MySQL
 off ZFS - in that case you need to adjust the ZFS recordsize for the
 filesystem that is going to hold your InnoDB datafiles to match the
 InnoDB block size.

 If you are using ZFS you should also consider getting yourself a fast
 SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
 writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
 terms of write / transaction speed.

 If you have enough CPU power to spare you should try turning on
 compression for the ZFS filesystem holding the datafiles - it also can
 accelerate DB writes / reads but YMMV.

 Lastly, our InnoDB related configuration from my.cnf :

 # InnoDB options
 skip-innodb_doublewrite
 innodb_data_home_dir = /tank/db/
 innodb_log_group_home_dir = /tank/logs/
 innodb_support_xa = false
 innodb_file_per_table = true
 innodb_buffer_pool_size = 20G
 innodb_flush_log_at_trx_commit = 2
 innodb_log_buffer_size = 128M
 innodb_log_file_size = 512M
 innodb_log_files_in_group = 2
 innodb_max_dirty_pages_pct = 90



 Thanks,
 Shon



 --
 Med venlig hilsen / Best Regards

 Henrik Johansen
 

Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-14 Thread Mingus Dew
Henrik,
 Have you had any problems with slow queries during backup or restore
jobs? I'm thinking about
http://bugs.bacula.org/view.php?id=1472 specifically, and considering
that the bacula.File table already has 73
million rows in it and I haven't even successfully run the big job yet.

Just curious as a fellow Solaris deployer...

Thanks,
Shon

On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen hen...@scannet.dk wrote:

 'Mingus Dew' wrote:

 All,
 I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible version of MySQL 5, but migrating to PostgreSQL isn't an
 option at this time.

 I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of
 files is incredibly large.
 The first of the jobs had 27 million files and initially failed because
 the batch table became Full. I changed the myisam_data_pointer size
 to a value of 6 in the config.

 This job was then able to run successfully and did not take too long.

 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've
 not been able to successfully run the job, as it seems to hang for
 over 30 hours in a Dir inserting attributes status. This causes
 other jobs to backup in the queue and once canceled I have to restart
 Bacula.

 I'm looking for way to boost performance of MySQL or Bacula (or both)
 to get this job completed.


 You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
 way in hell that MySQL 4 + MyISAM is going to perform decent in your
 situation.
 Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
 always available from www.mysql.com in the native pkg format so there
 really
 is no excuse.

 We run our Bacula Catalog MySQl servers on Solaris (OpenSolaris) so
 perhaps I can give you some pointers.

 Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

 Since you are using Solaris 10 I assume that you are going to run MySQL
 off ZFS - in that case you need to adjust the ZFS recordsize for the
 filesystem that is going to hold your InnoDB datafiles to match the
 InnoDB block size.

 If you are using ZFS you should also consider getting yourself a fast
 SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
 writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
 terms of write / transaction speed.

 If you have enough CPU power to spare you should try turning on
 compression for the ZFS filesystem holding the datafiles - it also can
 accelerate DB writes / reads but YMMV.

 Lastly, our InnoDB related configuration from my.cnf :

 # InnoDB options
 skip-innodb_doublewrite
 innodb_data_home_dir = /tank/db/
 innodb_log_group_home_dir = /tank/logs/
 innodb_support_xa = false
 innodb_file_per_table = true
 innodb_buffer_pool_size = 20G
 innodb_flush_log_at_trx_commit = 2
 innodb_log_buffer_size = 128M
 innodb_log_file_size = 512M
 innodb_log_files_in_group = 2
 innodb_max_dirty_pages_pct = 90



 Thanks,
 Shon





 --
 Med venlig hilsen / Best Regards

 Henrik Johansen
 hen...@scannet.dk
 Tlf. 75 53 35 00

 ScanNet Group
 A/S ScanNet


Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-14 Thread Henrik Johansen
'Mingus Dew' wrote:
Henrik,
Have you had any problems with slow queries during backup or restore
jobs? I'm thinking about http://bugs.bacula.org/view.php?id=1472
specifically, and considering that the bacula.File table already has 73
million rows in it and I haven't even successfully ran the big job
yet.

Not really.

We have several 10+ million file jobs - all run without problem (backup
and restore).

I am aware of the fact that a lot of Bacula users run PG  ( Bacula
Systems also does recommend PG for larger setups ) but nevertheless
MySQL has served us very well so far.


Just curious as a fellow Solaris deployer...

Thanks,
Shon

On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen
hen...@scannet.dk wrote:
'Mingus Dew' wrote:
All,
I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
MySQL 4.1.22 for the database server. I do plan on upgrading to a
compatible version of MySQL 5, but migrating to PostgreSQL isn't an
option at this time.

I am trying to backup to tape a very large number of files for a
client. While the data size is manageable at around 2TB, the number of
files is incredibly large.
The first of the jobs had 27 million files and initially failed because
the batch table became Full. I changed the myisam_data_pointer size
to a value of 6 in the config.

This job was then able to run successfully and did not take too long.

I have another job which has 42 million files. I'm not sure what that
equates to in rows that need to be inserted, but I can say that I've
not been able to successfully run the job, as it seems to hang for
over 30 hours in a Dir inserting attributes status. This causes
other jobs to backup in the queue and once canceled I have to restart
Bacula.

I'm looking for way to boost performance of MySQL or Bacula (or both)
to get this job completed.

You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
way in hell that MySQL 4 + MyISAM is going to perform decent in your
situation.
Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
always available from www.mysql.com in the native pkg format so there really
is no excuse.

We run our Bacula Catalog MySQl servers on Solaris (OpenSolaris) so
perhaps I can give you some pointers.

Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

Since you are using Solaris 10 I assume that you are going to run MySQL
off ZFS - in that case you need to adjust the ZFS recordsize for the
filesystem that is going to hold your InnoDB datafiles to match the
InnoDB block size.

If you are using ZFS you should also consider getting yourself a fast
SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
terms of write / transaction speed.

If you have enough CPU power to spare you should try turning on
compression for the ZFS filesystem holding the datafiles - it also can
accelerate DB writes / reads but YMMV.

Lastly, our InnoDB related configuration from my.cnf :

# InnoDB options
skip-innodb_doublewrite
innodb_data_home_dir = /tank/db/
innodb_log_group_home_dir = /tank/logs/
innodb_support_xa = false
innodb_file_per_table = true
innodb_buffer_pool_size = 20G
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 128M
innodb_log_file_size = 512M
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 90



Thanks,
Shon



--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet




-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet 


Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-13 Thread Alan Brown
Alan Brown wrote:

 You are going to hit a big pain point with myisam with that many files 
 anyway (it breaks around 4 billion entries without tuning), but even 
 inno will grow large/slow and need a lot of my.cnf tuning

That should be 4 GB (the default MyISAM table size limit) - about 50 million entries.
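
If you are stuck on MyISAM for a while, that ceiling can be checked and raised per
table - a rough sketch, assuming the usual bacula database and File table names
(the ALTER rewrites the table, so it can take a long time on a big catalog):

  -- Max_data_length in the output is the current MyISAM size ceiling
  SHOW TABLE STATUS FROM bacula LIKE 'File';
  -- a larger MAX_ROWS makes MyISAM use a wider data pointer for this table
  ALTER TABLE bacula.File MAX_ROWS = 1000000000 AVG_ROW_LENGTH = 100;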






Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-12 Thread Alan Brown
Bruno Friedmann wrote:

 Rude answer :
 
 If you really want to use Mysql drop the myisam to innodb.
 But you don't want to use mysql for that job, just use Postgresql fine tuned 
 with batch insert enabled.

Seconded - having been through this issue.

You are going to hit a big pain point with myisam with that many files 
anyway (it breaks around 4 billion entries without tuning), but even 
inno will grow large/slow and need a lot of my.cnf tuning

Go straight to Postgres - you'll need it eventually anyway, then read up 
on tuning it. For large databases it runs faster and uses less memory.








Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-12 Thread Rory Campbell-Lange
On 12/10/10, Alan Brown (a...@mssl.ucl.ac.uk) wrote:
 Bruno Friedmann wrote:

  But you don't want to use mysql for that job, just use Postgresql
  fine tuned with batch insert enabled.
 
 Seconded - having been through this issue.

I am running Postgresql with batch insert with jobs of around 8 million
files, and it works without any problems.

Postgresql is tremendous at providing a smooth upgrade path too. We
migrated a lot of services with few problems from each major release
starting in the low 7.x release series.

Regards
Rory

-- 
Rory Campbell-Lange
r...@campbell-lange.net

Campbell-Lange Workshop
www.campbell-lange.net
0207 6311 555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-12 Thread Mingus Dew
Henrik,
 I really appreciate your reply, particularly as a fellow
Bacula-on-Solaris user. I do not have my databases on ZFS, only my Bacula
storage. I'll probably have to tune for local disk.

Thanks very much,
Shon

On Fri, Oct 8, 2010 at 3:30 PM, Henrik Johansen hen...@scannet.dk wrote:

 'Mingus Dew' wrote:

 All,
 I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible version of MySQL 5, but migrating to PostgreSQL isn't an
 option at this time.

 I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of
 files is incredibly large.
 The first of the jobs had 27 million files and initially failed because
 the batch table became Full. I changed the myisam_data_pointer size
 to a value of 6 in the config.

 This job was then able to run successfully and did not take too long.

 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've
 not been able to successfully run the job, as it seems to hang for
 over 30 hours in a Dir inserting attributes status. This causes
 other jobs to backup in the queue and once canceled I have to restart
 Bacula.

 I'm looking for way to boost performance of MySQL or Bacula (or both)
 to get this job completed.


 You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
 way in hell that MySQL 4 + MyISAM is going to perform decent in your
 situation.
 Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
 always available from www.mysql.com in the native pkg format so there
 really
 is no excuse.

 We run our Bacula Catalog MySQl servers on Solaris (OpenSolaris) so
 perhaps I can give you some pointers.

 Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

 Since you are using Solaris 10 I assume that you are going to run MySQL
 off ZFS - in that case you need to adjust the ZFS recordsize for the
 filesystem that is going to hold your InnoDB datafiles to match the
 InnoDB block size.

 If you are using ZFS you should also consider getting yourself a fast
 SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
 writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
 terms of write / transaction speed.

 If you have enough CPU power to spare you should try turning on
 compression for the ZFS filesystem holding the datafiles - it also can
 accelerate DB writes / reads but YMMV.

 Lastly, our InnoDB related configuration from my.cnf :

 # InnoDB options
 skip-innodb_doublewrite
 innodb_data_home_dir = /tank/db/
 innodb_log_group_home_dir = /tank/logs/
 innodb_support_xa = false
 innodb_file_per_table = true
 innodb_buffer_pool_size = 20G
 innodb_flush_log_at_trx_commit = 2
 innodb_log_buffer_size = 128M
 innodb_log_file_size = 512M
 innodb_log_files_in_group = 2
 innodb_max_dirty_pages_pct = 90



 Thanks,
 Shon





 --
 Med venlig hilsen / Best Regards

 Henrik Johansen
 hen...@scannet.dk
 Tlf. 75 53 35 00

 ScanNet Group
 A/S ScanNet


Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Bruno Friedmann
On 10/07/2010 11:03 PM, Mingus Dew wrote:
 All,
  I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a compatible
 version of MySQL 5, but migrating to PostgreSQL isn't an option at this
 time.
 
  I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of files
 is incredibly large.
 The first of the jobs had 27 million files and initially failed because the
 batch table became Full. I changed the myisam_data_pointer size to a value
 of 6 in the config.
 This job was then able to run successfully and did not take too long.
 
 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've not
 been
 able to successfully run the job, as it seems to hang for over 30 hours in a
 Dir inserting attributes status. This causes other jobs to backup in the
 queue and
 once canceled I have to restart Bacula.
 
 I'm looking for way to boost performance of MySQL or Bacula (or both) to
 get this job completed.
 
 Thanks,
 Shon
 
Rude answer :

If you really want to use Mysql drop the myisam to innodb.
But you don't want to use mysql for that job, just use Postgresql fine tuned 
with batch insert enabled.

:-)

-- 

Bruno Friedmann (irc:tigerfoot)
Ioda-Net Sàrl www.ioda-net.ch
 openSUSE Member
User www.ioda.net/r/osu
Blog www.ioda.net/r/blog
  fsfe fellowship www.fsfe.org
GPG KEY : D5C9B751C4653227
vcard : http://it.ioda-net.ch/ioda-net.vcf



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Mingus Dew
Bruno,
 Not so rude at all :) You've made me think of 2 questions

How difficult is it (or procedure for) converting to InnoDB and what exactly
will this gain in performance increase?

Also, you mention Postgresql and batch inserts. Does Bacula not use batch
inserts with MySQL by default?
I'm assuming I'm using batch inserts because Bacula uses a table called
'batch'

-Shon

On Fri, Oct 8, 2010 at 2:07 AM, Bruno Friedmann br...@ioda-net.ch wrote:

 On 10/07/2010 11:03 PM, Mingus Dew wrote:
  All,
   I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
  MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible
  version of MySQL 5, but migrating to PostgreSQL isn't an option at this
  time.
 
   I am trying to backup to tape a very large number of files for a
  client. While the data size is manageable at around 2TB, the number of
 files
  is incredibly large.
  The first of the jobs had 27 million files and initially failed because
 the
  batch table became Full. I changed the myisam_data_pointer size to a
 value
  of 6 in the config.
  This job was then able to run successfully and did not take too long.
 
  I have another job which has 42 million files. I'm not sure what that
  equates to in rows that need to be inserted, but I can say that I've not
  been
  able to successfully run the job, as it seems to hang for over 30 hours
 in a
  Dir inserting attributes status. This causes other jobs to backup in
 the
  queue and
  once canceled I have to restart Bacula.
 
  I'm looking for way to boost performance of MySQL or Bacula (or both)
 to
  get this job completed.
 
  Thanks,
  Shon
 
 Rude answer :

 If you really want to use Mysql drop the myisam to innodb.
 But you don't want to use mysql for that job, just use Postgresql fine
 tuned with batch insert enabled.

 :-)

 --

 Bruno Friedmann (irc:tigerfoot)
 Ioda-Net Sàrl www.ioda-net.ch
  openSUSE Member
User www.ioda.net/r/osu
Blog www.ioda.net/r/blog
  fsfe fellowship www.fsfe.org
 GPG KEY : D5C9B751C4653227
 vcard : http://it.ioda-net.ch/ioda-net.vcf





Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Bruno Friedmann
Whether batch insert is enabled by default with MySQL can go either way - it depends
on several factors: whether MySQL is pthread-safe or not, and the configure options
chosen at build time.

MySQL 4 is obsolete now with 5.0.3 (I think there are some good reasons for
that).

Converting tables to InnoDB is quite simple, but it depends on how many indexes
the tables have and how they are defined.
InnoDB doesn't like varchar indexes > 254 characters.
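
As a rough sketch of the conversion itself (assuming the standard bacula catalog
schema; every ALTER rewrites the whole table, so allow time and free disk space):

  -- list catalog tables that are still MyISAM (MySQL 5.0 and later)
  SELECT TABLE_NAME, ENGINE
  FROM information_schema.TABLES
  WHERE TABLE_SCHEMA = 'bacula' AND ENGINE = 'MyISAM';

  -- convert the big ones; File is by far the largest in a Bacula catalog
  ALTER TABLE bacula.File ENGINE = InnoDB;
  ALTER TABLE bacula.Filename ENGINE = InnoDB;
  ALTER TABLE bacula.Path ENGINE = InnoDB;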

So if you don't have any blockers, you will have to upgrade your mysql to 
something more recent, so perhaps moving to postgresql
is a perfect alternative, just in time.
Regarding performance, even with batch insert, I remember a graph that Eric B
showed while prototyping the database last year at
pgdays.eu (perhaps the slides are still there).

Eric? Any suggestions if you read this.

Anyway, I'm not using Solaris, so it would be nice to have some advice from
experienced people on that platform.


On 10/08/2010 02:36 PM, Mingus Dew wrote:
 Bruno,
  Not so rude at all :) You've made me think of 2 questions
 
 How difficult is it (or procedure for) converting to InnoDB and what exactly
 will this gain in performance increase?
 
 Also, you mention Postgresql and batch inserts. Does Bacula not use batch
 inserts with MySQL by default?
 I'm assuming I'm using batch inserts because Bacula uses a table called
 'batch'
 
 -Shon
 
 On Fri, Oct 8, 2010 at 2:07 AM, Bruno Friedmann br...@ioda-net.ch wrote:
 
 On 10/07/2010 11:03 PM, Mingus Dew wrote:
 All,
  I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible
 version of MySQL 5, but migrating to PostgreSQL isn't an option at this
 time.

  I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of
 files
 is incredibly large.
 The first of the jobs had 27 million files and initially failed because
 the
 batch table became Full. I changed the myisam_data_pointer size to a
 value
 of 6 in the config.
 This job was then able to run successfully and did not take too long.

 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've not
 been
 able to successfully run the job, as it seems to hang for over 30 hours
 in a
 Dir inserting attributes status. This causes other jobs to backup in
 the
 queue and
 once canceled I have to restart Bacula.

 I'm looking for way to boost performance of MySQL or Bacula (or both)
 to
 get this job completed.

 Thanks,
 Shon

 Rude answer :

 If you really want to use Mysql drop the myisam to innodb.
 But you don't want to use mysql for that job, just use Postgresql fine
 tuned with batch insert enabled.

 :-)

 --

 Bruno Friedmann (irc:tigerfoot)
 Ioda-Net Sàrl www.ioda-net.ch
  openSUSE Member
User www.ioda.net/r/osu
Blog www.ioda.net/r/blog
  fsfe fellowship www.fsfe.org
 GPG KEY : D5C9B751C4653227
 vcard : http://it.ioda-net.ch/ioda-net.vcf



 


-- 

Bruno Friedmann (irc:tigerfoot)
Ioda-Net Sàrl www.ioda-net.ch
 openSUSE Member
User www.ioda.net/r/osu
Blog www.ioda.net/r/blog
  fsfe fellowship www.fsfe.org
GPG KEY : D5C9B751C4653227
vcard : http://it.ioda-net.ch/ioda-net.vcf



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Henrik Johansen
'Mingus Dew' wrote:
 All,
 I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
 MySQL 4.1.22 for the database server. I do plan on upgrading to a
 compatible version of MySQL 5, but migrating to PostgreSQL isn't an
 option at this time.

 I am trying to backup to tape a very large number of files for a
 client. While the data size is manageable at around 2TB, the number of
 files is incredibly large.
The first of the jobs had 27 million files and initially failed because
 the batch table became Full. I changed the myisam_data_pointer size
 to a value of 6 in the config.

This job was then able to run successfully and did not take too long.

 I have another job which has 42 million files. I'm not sure what that
 equates to in rows that need to be inserted, but I can say that I've
 not been able to successfully run the job, as it seems to hang for
 over 30 hours in a Dir inserting attributes status. This causes
 other jobs to backup in the queue and once canceled I have to restart
 Bacula.

 I'm looking for way to boost performance of MySQL or Bacula (or both)
 to get this job completed.

You *really* need to upgrade to MySQL 5 and change to InnoDB - there is no
way in hell that MySQL 4 + MyISAM is going to perform decent in your
situation. 

Solaris 10 is a Tier 1 platform for MySQL so the latest versions are
always available from www.mysql.com in the native pkg format so there really
is no excuse.

We run our Bacula Catalog MySQL servers on Solaris (OpenSolaris) so
perhaps I can give you some pointers.

Our smallest Bacula DB is currently ~70 GB (381,230,610 rows).

Since you are using Solaris 10 I assume that you are going to run MySQL
off ZFS - in that case you need to adjust the ZFS recordsize for the
filesystem that is going to hold your InnoDB datafiles to match the
InnoDB block size.

If you are using ZFS you should also consider getting yourself a fast
SSD as a SLOG (or to disable the ZIL entirely if you dare) - all InnoDB
writes to datafiles are O_SYNC and benefit *greatly* from an SSD in
terms of write / transaction speed.

If you have enough CPU power to spare you should try turning on
compression for the ZFS filesystem holding the datafiles - it also can
accelerate DB writes / reads but YMMV.

Lastly, our InnoDB related configuration from my.cnf :

# InnoDB options
# (skip-innodb_doublewrite, innodb_support_xa = false and
#  innodb_flush_log_at_trx_commit = 2 all trade a little crash safety for speed)
skip-innodb_doublewrite              # doublewrite is redundant on ZFS (copy-on-write prevents torn pages)
innodb_data_home_dir = /tank/db/
innodb_log_group_home_dir = /tank/logs/
innodb_support_xa = false            # no XA / two-phase commit needed here
innodb_file_per_table = true         # one tablespace file per table
innodb_buffer_pool_size = 20G        # main InnoDB data and index cache
innodb_flush_log_at_trx_commit = 2   # write the log at commit, fsync about once per second
innodb_log_buffer_size = 128M
innodb_log_file_size = 512M
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 90



Thanks,
Shon



-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet 



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Phil Stracchino
On 10/08/10 15:30, Henrik Johansen wrote:
 Since you are using Solaris 10 I assume that you are going to run MySQL
 off ZFS - in that case you need to adjust the ZFS recordsize for the
 filesystem that is going to hold your InnoDB datafiles to match the
 InnoDB block size.

Henrik,
This is an interesting observation.  How does one determine/set the
InnoDB block size?


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Tim Gustafson
 This is an interesting observation.  How does one
 determine/set the InnoDB block size?

Sorry for butting in here, but I've been following this thread.

You can't change the InnoDB block size unless you recompile from source, from 
what I understand...but that's besides the point.

Using InnoDB adds quite a bit of overhead to most database operations; 
shouldn't Bacula be using MyISAM tables, which are much faster?  My thinking is 
that there is not a lot of concurrency with database reads and writes, and 
probably not much need for referential integrity...or am I missing something?

Tim Gustafson
Baskin School of Engineering
UC Santa Cruz
t...@soe.ucsc.edu
831-459-5354



Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Attila Fülöp
Phil Stracchino wrote:
 On 10/08/10 15:30, Henrik Johansen wrote:
 Since you are using Solaris 10 I assume that you are going to run MySQL
 off ZFS - in that case you need to adjust the ZFS recordsize for the
 filesystem that is going to hold your InnoDB datafiles to match the
 InnoDB block size.
 
 Henrik,
 This is an interesting observation.  How does one determine/set the
 InnoDB block size?

Phil,

please see

http://dev.mysql.com/tech-resources/articles/mysql-zfs.html#Set_the_ZFS_Recordsize_to_match_the_block_size

16K is the ZFS recordsize I'm using.
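
For completeness, the page size can also be read from the server itself, so the
recordsize can be matched without guessing - a minimal check (InnoDB's page size is
a compile-time constant, 16K by default, in the versions discussed here). The
dataset holding the datafiles can then be set to match, e.g. zfs set recordsize=16K
on that filesystem, adjusting pool and dataset names to your layout:

  -- InnoDB reports its page size as a status counter (MySQL 5.0 and later)
  SHOW GLOBAL STATUS LIKE 'Innodb_page_size';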

Attila






Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-10-08 Thread Phil Stracchino
On 10/08/10 17:49, Attila Fülöp wrote:
 please see
 
 http://dev.mysql.com/tech-resources/articles/mysql-zfs.html#Set_the_ZFS_Recordsize_to_match_the_block_size
 
 16K is the zfs recodesize I'm using.

Aha!  Thanks, Attila.  Exactly what I needed.


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



[Bacula-users] Tuning for large (millions of files) backups?

2010-10-07 Thread Mingus Dew
All,
 I am running Bacula 5.0.1 on Solaris 10 x86. I'm currently running
MySQL 4.1.22 for the database server. I do plan on upgrading to a compatible
version of MySQL 5, but migrating to PostgreSQL isn't an option at this
time.

 I am trying to backup to tape a very large number of files for a
client. While the data size is manageable at around 2TB, the number of files
is incredibly large.
The first of the jobs had 27 million files and initially failed because the
batch table became Full. I changed the myisam_data_pointer_size to a value
of 6 in the config.
This job was then able to run successfully and did not take too long.
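
For reference, that change can be made permanent in my.cnf or applied to a running
server - a small sketch, assuming it is the MyISAM batch table that hits the limit
(the value is the data pointer width in bytes, so 6 raises the ceiling far beyond
the default 4 GB):

  # my.cnf, [mysqld] section - affects newly created MyISAM tables such as batch
  myisam_data_pointer_size = 6

  -- or, without a restart, on MySQL 4.1.2 and later:
  SET GLOBAL myisam_data_pointer_size = 6;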

I have another job which has 42 million files. I'm not sure what that
equates to in rows that need to be inserted, but I can say that I've not
been
able to successfully run the job, as it seems to hang for over 30 hours in a
Dir inserting attributes status. This causes other jobs to backup in the
queue and
once canceled I have to restart Bacula.
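
When a job sits in "Dir inserting attributes" for hours it is worth looking at the
problem from the MySQL side as well - a quick, generic way to tell a stalled insert
from a merely slow one (nothing Bacula-specific assumed):

  -- which statement the Director's connection is stuck on, its state, and for how long
  SHOW FULL PROCESSLIST;
  -- row-write activity; if this counter keeps climbing, the insert is progressing, just slowly
  SHOW GLOBAL STATUS LIKE 'Handler_write';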

I'm looking for a way to boost performance of MySQL or Bacula (or both) to
get this job completed.

Thanks,
Shon