We have the same problem. We have some servers
with more than 2 million files. These servers
take a long time to back up, and the restores
take a long time too. As we have seen, this
points to the problem being in the inserts/selects
to the MySQL server.
Vicente Hernández
Veloxia Network S.L.
At 23:05 24/05/2006, you wrote:
Peter Eriksson wrote:
Hi List! :-)
I'm (again) trying to improve the performance of our Bacula
installation.
Currently it takes approx 2 full days for a full recover to generate the
(in bacula-dir) incore tree of files to restore for a 360GB partition
which is kind of annoying (especially when you make mistakes like I did
and have to restart the whole operation from the beginning...)
Anyway, I figured I'd check with you how you have your indexes set up.
Below you'll find my current configuration. One thing I'm a bit curious
about is why it says NULL in the Cardinality field for all the indexes
(except for the primary one). I have a feeling this is incorrect, but
since I'm no MySQL expert I'm not sure...
What should *correct* output look like?
mysql> show index from File;
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| File  |          0 | PRIMARY    |            1 | FileId      | A         |    79393114 |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId      |            1 | JobId       | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | PathId     |            1 | PathId      | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | FilenameId |            1 | FilenameId  | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId_2    |            1 | JobId       | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId_2    |            2 | PathId      | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId_2    |            3 | FilenameId  | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| File  |          1 | JobId_3    |            1 | JobId       | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
8 rows in set (0.02 sec)
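[Editor's note, a hedged aside: a NULL Cardinality for a secondary index usually just means MySQL has never collected key-distribution statistics for it, not that the index is broken. Running ANALYZE TABLE on the tables shown above should populate the column; this is an assumption about Peter's setup, but the statement itself is standard MySQL and safe to try:]

```sql
-- Rebuild the key distribution statistics that SHOW INDEX reports
-- as Cardinality. Briefly locks the table while it runs.
ANALYZE TABLE File;
ANALYZE TABLE Job;

-- Cardinality should now show non-NULL estimates for every index.
SHOW INDEX FROM File;
```

Up-to-date statistics also help the optimizer pick the right index for the restore-time selects, which is the suspected bottleneck in this thread.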
mysql> show index from Job;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Job   |          0 | PRIMARY  |            1 | JobId       | A         |        2947 |     NULL | NULL   |      | BTREE      |         |
| Job   |          1 | Name     |            1 | Name        | A         |          26 |      128 | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.04 sec)
Btw, bacula-dir grew to 1.4GB of RAM over the two days it spent building
that incore tree - and the machine has 2GB of RAM. I shudder to think
how long it would have taken if it hadn't fit inside the available
RAM, or if I ever have to recover one of the bigger filesystems...
- Peter
Hi -
I couldn't help but notice you mentioning that it had taken over two
whole days to recover 360G of data. Are you kidding?
The problem isn't the 360G of data, it could be 1000G and the problem
would be the same. It is the 5 million files. The current code is not very
scalable beyond a million or two files when doing an interactive restore.
It simply takes too long to build the in-memory tree.
The reason I ask is that, over days of searching, I've not been able to
find any benchmarks of Bacula performing any kind of task. In fact,
your numbers are the first I've seen on the subject.
There have been a good number of reports on this list. Most boil down to
having the correct indexes defined and the correct tuning of MySQL (or,
for users other than Peter, PostgreSQL).
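[Editor's note, one concrete and hedged example of "correct indexes" taken from Peter's SHOW INDEX output above: the single-column JobId and JobId_3 indexes cover only the leftmost column of the composite JobId_2 (JobId, PathId, FilenameId) index, so MySQL can already use JobId_2 for any query that filters on JobId alone. The duplicates buy nothing on selects and slow down every insert during backups. A cleanup sketch, assuming nothing depends on those index names:]

```sql
-- Drop indexes that are leftmost prefixes of the existing
-- composite index JobId_2 (JobId, PathId, FilenameId).
DROP INDEX JobId   ON File;
DROP INDEX JobId_3 ON File;
```

With a 79-million-row File table, fewer indexes to maintain per insert can make a measurable difference to backup speed.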
So maybe it would benefit the both of us if we were able to dig up (and
hopefully have some others contribute) some benchmarks with their