Dear Developers,
By default, PostgreSQL data file storage and dumps are non-deterministic,
which yields poor deduplication results.
I ran some tests clustering all (or at least the largest) Bacula Catalog
tables on their primary key (another index could be used instead), and I got a
much better dedup ratio.
Clustering must be performed and configured once for each table (replace
"table" below with the actual table name):
select * from pg_indexes where tablename='table';
CLUSTER table USING table_pkey;
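
As a rough sketch of the one-time setup above, the statements for several
tables could be generated rather than typed by hand. The function name
gen_cluster_sql and the table list are hypothetical, and the sketch assumes
each primary-key index carries PostgreSQL's default name <table>_pkey; check
pg_indexes first if the catalog was created differently.

```shell
#!/bin/sh
# Sketch: print one CLUSTER statement per table name given as an argument.
# Assumption: the primary-key index is named <table>_pkey (PostgreSQL default).
gen_cluster_sql() {
    for t in "$@"; do
        printf 'CLUSTER %s USING %s_pkey;\n' "$t" "$t"
    done
}

# Hypothetical table list; adjust to the largest tables of the actual catalog.
gen_cluster_sql file path jobmedia
```

The printed statements only need to be run once, for example by piping them
into psql -d bacula as the postgres user.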
A before-job script can then re-cluster every table in the database for which
this one-time configuration was done. E.g.:
su - postgres -c "psql -d bacula -c 'cluster verbose'"
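
A slightly more defensive wrapper around the command above might look like the
sketch below. The function name recluster_catalog and the PSQL and DB
variables are hypothetical, introduced only for illustration. The sketch
assumes the script already runs as a user allowed to connect to the catalog
(for instance through the su - postgres call shown above), and that a failed
CLUSTER should not block the backup, which is a design choice rather than
anything Bacula requires.

```shell
#!/bin/sh
# Sketch of a before-job wrapper. PSQL and DB are hypothetical overrides;
# the defaults match the command shown above.
PSQL=${PSQL:-psql}
DB=${DB:-bacula}

recluster_catalog() {
    # CLUSTER without a table name re-clusters every table that was
    # previously clustered in this database.
    if ! "$PSQL" -d "$DB" -c 'CLUSTER VERBOSE'; then
        # Assumption: a failed re-cluster should only warn, not stop the job.
        echo "warning: CLUSTER failed, continuing with the backup" >&2
    fi
    return 0
}

# recluster_catalog   # uncomment when installing this as the before-job script
```

Note that CLUSTER takes an exclusive lock on each table while it rewrites it,
so the script is best scheduled when no other jobs are writing to the catalog.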
Some sources say that clustering may also speed up indexed queries.
My proposal is to incorporate this technique into the Bacula database creation
script and the catalog backup dump script.
Regards,
--
Heitor Medrado de Faria | CEO Bacula do Brasil & USA | EB-1 Visa | LPIC-III |
EMC 05-001 | ITIL-F
• Brazil +55 (61) 98268-4220 | USA +1 (323) 300-5387 | www.bacula.com.br
_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel