Hello Phil,
Well, Bacula does not send millions of rows at a time. It already batches
the data and submits at most 500,000 records in one go. It has never been
necessary to change that number, because any respectable database, even a
cluster in my opinion, should be able to handle a batch of 500,000 records
(roughly 50-100 MB of data) at a time. I suspect the Galera people are very
sadly mistaken in suggesting an optimum of only 1,000 rows at a time for
the size of the datasets we see some Bacula customers using, and I will be
*extremely* surprised if this problem shows up in Oracle MySQL.
That said, you are welcome to experiment by changing the maximum number of
rows (changes); it is in <bacula>/src/cats/sql_create.c at line 870. If
this turns out to come up frequently, we could certainly make that maximum
a directive in the Catalog resource, or perhaps put it somewhere else.
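For anyone who wants to try that experiment, here is a minimal standalone
sketch of the pattern in question. It is not the actual Bacula code, and
MAX_BATCH_CHANGES and flush_batch() are hypothetical names: the idea is
simply a running change counter that flushes (commits and restarts) the
batch whenever it crosses the hard-coded ceiling.

    #include <stdio.h>

    /* Hypothetical ceiling, analogous to the 500,000-record maximum
     * described above; lowering it means more, smaller batches. */
    #define MAX_BATCH_CHANGES 500000

    static long pending = 0;        /* rows queued in the current batch */

    /* Stand-in for "commit the current batch and start a new one". */
    static void flush_batch(void)
    {
       printf("flushing a batch of %ld rows\n", pending);
       pending = 0;
    }

    /* Stand-in for queueing one attribute record. */
    static void queue_record(void)
    {
       pending++;
       if (pending >= MAX_BATCH_CHANGES) {
          flush_batch();            /* never let one batch grow past the ceiling */
       }
    }

    int main(void)
    {
       for (long i = 0; i < 1200000; i++) {
          queue_record();
       }
       if (pending > 0) {
          flush_batch();            /* final partial batch */
       }
       return 0;
    }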
Best regards,
Kern
On 07/24/2017 04:24 PM, Phil Stracchino wrote:
On 07/24/17 07:38, Kern Sibbald wrote:
Hello,
We are pleased to announce that we have just released Bacula version 9.0.2.
This is a minor bug-fix release, but a few of the bugs are important.
The main items fixed are:
– PostgreSQL should now work with PostgreSQL releases prior to 9.0. Note:
the SSL connection feature added in 9.0 is not available with PostgreSQL
servers older than 9.0 (it needs the new connection API).
– The issues with MariaDB (reconnect variable) are now fixed
– The btape “test” command finding the wrong number of files in the append
test was a bug; it is now fixed. It is unlikely that it affected anything
but btape.
– The bacula-tray-monitor.desktop file is now included in the scripts
directory.
– We recommend that you build with both libz and lzo library support (the
developer packages must be installed when building, and the shared object
libraries must be installed at run time). However, we have modified the
code so that Bacula *should* build and run with either or both of libz and
lzo absent.
Kern,
I've just discovered a serious compatibility problem, one that should,
however, be fairly straightforward to fix. It affects ALL Bacula releases.
(I'm actually still running 7.4.7, because Gentoo's Bacula ebuild has not
yet been updated to 9.0.x.)
The problem occurs when running Bacula against a MySQL+Galera cluster.
In this case, the cluster is MariaDB 10.1 plus Galera, but I have no
doubt the same problem will occur with Percona XtraDB Cluster, and
possibly with Oracle MySQL 5.7 using Group Replication, which is
Oracle's copy of Galera synchronous replication.
The problem is that Bacula apparently sends all of the records of a
backup job to the database in a single massive blast. Because writesets
must be prepared in memory for certification before being committed,
Galera has a limit on the maximum size of a writeset which can be
applied in a single transaction. This limit is set by two Galera
variables, wsrep_max_ws_rows and wsrep_max_ws_size. These default to
128K rows and 2GB total writeset size respectively. If a single
transaction exceeds either of these limits, it fails. My nightly
incremental backups have been working since I converted to a cluster last
week, but about half of this morning's differentials failed because they
tried to insert too many records in a single transaction.
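For reference, you can check which limits a given cluster actually
enforces. The sketch below does it through the MySQL C API that Bacula's
MySQL catalog backend already links against; the host, user and password
are placeholders, and the same SHOW statement works just as well from the
mysql command-line client.

    #include <stdio.h>
    #include <mysql/mysql.h>

    /* Print the Galera writeset limits (wsrep_max_ws_rows, wsrep_max_ws_size).
     * Connection parameters are placeholders. Build with -lmysqlclient. */
    int main(void)
    {
       MYSQL *conn = mysql_init(NULL);
       if (conn == NULL) {
          return 1;
       }
       if (!mysql_real_connect(conn, "localhost", "bacula", "password",
                               NULL, 0, NULL, 0)) {
          fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
          return 1;
       }
       if (mysql_query(conn, "SHOW GLOBAL VARIABLES LIKE 'wsrep_max_ws%'") != 0) {
          fprintf(stderr, "query failed: %s\n", mysql_error(conn));
          mysql_close(conn);
          return 1;
       }
       MYSQL_RES *res = mysql_store_result(conn);
       if (res == NULL) {
          mysql_close(conn);
          return 1;
       }
       MYSQL_ROW row;
       while ((row = mysql_fetch_row(res)) != NULL) {
          printf("%s = %s\n", row[0], row[1]);  /* e.g. wsrep_max_ws_rows = 131072 */
       }
       mysql_free_result(res);
       mysql_close(conn);
       return 0;
    }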
The failure APPEARS to occur when sending spooled attributes to the
database, somewhere between roughly 25,000 and 60,000 files backed up, but
I don't yet have enough information to narrow it down more closely than
that.
The proper fix for this is to batch these writes into chunks (honestly,
it's best practice to do this anyway; you should never be sending
millions of rows in a single operation because it can consume huge
amounts of memory), and provide a Bacula configuration variable to limit
the chunk size, with the default of 0 for "unlimited". Users running
standalone or asynchronous-replicated MySQL could simply leave it at the
default, or set it to whatever they feel comfortable with as a batch
size. Users running against a Galera cluster or Group Replication
should probably set the chunk size no higher than 25K. Codership Oy, the
creators of Galera, and Percona actually recommend not exceeding 1,000 rows
per writeset for best performance.
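To make the proposal concrete, here is a minimal sketch of the despooling
loop I have in mind. It is not actual Bacula code and every name in it is
hypothetical: the chunk size would come from configuration, 0 would mean
"no limit", and a commit would be issued after every full chunk so that no
single transaction can exceed the cluster's writeset limits.

    #include <stdio.h>

    /* Hypothetical value of the proposed directive (e.g. in the Catalog
     * resource): 0 = unlimited (today's behaviour), 25000 = a Galera-safe
     * chunk well under the default wsrep_max_ws_rows of 128K. */
    static long max_rows_per_txn = 25000;

    static long rows_in_txn = 0;

    static void commit_txn(void)    /* stand-in for COMMIT + begin a new transaction */
    {
       printf("COMMIT after %ld rows\n", rows_in_txn);
       rows_in_txn = 0;
    }

    /* Insert one spooled attribute record, committing whenever the chunk fills. */
    static void insert_attribute(void)
    {
       /* ... send one row to the catalog here ... */
       rows_in_txn++;
       if (max_rows_per_txn > 0 && rows_in_txn >= max_rows_per_txn) {
          commit_txn();
       }
    }

    int main(void)
    {
       for (long i = 0; i < 60000; i++) {   /* roughly the job size where failures appeared */
          insert_attribute();
       }
       if (rows_in_txn > 0) {
          commit_txn();                     /* final partial chunk */
       }
       return 0;
    }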
I will try to find or make time to look at the code myself and see if I
can propose a patch.
As a temporary workaround, I tried bringing my DB cluster down to a
single node (thus, no writeset replication), as well as making sure that
attribute spooling was turned off for ALL jobs. Neither of these
worked. I was able to complete backups ONLY by bringing my DB down to a
single standalone node with Galera disabled.
Until Bacula has a way to limit the size of the writesets it sends to the
database, it will not be usable against Galera clusters and should probably
be presumed unusable against MySQL Group Replication as well.