I've now got everything updated to 9.0.2 using a work-in-progress development version of the app-backup/bacula-9.0.2 ebuild. I'm running against MariaDB 10.2.7 (which more or less approximates MySQL 5.7) with Galera enabled.
Build platforms:

- Gentoo Linux on amd64 (AMD Phenom II, Thuban microarchitecture) using gcc 6.3.0
- Solaris 10u9 on amd64 (Intel P4 Xeon, Nocona microarchitecture) using Solaris Studio 12.2
- Solaris 11.3 on amd64 (AMD Opteron 2384, Shanghai microarchitecture) using gcc 4.9.4

Build considerations:

Solaris 10 required the tgoto prototype in conio.c to be moved down one line. The only other build issue is that enabling the storage daemon build also forces the director to be built, even if the director is explicitly disabled.

I did change the DB write batch size limit at sql_create.c:870 from 500000 to 1000 per Galera best-performance recommendations.

I was able to complete incremental backups and some differential backups. I was able to successfully run jobs that backed up as many as 120,000 files, with wsrep_max_ws_rows at its default of 128K. A differential job that tried to back up 177,000 files failed with wsrep_max_rows_exceeded. If sql_create.c:870 is truly the only place in the code where the write batch size is set, then it appears database write batching is not actually working: with a batch limit of 1000, no single write should come anywhere near 128K rows.

I maintain that even without Galera, 500000 is an unreasonably large batch size. Just because a modern database *can* handle it doesn't make it a good idea. 50000 would be more reasonable, and 10000 would be better.

Problems encountered so far, running the Director and both SDs in the foreground at -d200:

1. None of the datetime fields in the schema have defaults. This is a problem unless STRICT SQL mode is disabled, which is a bad idea. It is probable that in upcoming Oracle MySQL versions (and forks thereof), strict SQL will be mandatory.

   Adding the canonically-correct-SQL DEFAULT '1970-01-01 00:00:00' to all datetime fields prevented any further DB-related outright *failures*. However, this causes problems with Volume Use Duration settings.
   Using DEFAULT '0000-00-00 00:00:00' for datetime is permitted by MySQL 5.7 or MariaDB 10.2.x *as long as* SQL_MODE does not include NO_ZERO_DATE or NO_ZERO_IN_DATE. This does not APPEAR to cause any problems with volume expiration.

2. Various actions in BAT still create multiple overlapping and often-confusing dialog boxes. Deleting a volume, for example, emits a confirmation dialog, followed by three more simultaneous dialogs:

   - Warning: This command will delete volume ... and all Jobs saved on that volume from the Catalog
   - Bat Question: Are you sure you want to delete Volume ...? (yes/no)
   - Text input dialog: Are you sure you want to delete Volume ...? (yes/no)

   You can't respond to the Warning until you respond to the Text Input Dialog. You can't respond to the Text Input Dialog until you respond to the Bat Question. If you type in the text input dialog's text input box, it will throw an error. You have to ignore the text box and click OK instead.

   However, this APPEARS to no longer cause BAT to become unresponsive. I have not yet tried a PURGE VOLUME, which is the other operation that would in the past cause BAT to become unresponsive.

3. I am having difficulty getting my LTO4 SD to mount and unmount tapes.
   This is what the director logged when trying to run a restore from the LTO4 tape SD with the wrong tape mounted:

29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:279 Read acquire: Wrong Volume mounted on Tape device "LTO-4" (/dev/nst0): Wanted LTO4-FULL-0019 have LTO4-FULL-0013
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=No medium found
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=No medium found
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=No medium found
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=No medium found
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=Input/output error
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=Input/output error
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=Input/output error
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=Input/output error
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=Input/output error
29-Jul 13:13 babylon5-sd JobId 14248: Warning: acquire.c:235 Read open Tape device "LTO-4" (/dev/nst0) Volume "LTO4-FULL-0019" failed: ERR=tape_dev.c:170 Unable to open device "LTO-4" (/dev/nst0): ERR=Input/output error
29-Jul 13:13 babylon5-sd JobId 14248: Fatal error: acquire.c:328 Too many errors trying to mount Tape device "LTO-4" (/dev/nst0) for reading.
29-Jul 13:13 babylon4 JobId 14248: Fatal error: job.c:2699 Bad response from SD to Read Data command. Wanted 3000 OK data , got len=11 msg="3000 error "

   If I *START* the SD with the correct tape in place, it automounts it just fine. I was able to complete a test restore that required a single tape by pre-loading the tape. But I cannot manually mount or unmount tapes, either from BAT or from the console. It just plain doesn't work. Nothing happens. The SD doesn't log *anything* (at -d200) and, as far as I can tell, never receives the mount or umount commands.

   status storage=babylon5-sd says about the device:

Device status:

Device Tape is "LTO-4" (/dev/nst0) mounted with:
    Volume:      LTO4-FULL-0019
    Pool:        *unknown*
    Media type:  LTO-4
    Total Bytes Read=0 Blocks Read=0 Bytes/block=0
    Positioned at File=0 Block=0
   Configured device capabilities:
      EOF BSR BSF FSR FSF EOM REM !RACCESS AUTOMOUNT !LABEL !ANONVOLS ALWAYSOPEN
   Device state:
      OPENED TAPE LABEL !MALLOC !APPEND !READ !EOT !WEOT !EOF !NEXTVOL !SHORT !MOUNTED
      Writers=0 reserves=0 blocked=0 enabled=1 usage=1,024
   Attached JobIds:
   Device parameters:
      Archive name: /dev/nst0 Device name: LTO-4
      File=0 block=0
      Min block=0 Max block=2048000

   Do I need to re-test my tape drive under Bacula 9.x? Has something changed between 7.4.7 and 9.x in tape handling that requires configuration changes?
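On the write-batching point raised earlier, here is a minimal Python sketch (not Bacula's code) of what effective batching would look like. The helper name batched and the idea of committing each batch as its own transaction are assumptions for illustration only; the numbers are the ones from this report. If rows really were flushed in chunks of 1000, no single transaction could get anywhere near a 128K row limit, which is why the failure at 177,000 files suggests the batch limit is not being applied.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size rows (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        # In a real writer, each batch would be inserted and committed
        # as its own transaction before the next one is started.
        yield batch


# Figures from the report above; the names are illustrative assumptions.
ROW_LIMIT = 128 * 1024   # per-transaction row limit as described (128K)
FILES = 177_000          # size of the differential job that failed
BATCH = 1000             # batch size limit after the local change

sizes = [len(b) for b in batched(range(FILES), BATCH)]
print(len(sizes), max(sizes))    # 177 1000
print(max(sizes) <= ROW_LIMIT)   # True
```

With batching in effect, the 177,000-file job becomes 177 transactions of 1000 rows each, all well under the limit.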
Summary:

- Can't run full backups because I can't mount and unmount LTO4 tapes except by restarting the SD, which will cause the running jobs to fail
- Database write batching is not working, causing jobs that back up more than 128K files to fail
- Schema is not compliant with MySQL 5.7 or MariaDB 10.2 with strict SQL compliance enabled, which will cause many database-related failures

-- 
  Phil Stracchino
  Babylon Communications
  ph...@caerllewys.net
  p...@co.ordinate.org
  Landline: +1.603.293.8485
  Mobile:   +1.603.998.6958

_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel