Re: [Bacula-users] LTO tape performances, again...
On 1/24/24 12:48, Marco Gaiarin wrote:
> My new IBM LTO9 tape unit has a data sheet performance of:
>   https://www.ibm.com/docs/it/ts4500-tape-library/1.10.0?topic=performance-lto-specifications
> so in the worst case (compression disabled) it should do 400 MB/s on an LTO9
> tape. In practice, with Bacula I get 70-80 MB/s. I have just:
>
> 1) followed:
>      https://www.bacula.org/9.6.x-manuals/en/problems/Testing_Your_Tape_Drive_Wit.html#SECTION00422000
>    getting 237.7 MB/s on random data (worst case).
>
> 2) checked disk performance (data comes only from local disk); I currently
>    have 3 servers, some perform better, some worse, but the best one has
>    pretty decent read performance, at least 200 MB/s on random access
>    (1500 MB/s sequential).

A disk that is local to the server does not mean it is local to the bacula-sd process or the tape drive. If the connection is 1 gigabit Ethernet, then the maximum rate is going to be about 125 MB/s.

> 3) disabled data spooling, of course; as just stated, data comes only from
>    local disks. Enabled attribute spooling.

That is probably not what you want to do. You want the bacula-sd process to spool data on its local disk, so that when it is despooled to the tape drive it is read only from local disk, not from a small RAM buffer that is being filled through a network socket. Even with a 10 G Ethernet network it is better to spool data for LTO tape drives, since the client itself might not be able to keep up with the tape drive, or is busy, or the network is congested, etc.

> Clearly I can expect some performance penalty from Bacula and mixed files,
> but really, 70 MB/s is slow... What else can I look at?
>
> Thanks.

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
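The spooling setup recommended above maps to configuration directives roughly like this (a hedged sketch; the Job name, spool path and size are assumptions, not taken from the thread -- adjust them to the SD's local disks):

```
# bacula-dir.conf -- Job resource (fragment)
Job {
  Name = "BackupClient1"            # placeholder name
  Spool Data = yes                  # despool to tape from the SD's local disk
  Spool Attributes = yes
  ...
}

# bacula-sd.conf -- Device resource (fragment)
Device {
  Name = "LTO9Storage0"
  Spool Directory = /var/spool/bacula   # assumed path on a fast local disk
  Maximum Spool Size = 500G             # assumed; size it to the spool disk
  ...
}
```

With this in place, the tape drive is fed from local disk at despool time even when the client or network cannot sustain the drive's native rate.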
Re: [Bacula-users] Autochangers and unload timeout...
On 1/24/24 10:13, Marco Gaiarin wrote:
> I've reached my first tape change on my autochangers, yeah! But...
>
> 24-Jan 17:22 cnpve3-sd JobId 16234: [SI0202] End of Volume "AAJ661L9" at 333:49131 on device "LTO9Storage0" (/dev/nst0). Write of 524288 bytes got -1.
> 24-Jan 17:22 cnpve3-sd JobId 16234: Re-read of last block succeeded.
> 24-Jan 17:22 cnpve3-sd JobId 16234: End of medium on Volume "AAJ661L9" Bytes=17,846,022,566,912 Blocks=34,038,588 at 24-Jan-2024 17:22.
> 24-Jan 17:22 cnpve3-sd JobId 16234: 3307 Issuing autochanger "unload Volume AAJ661L9, Slot 2, Drive 0" command.
> 24-Jan 17:28 cnpve3-sd JobId 16234: 3995 Bad autochanger "unload Volume AAJ661L9, Slot 2, Drive 0": ERR=Child died from signal 15: Termination Results=Program killed by Bacula (timeout)
> 24-Jan 17:28 cnpve3-sd JobId 16234: 3304 Issuing autochanger "load Volume AAJ660L9, Slot 1, Drive 0" command.
> 24-Jan 17:29 cnpve3-sd JobId 16234: 3305 Autochanger "load Volume AAJ660L9, Slot 1, Drive 0", status is OK.
>
> So, unload timeout, but the subsequent load command works as expected (and
> backups are continuing...). Can I safely ignore this? Or is it better to
> adjust the timeout parameters in the /etc/bacula/scripts/mtx-changer.conf
> script?
>
> Thanks.

Hello Marco,

It looks like the mtx command (called by the mtx-changer script) is taking more than 6 minutes to return, so the process is being killed. But it then looks like it *must* have succeeded, since the load command loads a new tape into the now empty drive. You can try a few things to debug this.
First, I would stop the SD, and then manually load/unload tapes in your drive with the mtx command:

  # mtx -f /dev/tape/by-id/ status

If this shows a tape loaded in, for example, drive 0, unload it:

  # mtx -f /dev/tape/by-id/ unload X 0

(where X is the slot reported loaded in the drive)

Then, try loading a different tape:

  # mtx -f /dev/tape/by-id/ load Y 0

(where Y is a slot that has a tape in it, of course :)

By doing these manual steps, at least you can find out how long your tape library takes for these operations, and then you can adjust mtx-changer.conf as Pierre explained.

Additionally, if you are feeling brave and like playing the part of guinea pig, you can try replacing the default mtx-changer bash/perl script in your Autochanger's "ChangerCommand" with my `mtx-changer-python.py` script. It is a drop-in replacement with better logging and some additional features (with more planned). It is very configurable, and logs everything very clearly - including mtx changer errors, etc (the log level is configurable, of course). It needs a few Python modules installed, and as far as I know very few people have even tried it (maybe no one, lol) - but I have been running it in the Bacula Systems lab with our tape library since this past summer and it "just works"™

If you are interested, you can find it, and a few other scripts I have shared, in my Github account here: https://github.com/waa

Best regards,
Bill

--
Bill Arlofski
w...@protonmail.com
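To find out how long the library actually takes, the manual steps above can be wrapped in a small timing helper (a sketch; the mtx device path in the usage comment is a placeholder, substitute your own changer device):

```shell
#!/bin/sh
# time_cmd: run a command and report elapsed wall-clock seconds on stderr.
time_cmd() {
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s (exit $rc)" >&2
    return $rc
}

# Hypothetical usage -- replace the device path with your changer:
#   time_cmd mtx -f /dev/tape/by-id/scsi-CHANGER unload 2 0
#   time_cmd mtx -f /dev/tape/by-id/scsi-CHANGER load 1 0
```

The elapsed times you measure are what the mtx-changer.conf timeouts need to exceed.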
Re: [Bacula-users] LTO tape performances, again...
On 25.01.24 10:06, Marco Gaiarin wrote:
> > 2) checked disk performance (data comes only from local disk); I currently
> > have 3 servers, some perform better, some worse, but the best one has
> > pretty decent read performance, at least 200 MB/s on random access
> > (1500 MB/s sequential).
>
> Jim Pollard asked me in a private email about controllers: I had not
> specified, sorry, but the LTO units are connected to a dedicated SAS
> controller, not the one for the disks.

I have also registered lower-than-expected write performance. My LTO-6 drive should handle 160 MB/s of uncompressible random data, yet after writing a sequence Bacula mostly reports a transfer speed of roughly 80 MB/s. I have not investigated yet, but normally it should go faster. The job is spooled to /tmp and swap is not in use, so the transfer should be much faster.

My suggestion now is:

- Create a big random-data file, like a spool file, in /tmp.
- "Spool" it with dd from /tmp to /dev/null.
- "Spool" from /dev/random to tape.
- "Spool" from /tmp to tape.

Any suggestions about bs usage or something else?

Pierre
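Pierre's test sequence might look like the following (a sketch; the file size, block size and tape device path are assumptions -- LTO drives generally prefer large block sizes such as 256k-1M, and a real test should use a file of 10 GB or more):

```shell
#!/bin/sh
# 1) Create a random test file (small here; scale up for a real test).
#    /dev/urandom is used instead of /dev/random, which is far too slow
#    to feed a tape drive directly.
dd if=/dev/urandom of=/tmp/tapetest.dat bs=1M count=64 2>/dev/null

# 2) Measure pure read speed from /tmp (dd prints throughput on stderr).
dd if=/tmp/tapetest.dat of=/dev/null bs=256k

# 3) Hypothetical: write the same data to the tape drive.
#    /dev/nst0 is the non-rewinding device mentioned earlier in the thread.
# dd if=/tmp/tapetest.dat of=/dev/nst0 bs=256k
```

Comparing the throughput of steps 2 and 3 separates the disk side from the tape side: if step 2 reports far more than 80 MB/s but step 3 does not, the bottleneck is on the drive/SAS path rather than the spool disk.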
Re: [Bacula-users] Clarification on incremental files...
> On Wed, 24 Jan 2024 18:00:56 +0100, Marco Gaiarin said:
>
> I suppose that setting the 5 or 1 options depends on the Signature setting;
> e.g., it is totally useless to have:
>
>   Options {
>     Signature = MD5
>     accurate = <...>1
>   }
>
> i.e., calculating MD5 and checking SHA1.

In fact, I think that will check MD5. The implementation can only store one type of checksum in the catalog, and incremental backups just check whatever was stored (and assume it is the same type as in the original backup).

> But what does option 'i' mean? Compare THE inode number? Or the number of
> inodes of that particular file?

'i' is the st_ino field in the stat, i.e. the number that uniquely identifies the data for the file in the file system. Note that there is always exactly 1 inode that references the data for each file in a UNIX file system.

> Also, does 'n' mean soft or hard link? What is the relation between the 'i'
> and 'n' options?

'n' is the st_nlink field in the stat, i.e. the number of hard links to the inode. Nothing records the number of soft links.

> Anyway, I'm currently trying:
>
>   Options {
>     Signature = MD5
>     accurate = pugsmcd5
>   }

I think you need to remove 'c', otherwise you will get the same results as before when the ctime changes.

__Martin
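Following Martin's advice, the Options resource with 'c' removed would look roughly like this (a sketch of a full FileSet fragment; the FileSet name and File path are placeholders, not from the thread, and the per-letter comments reflect my reading of the accurate option letters):

```
FileSet {
  Name = "AccurateExample"          # placeholder
  Include {
    Options {
      Signature = MD5
      # p=permissions, u=uid, g=gid, s=size, m=mtime, d=size decrease,
      # 5=MD5 checksum; 'c' (ctime) omitted so a ctime-only change does
      # not force the file to be backed up again
      accurate = pugsmd5
    }
    File = /etc                     # placeholder path
  }
}
```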
Re: [Bacula-users] newbie: errors in baculum
seems I hit this:
https://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg72759.html

Is it advised to use Bacularis instead now?
Re: [Bacula-users] newbie: errors in baculum
On 25.01.24 11:58, Stefan G. Weichinger wrote:
> "Error code: 1000 Message: Internal error. TDbCommand failed to execute
> the query SQL " SELECT conname, consrc, contype, "
>
> Googled that, unsure about it.

I found
https://www.mail-archive.com/search?l=bacula-users@lists.sourceforge.net&q=subject:%22Re%5C%3A+%5C%5BBacula%5C-users%5C%5D+Baculum+api+installs%2C+but+throws+Error+1000%22&o=newest&f=1
but the linked PR/patch does not seem to match exactly ... it's 3 years old, so I assume it's no longer valid in my environment.

This is PostgreSQL 15.5 (Debian 15.5-0+deb12u1).
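For what it's worth (an observation from PostgreSQL's release history, not confirmed in this thread): the pg_constraint.consrc column that the failing query selects was removed in PostgreSQL 12, so any code still selecting it breaks on PostgreSQL 15. The modern replacement reconstructs the expression with pg_get_constraintdef(); a sketch of an equivalent query:

```
-- Old shape (fails on PostgreSQL >= 12, consrc no longer exists):
--   SELECT conname, consrc, contype FROM pg_constraint ...

-- Equivalent on modern PostgreSQL:
SELECT conname,
       pg_get_constraintdef(oid) AS consrc,  -- rebuilds the constraint source
       contype
  FROM pg_constraint;
```

If that is the cause, it is a Baculum code issue rather than a catalog-version problem on your side.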
Re: [Bacula-users] LTO tape performances, again...
> 2) checked disk performance (data comes only from local disk); I currently
> have 3 servers, some perform better, some worse, but the best one has
> pretty decent read performance, at least 200 MB/s on random access
> (1500 MB/s sequential).

Jim Pollard asked me in a private email about controllers: I had not specified, sorry, but the LTO units are connected to a dedicated SAS controller, not the one for the disks.

--
...I'm only a little sorry for the Swiss: yesterday they played a lousy match, and today they get Bossi... (Piccia, 27/6/2006)
[Bacula-users] newbie: errors in baculum
greetings, bacula-users

I am a complete starter with bacula and am testing things on a Debian 12.4 machine. I can already write to and read from tape, using an autochanger ... looks good so far.

I may have a "mixed" setup: at first I installed from the Debian repos, then I found the specific bacula repos for Debian. Which repos should I use for bacula on bookworm?

I added baculum:

  # cat /etc/apt/sources.list.d/baculum.list
  deb http://www.bacula.org/downloads/baculum/stable-11/debian bullseye main
  deb-src http://www.bacula.org/downloads/baculum/stable-11/debian bullseye main

and was able to set up the API and web ... although in the baculum GUI I get database-related errors like:

  "Error code: 1000 Message: Internal error. TDbCommand failed to execute
  the query SQL " SELECT conname, consrc, contype, "

Googled that, unsure about it. Do I maybe have a too-old DB now, coming from bacula 9.6? Is it compatible at all? Should I somehow upgrade bacula? Should I start from scratch?

pls advise, thanks in advance,
Stefan
Re: [Bacula-users] Autochangers and unload timeout...
On 24.01.24 18:13, Marco Gaiarin wrote:
> 24-Jan 17:22 cnpve3-sd JobId 16234: [SI0202] End of Volume "AAJ661L9" at 333:49131 on device "LTO9Storage0" (/dev/nst0). Write of 524288 bytes got -1.
> 24-Jan 17:22 cnpve3-sd JobId 16234: Re-read of last block succeeded.
> 24-Jan 17:22 cnpve3-sd JobId 16234: End of medium on Volume "AAJ661L9" Bytes=17,846,022,566,912 Blocks=34,038,588 at 24-Jan-2024 17:22.
> 24-Jan 17:22 cnpve3-sd JobId 16234: 3307 Issuing autochanger "unload Volume AAJ661L9, Slot 2, Drive 0" command.
> 24-Jan 17:28 cnpve3-sd JobId 16234: 3995 Bad autochanger "unload Volume AAJ661L9, Slot 2, Drive 0": ERR=Child died from signal 15: Termination Results=Program killed by Bacula (timeout)
> 24-Jan 17:28 cnpve3-sd JobId 16234: 3304 Issuing autochanger "load Volume AAJ660L9, Slot 1, Drive 0" command.
> 24-Jan 17:29 cnpve3-sd JobId 16234: 3305 Autochanger "load Volume AAJ660L9, Slot 1, Drive 0", status is OK.
>
> So, unload timeout, but the subsequent load command works as expected (and
> backups are continuing...).

In mtx-changer.conf you can set debug_log=1 to create an mtx.log in the bacula home dir, which should be /var/lib/bacula. I'd set debug_level=100 to log everything.

Maybe the offline time is too low. I simply give it 900 seconds, to protect against the drive needing more time than expected, although it almost always needs less than 60 seconds. offline_sleep should be 1.

By the way, I have been using the mtx-changer script untouched for years, and I found that in mine these parameters are not used in the waiting loop:

  # The purpose of this function is to wait a maximum
  # time for the drive. It will
  # return as soon as the drive is ready, or after
  # waiting a maximum of 900 seconds.
  # Note, this is very system dependent, so if you are
  # not running on Linux, you will probably need to
  # re-write it, or at least change the grep target.
  # We've attempted to get the appropriate OS grep targets
  # in the code at the top of this script.
  wait_for_drive() {
    i=0
    while [ $i -le 900 ]; do   # Wait max 900 seconds
      if mt -f $1 status 2>&1 | grep "${ready}" >/dev/null 2>&1; then
        stinit 2>/dev/null >/dev/null
        break
      fi
      debug $dbglvl "Device $1 - not ready, retrying..."
      sleep 1
      i=`expr $i + 1`
    done
  }

By the way, I'm no longer sure that this is still the state of the distributed mtx-changer script. Normally I would expect the while statement to be something like:

  while [ "${offline_sleep}" -eq 1 ] && [ $i -le "${offline_time}" ]; do   # (untested)

Cheers,
Pierre
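The change Pierre suggests amounts to making the polling loop honor a configurable timeout. A generic version of such a loop, separated from the mt/grep specifics so it can be tested on its own (a sketch; the helper name and the usage line are hypothetical, not from mtx-changer):

```shell
#!/bin/sh
# wait_for: poll a readiness command until it succeeds or a timeout expires.
#   $1          = maximum seconds to wait (e.g. ${offline_time})
#   $2 onwards  = the readiness check command
wait_for() {
    max="$1"; shift
    i=0
    while [ "$i" -le "$max" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0            # device ready
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1                    # timed out
}

# Hypothetical usage, mirroring wait_for_drive():
#   wait_for "${offline_time:-900}" sh -c "mt -f $1 status 2>&1 | grep \"$ready\""
```

Keeping the timeout in a parameter rather than a hard-coded 900 is exactly what makes the offline_time setting in mtx-changer.conf take effect.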