Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Marco Gaiarin
  In chel di` si favelave...

> OK, things start working. The script do correctly snapshot; if needed, i can
> simply mount the snapshot elsewhere via a simple 'mount -t zfs'.

OK, just a note to say, after a long time, that I've abandoned the
'zfs snapshot' script.

After some tests in a real-world scenario I got, without snapshot:

 14-Nov 17:43 svpve3-sd JobId 29940: Elapsed time=48:14:20, Transfer rate=19.31 M Bytes/second
  FD Files Written:       22,001,262
  FD Bytes Written:       3,348,102,512,393 (3.348 TB)

and with snapshot:

 21-Nov 16:03 svpve3-sd JobId 30339: Elapsed time=47:19:10, Transfer rate=19.69 M Bytes/second
  FD Files Written:       21,997,689
  FD Bytes Written:       3,348,098,003,450 (3.348 TB)

a gain of roughly 1.9%. Other scenarios may gain more (or less), but this
is mine, and considering the complication of handling paths with 'strippath'
in the game, it's simply not worth it.

-- 
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
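The roughly 1.9% figure can be reproduced from the two "Elapsed time" values in the job reports. A minimal sketch in POSIX shell, assuming only that `awk` is available; the function names `to_seconds` and `gain_percent` are made up for illustration and are not part of Bacula:

```shell
#!/bin/sh
# Convert an HH:MM:SS elapsed time, as printed in the job report, to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the time saved by the second run as a percentage of the first run.
gain_percent() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

# Run without snapshot vs. run with snapshot.
gain_percent 48:14:20 47:19:10    # prints 1.9
```

That is 3310 seconds (about 55 minutes) saved out of 173660, so the gain is small relative to a two-day job.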
Re: [Bacula-users] Strategies for backup of millions of files...
> You select only the Incremental job to restore, of course the data
> is in one of the previous job. We can probably improve the error
> message, or remove the meta-data-only record that is reported
> as a file.
>
> I do not recommend to use the jobid= parameter in the restore
> command line unless you want to use some advanced scenario. In
> general it's not a big issue, but here, records depend on a previous
> job that is not included in your restore.
>
> Just use the restore menu 5 or 12, it should work, if not, please open
> an issue on the bug tracker.

Changed files and new files should be in that job, I tested this a
number of times.
Re: [Bacula-users] Strategies for backup of millions of files...
Hello Dragan,

On 9/27/24 14:16, Dragan Milivojević wrote:
> Search the mailing list for a thread that I started a few days ago.
> It breaks bacula by saving empty files, no metadata, only name is saved.

I have looked more closely, and your restore command is incorrect.

You select only the Incremental job to restore; of course the data
is in one of the previous jobs. We can probably improve the error
message, or remove the metadata-only record that is reported
as a file.

I do not recommend using the jobid= parameter in the restore
command line unless you need some advanced scenario. In
general it's not a big issue, but here, records depend on a previous
job that is not included in your restore.

Just use restore menu option 5 or 12; it should work. If not, please open
an issue on the bug tracker.

Thanks,
Eric
Re: [Bacula-users] Strategies for backup of millions of files...
Hello Dragan,

On 9/27/24 14:16, Dragan Milivojević wrote:
> Search the mailing list for a thread that I started a few days ago.
> It breaks bacula by saving empty files, no metadata, only name is saved.

Did you open an issue on the bug tracker?

gitlab.bacula.org

Thanks

Best Regards,
Eric
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Marco Gaiarin
  In chel di` si favelave...

> No, i'm still working on snapshotting, next step backup; i've not clear if
> Include{} and Exclude{} fileset directive apply to 'stippath' or
> 'non-strippath' path...

OK, things start working. The script snapshots correctly; if needed, I can
simply mount the snapshot elsewhere via a simple 'mount -t zfs'.

First note: fileset processing happens *BEFORE* the job pre-script, so before
the filesystem actually gets snapshotted and mounted. So the script in 'list'
mode has to emit the mount path 'blindly', without being able to verify that
the FS really got snapshotted and mounted. Only a note.

Second note: paths in the fileset have to be 'absolute', not 'relative' to
'strippath', and this seems a bit misleading to me; an example fileset would be:

 FileSet {
   Name = PVETerzoNodoSVTestSnap
   Include {
     Options {
       accurate = sm
       strippath = 4
       wildfile = "*.tar"
       wilddir = "/rpool-backup/bacula/snapshots/bacula-FVG-SV-SVPVE3TestSnap/rpool-backup/varie/test/exclude"
       Exclude = yes
     }
     #File = "\\|/root/snapshotBackup %n list"
     File = "/rpool-backup/bacula/snapshots/bacula-FVG-SV-SVPVE3TestSnap/rpool-backup/varie"
   }
 }

so effectively in 'wilddir' I have to use the full path, or probably something
like this:

       wilddir = "*/rpool-backup/varie/test/exclude"

but (apart from a performance hit, I think) this can also match other stuff,
so it can be dangerous.

A question: currently I've split the jobs to 'interleave' the use of the LTO9
unit, so I have different filesets with different paths; one fileset, for
example, is:

 FileSet {
   Name = PVETerzoNodoSVSedi
   Include {
     Options {
       Signature = MD5
       accurate = sm
       noatime = yes
     }
     File = "/rpool-backup/rsnapshot/.sync/FVG_PP"
     File = "/rpool-backup/rsnapshot/.sync/FVG_3T"
     File = "/rpool-backup/rsnapshot/.sync/FVG_TMS"
   }
 }

so in some sort of 'positive logic' (i.e., listing the dirs to back up); now
I have to convert this to 'negative logic', e.g.:

 FileSet {
   Name = PVETerzoNodoSVSedi
   Include {
     Options {
       Exclude = yes
       wilddir = "*"
     }
     Options {
       Signature = MD5
       accurate = sm
       strippath = 4
       wilddir = "/rpool-backup/bacula/snapshots/bacula-FVG-SV-SVPVE3TestSnap/rpool-backup/rsnapshot/.sync/FVG_PP"
       wilddir = "/rpool-backup/bacula/snapshots/bacula-FVG-SV-SVPVE3TestSnap/rpool-backup/rsnapshot/.sync/FVG_3T"
       wilddir = "/rpool-backup/bacula/snapshots/bacula-FVG-SV-SVPVE3TestSnap/rpool-backup/rsnapshot/.sync/FVG_TMS"
     }
     File = "\\|/root/snapshotBackup %n list"
   }
 }

...which seems a bit overcomplicated to me... probably a better approach would
be to save somewhere on the client (where /root/snapshotBackup gets run) the
list of dirs to back up, for every job %n.
But this still does not fit so well if there's some complex setup (e.g., if
there are specific exclusions inside the included dirs...).

I'm a bit confused... thanks.
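The idea of keeping a per-job list of directories on the client might look roughly like the sketch below. Everything here is an assumption made for illustration: the list location (`LIST_DIR`), the `<job>.list` file naming, the function name `list_paths`, and the guess that each snapshot is mounted under `bacula-<job name>` as in the paths quoted in the message. It only covers the 'list' mode and, as noted in the message, cannot check that the snapshot is really mounted.

```shell
#!/bin/sh
# Hypothetical layout: one plain-text file per job, one absolute directory per line.
LIST_DIR="${LIST_DIR:-/etc/bacula/snapshot-lists}"          # assumed location
SNAP_ROOT="${SNAP_ROOT:-/rpool-backup/bacula/snapshots}"    # taken from the example paths

# Emit the snapshot-side path of every directory listed for job $1.
list_paths() {
    job="$1"
    while IFS= read -r dir; do
        # Skip blank lines and comment lines.
        case "$dir" in ''|'#'*) continue ;; esac
        printf '%s/bacula-%s%s\n' "$SNAP_ROOT" "$job" "$dir"
    done < "$LIST_DIR/$job.list"
}
```

A `File = "\\|/root/snapshotBackup %n list"` line would then only need the script to call `list_paths "$1"`. Specific exclusions inside the listed dirs would still have to live in the FileSet Options, which is the limitation mentioned in the message.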
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Arno Lehmann
  In chel di` si favelave...

> So I guess a good recommendation would be to make sure strippath is
> applied to *all* paths in a file set, unless you are really sure you
> know what you do.

Uh, oh... clearly sure!

> Also, did you per chance already try, in combination with accurate mode?
> I think I'll have a look myself, as it's too long ago I tried the strip
> path feature...

No, I'm still working on snapshotting, next step backup; it's not clear to me
whether the Include{} and Exclude{} fileset directives apply to the
'strippath' or the 'non-strippath' path...
Re: [Bacula-users] Strategies for backup of millions of files...
Hi Marco,

On 07.10.2024 at 14:50, Marco Gaiarin wrote:
> Mandi! Gary R. Schmidt
>   In chel di` si favelave...
...
> ...i don't think is about restoring, but backing up; indeed the example
> could be instead: if you have a fileset with an

exactly. And with paths such as /usr/bin and /.snap/bin it would get
more exciting :-)

So I guess a good recommendation would be to make sure strippath is
applied to *all* paths in a file set, unless you are really sure you
know what you do.

Also, did you per chance already try, in combination with accurate mode?
I think I'll have a look myself, as it's too long ago I tried the strip
path feature...

Cheers,

Arno

-- 
Arno Lehmann

IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Gary R. Schmidt
  In chel di` si favelave...

> Because it does exactly what it says it will do.
> If you backup /a/b/c/d, and then restore it with "strippath=1" it will
> restore to /b/c/d.
> If you have some pre-existing /b/c/d it's now been over-written, or
> overlaid, if you prefer..

...I don't think this is about restoring, but about backing up; indeed the
example could instead be: if you have a fileset with

  Include {
    Options {
      strippath = 1
    }
    File = /a/b/c/d
  }
  Include {
    Options {
    }
    File = /b/c/d
  }

you are really making a total mess...

Anyway, I think I've understood. Thanks!
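The collision between the two Include blocks can be made concrete with a small simulation of what stripping leading path components does to a name. This is only an illustrative sketch, not Bacula's implementation; `strip_path` is a made-up helper, and it ignores edge cases such as a count larger than the number of components.

```shell
#!/bin/sh
# Drop the first $1 directory components from the absolute path $2,
# keeping the leading slash.
strip_path() {
    echo "$2" | awk -F/ -v n="$1" '{
        out = ""
        # Field 1 is empty (text before the leading "/"), so component k is field k+1.
        for (i = n + 2; i <= NF; i++) out = out "/" $i
        print out
    }'
}

strip_path 1 /a/b/c/d    # prints /b/c/d
strip_path 0 /b/c/d      # prints /b/c/d, the same stored name
```

Both entries end up with the same name in the backup, which is the 'overlay' the documentation warns about.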
Re: [Bacula-users] Strategies for backup of millions of files...
On 05/10/2024 02:16, Marco Gaiarin wrote:
[SNIP]
>
> Doc say:
>
>  strippath=integer
>    This option will cause integer paths to be stripped from the front of
>    the full path/filename being backed up. This can be useful if you are
>    migrating data from another vendor or if you have taken a snapshot into
>    some subdirectory. This directive can cause your filenames to be
>    overlayed with regular backup data, so should be used only by experts
>    and with great care.
>
> What exactly mean 'This directive can cause your filenames to be overlayed'?!

Because it does exactly what it says it will do.
If you backup /a/b/c/d, and then restore it with "strippath=1" it will
restore to /b/c/d.
If you have some pre-existing /b/c/d it's now been over-written, or
overlaid, if you prefer..

	Cheers,
		GaryB-)
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Arno Lehmann
  In chel di` si favelave...

> Look at
> https://www.bacula.org/15.0.x-manuals/en/main/Configuring_Director.html#SECTION002370
>
> and search for "strippath" :-)

Uh, oh! And it was also available in 9.4!

	https://www.bacula.org/9.4.x-manuals/en/main/Configuring_Director.html

> It would be interesting to see how that works in combination with
> accurate mode which I think I never tried and can't see documented
> anywhere... so time to do some experiments.

The docs say:

 strippath=integer
   This option will cause integer paths to be stripped from the front of
   the full path/filename being backed up. This can be useful if you are
   migrating data from another vendor or if you have taken a snapshot into
   some subdirectory. This directive can cause your filenames to be
   overlayed with regular backup data, so should be used only by experts
   and with great care.

What exactly does 'This directive can cause your filenames to be overlayed'
mean?!
Re: [Bacula-users] Strategies for backup of millions of files...
Hi Marco,

On 02.10.2024 at 18:07, Marco Gaiarin wrote:
...
> Initially i supposed Bacula will have some 'ignore prefix' parameters that
> will permit that, but this is available only on restore, not backup. For
> backup, data have to be 'rooted' (mounted) on the same path...

Look at
https://www.bacula.org/15.0.x-manuals/en/main/Configuring_Director.html#SECTION002370

and search for "strippath" :-)

It would be interesting to see how that works in combination with
accurate mode which I think I never tried and can't see documented
anywhere... so time to do some experiments.

Cheers,

Arno

-- 
Arno Lehmann

IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Josip Deanovic via Bacula-users
  In chel di` si favelave...

>> Bacula will just do it. Nothing special required.
> This statement might be misleading in this particular case Marco
> described.
> Bacula will be able to run incremental backup but if the mountpoint is
> changing every time backup is running, the content of the mountpoint
> directory would be fully copied over and over again, thus, taking more
> space compared to the incremental backup of the same directory.

Yes, I'm speaking exactly about that. If the snapshot gets (auto)mounted on a
different path, I can surely define a fileset with a script that takes the
different path into account, but... the files are not the same, because they
are rooted on a different path!

Initially I supposed Bacula would have some 'ignore prefix' parameter that
would permit that, but this is available only on restore, not on backup. For
backup, the data has to be 'rooted' (mounted) on the same path...

> As I said in some post, it might be avoided by using bind-mount option
> of the mount(8).

I need to start some experimentation...
Re: [Bacula-users] Strategies for backup of millions of files...
On Tue, Oct 1, 2024, at 3:42 AM, Josip Deanovic via Bacula-users wrote:
> On 2024-10-01 03:16, Dan Langille wrote:
>> On Mon, Sep 30, 2024, at 10:48 AM, Marco Gaiarin wrote:
>>> Mandi! Dan Langille
>>>   In chel di` si favelave...
>>> From https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/ :
>>>
>>> I'm still doing some experimentation on this, indeed.
>>>
>>> It doesn't all run as incrementals. If the list of DATASETS (see above URL) does not change, the fileset does not change.
>>>
>>> OK, but if mountpoint change, so change root path, so... how can
>>> bacula do
>>> incrementals, if all path are different?!
>>>
>>> So.
>>> I can use script for the list of path to backup; i can use 'Ignore
>>> FileSet Changes = yes'
>>> (but if i use script i think it is not needed...) but how can bacula
>>> do
>>> incrementals?
>>
>> Bacula will just do it. Nothing special required.
>
> This statement might be misleading in this particular case Marco
> described.
>
> Bacula will be able to run incremental backup but if the mountpoint is
> changing every time backup is running, the content of the mountpoint
> directory would be fully copied over and over again, thus, taking more
> space compared to the incremental backup of the same directory.
>
> As I said in some post, it might be avoided by using bind-mount option
> of the mount(8).

Now I understand the concern. Sorry for the misdirection.

The script in my blog post avoids that by using the same snapshot name every
time it backs up:

See line 8: SNAPNAME="snapshot-for-backup"

The snapshot is created for this backup and deleted during the
'RunsWhen = After' script.

Hope that helps.

-- 
  Dan Langille
  d...@langille.org
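To show why a fixed snapshot name gives Bacula the same paths on every run, here is a reduced sketch of a create/list/destroy helper. It is not the script from the blog post: the dataset names are hypothetical, the `list` mode assumes each dataset is mounted at `/<dataset name>`, and by default the helper only prints the `zfs` commands (dry run) instead of executing them.

```shell
#!/bin/sh
SNAPNAME="snapshot-for-backup"                 # fixed name, as quoted in the message
DATASETS="${DATASETS:-zroot/home zroot/var}"   # hypothetical dataset names

# Print the command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

snapshots_for_backup() {
    for ds in $DATASETS; do
        case "$1" in
            create)  run zfs snapshot "$ds@$SNAPNAME" ;;
            destroy) run zfs destroy "$ds@$SNAPNAME" ;;
            list)
                # Assumption: dataset mounted at /<dataset>; a real script
                # would look up the mountpoint property instead.
                echo "/$ds/.zfs/snapshot/$SNAPNAME" ;;
            *) echo "usage: create|list|destroy" >&2; return 1 ;;
        esac
    done
}

snapshots_for_backup list
```

Because `SNAPNAME` never changes, the `list` output is identical from one job to the next, so the files keep the same names across Full and Incremental runs.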
Re: [Bacula-users] Strategies for backup of millions of files...
On 2024-10-01 03:16, Dan Langille wrote:
> On Mon, Sep 30, 2024, at 10:48 AM, Marco Gaiarin wrote:
>> Mandi! Dan Langille
>>   In chel di` si favelave...
>> From https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/ :
>>
>> I'm still doing some experimentation on this, indeed.
>>
>> It doesn't all run as incrementals. If the list of DATASETS (see above URL) does not change, the fileset does not change.
>>
>> OK, but if mountpoint change, so change root path, so... how can bacula do
>> incrementals, if all path are different?!
>>
>> So.
>> I can use script for the list of path to backup; i can use 'Ignore
>> FileSet Changes = yes'
>> (but if i use script i think it is not needed...) but how can bacula do
>> incrementals?
>
> Bacula will just do it. Nothing special required.

This statement might be misleading in this particular case Marco
described.

Bacula will be able to run incremental backup but if the mountpoint is
changing every time backup is running, the content of the mountpoint
directory would be fully copied over and over again, thus, taking more
space compared to the incremental backup of the same directory.

As I said in some post, it might be avoided by using bind-mount option
of the mount(8).

Regards!

-- 
Josip Deanovic
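The bind-mount suggestion could be sketched as follows on Linux. The fixed path `/some/dirs-current` and the helper names are assumptions for illustration; by default the helpers only print the commands they would run, since the real ones need root.

```shell
#!/bin/sh
STABLE="${STABLE:-/some/dirs-current}"   # hypothetical fixed path used in the FileSet

# Print the command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Make the dated directory $1 visible under the fixed path.
bind_snapshot() {
    run mkdir -p "$STABLE"
    run mount --bind "$1" "$STABLE"
}

# Remove the bind mount after the job.
unbind_snapshot() {
    run umount "$STABLE"
}

bind_snapshot /some/dirs-202040926
```

The FileSet would then always reference the fixed path, while the pre- and post-job scripts take care of pointing it at whichever dated directory is current.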
Re: [Bacula-users] Strategies for backup of millions of files...
On Mon, Sep 30, 2024, at 10:48 AM, Marco Gaiarin wrote:
> Mandi! Dan Langille
>   In chel di` si favelave...
>
>> From
>> https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/
>> :
>
> I'm still doing some experimentation on this, indeed.
>
>
>> It doesn't all run as incrementals. If the list of DATASETS (see above URL)
>> does not change, the fileset does not change.
>
> OK, but if mountpoint change, so change root path, so... how can bacula do
> incrementals, if all path are different?!
>
> So.
> I can use script for the list of path to backup; i can use 'Ignore
> FileSet Changes = yes'
> (but if i use script i think it is not needed...) but how can bacula do
> incrementals?

Bacula will just do it. Nothing special required.

>> Oh wait, I think I realize something. The snapshots do not have to be
>> explicitly mounted to be available for backup (at least on FreeBSD).
>
> Never minded about that. Probably you can mount a snapshot with
> '-o mountpoint=...'. But now i'm curious...
>
>
>> Is this the problem you are trying to solve? mounting the snapshot to back
>> it up?
>
> Yes. But i need incrementals too. ;-)

Incrementals are based on the mtime of the file, not on having an older
version of the file mounted.

The script will work for incrementals. And it has nothing to do with having
a snapshot mounted.

Perhaps you need to explain more about this requirement. It sounds like you
may be assuming something which is not required.

-- 
  Dan Langille
  d...@langille.org
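The point that an incremental picks files by modification time, rather than by comparing against an older mounted copy, can be illustrated with a rough analogy in shell. This is not how Bacula is implemented; `changed_since` is a made-up helper and the stamp file is a hypothetical stand-in for the time of the previous job.

```shell
#!/bin/sh
# List regular files under directory $1 whose mtime is newer than that of
# the reference file $2.
changed_since() {
    find "$1" -type f -newer "$2"
}

# Example with throwaway files:
d=$(mktemp -d)
touch -t 202401010000 "$d/old"      # unchanged since before the last job
touch -t 202406010000 "$d/stamp"    # stands in for the last job's start time
touch -t 202409010000 "$d/new"      # modified after the last job
changed_since "$d" "$d/stamp"       # lists only "$d/new"
rm -r "$d"
```

Nothing in this selection depends on where the data is mounted. Whether a changed mount path makes files look new is a separate question, raised elsewhere in the thread.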
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Josip Deanovic via Bacula-users
  In chel di` si favelave...

> Unless the "Ignore FileSet Change" option is used, Bacula will compare the MD5
> checksum of the Include/Exclude contents of the FileSet and decide whether a
> Full backup is needed or not.
> I don't recommend using this option and the Bacula documentation strongly
> recommends against it as well because it bears a risk of having incomplete
> backup.

Thanks for the explanation!

> In your case, you are trying to devise a versioning by creating new
> directories
> named by the date but it might be unnecessary as Bacula already keeps that
> information for you.

Ahem, no; because I need to script that, I'm simply a bit scared of mounting
snapshots one over another...
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Dan Langille
  In chel di` si favelave...

> From
> https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/
> :

I'm still doing some experimentation on this, indeed.

> It doesn't all run as incrementals. If the list of DATASETS (see above URL)
> does not change, the fileset does not change.

OK, but if the mountpoint changes, the root path changes, so... how can Bacula
do incrementals, if all the paths are different?!

So.
I can use a script for the list of paths to back up; I can use
'Ignore FileSet Changes = yes' (but if I use a script I think it is not
needed...), but how can Bacula do incrementals?

> Oh wait, I think I realize something. The snapshots do not have to be
> explicitly mounted to be available for backup (at least on FreeBSD).

I never thought about that. Probably you can mount a snapshot with
'-o mountpoint=...'. But now I'm curious...

> Is this the problem you are trying to solve? mounting the snapshot to back it
> up?

Yes. But I need incrementals too. ;-)
Re: [Bacula-users] Strategies for backup of millions of files...
On Thu, Sep 26, 2024, at 7:02 AM, Marco Gaiarin wrote:
> Mandi! Arno Lehmann
>   In chel di` si favelave...
>
>> I have not looked up the discussion leading here.
>
> See my previous post: i'm fighting with ZFS. ;-)
>
>
>> What I usually prefer is a combination of a Run Script on the client to
>> create a file list to back up, and a File= entry in the include list
>> that reads from file or program, i.e. "\|" or "\<" ones.

That sounds exactly like what I do. I use the same script for three purposes:
create, list, destroy.

From https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/ :

Job {
  Name    = "r730-01 snapshots"
  JobDefs = "DefaultJob"
  Client  = r730-01-fd
  FileSet = "r730-01 snapshots"

  RunScript {
    RunsWhen       = Before
    FailJobOnError = Yes
    Command        = "/usr/local/sbin/snapshots-for-backup.sh create"
  }

  RunScript {
    RunsWhen       = After
    FailJobOnError = No
    Command        = "/usr/local/sbin/snapshots-for-backup.sh destroy"
  }
}

Then in the FileSet:

FileSet {
  Name = "r730-01 snapshots"
  Ignore FileSet Changes = yes
  Include {
    Options {
      signature = MD5
    }
    Exclude Dir Containing = .NOBACKUP
    File = "\\|/usr/local/sbin/snapshots-for-backup.sh list"
  }
}

Everything is in one place and consistent.

> OK, effectively i've just to run a script on the client to snapshot the FS
> and mount it RO, so it make sense to do that.

That is also what I do. I backup the snapshot, which is mounted read-only -
there is no other way to mount a ZFS snapshot. It is, by definition, read-only.

NOTE: There is no explicit mount required. See below.

> A question, indeed: normally if i modify fileset, bacula 'reset' itself and
> restart doing a Full backup.
> If i manage to get files from script or file, what happen?

I have 'Ignore FileSet Changes = yes' on my fileset.

> Or it is still the 'fileset change' the trigger, so scripts can make all the
> dumbest things, but bacula keep going on incrementals?

It doesn't all run as incrementals. If the list of DATASETS (see above URL)
does not change, the fileset does not change.

> Also, if my script mount on '/some/dirs-202040926' and pass this as a backup
> dir for a full, but tomorrow script mount '/some/dirs-202040927' and calls
> for an incremental on that, i think will make still a new full...

If the output of that script is the same, the fileset is the same. It's not
the name of the script, it's the output.

> Seems to me there's no (easy) escape, and i need to mount snapshots on the
> same mountpoint to have working incrementals...

Oh wait, I think I realize something. The snapshots do not have to be
explicitly mounted to be available for backup (at least on FreeBSD).

Here is an example.

[dvl@nagios03:~] $ cd /.zfs/snapshot
[dvl@nagios03:/.zfs/snapshot] $ ls -l
total 1122
drwxr-xr-x  18 root wheel 22 Jul 12 20:22 2024-07-12-20:29:47-0
drwxr-xr-x  18 root wheel 22 Sep  5 14:27 2024-09-25-10:55:42-0
drwxr-xr-x  18 root wheel 22 Sep  5 14:27 autosnap_2024-09-19_00:00:01_daily
drwxr-xr-x  18 root wheel 22 Sep  5 14:27 autosnap_2024-09-20_00:00:00_daily
drwxr-xr-x  18 root wheel 22 Sep  5 14:27 autosnap_2024-09-21_00:00:01_daily
...

There are all my snapshots. Let's cd into one:

[dvl@nagios03:/.zfs/snapshot] $ cd 2024-07-12-20:29:47-0
[dvl@nagios03:/.zfs/snapshot/2024-07-12-20:29:47-0] $ ls
COPYRIGHT boot      entropy   home      libexec   mnt       proc      root      tmp       var
bin       dev       etc       lib       media     net       rescue    sbin      usr

There, it's all there. You don't have to mount anything.

Is this the problem you are trying to solve? mounting the snapshot to back
it up?

-- 
  Dan Langille
  d...@langille.org
Re: [Bacula-users] Strategies for backup of millions of files...
Search the mailing list for a thread that I started a few days ago.
It breaks bacula by saving empty files, no metadata, only name is saved.

On Fri, 27 Sept 2024, 10:28 Josip Deanovic via Bacula-users,
<bacula-users@lists.sourceforge.net> wrote:

> On 2024-09-26 17:01, Dragan Milivojević wrote:
> >> For example, while browsing/restoring backup of the specific date, you
> >> would
> >> get the state of a backed up directory as it was during the execution
> >> of
> >> a
> >> selected backup job.
> >> I would recommend the "accurate" option in this case.
> >
> > Be careful with the accurate option, if you use the recommended pino5
> > option
> > it breaks backups.
>
> Hello Dragan,
>
> Could you please elaborate on that?
> Are you referring to some bug or to the 'o' option that appeared in
> Bacula
> version 13 and would result in saving only the metadata in case the
> content
> of a file hasn't been changed? I don't see the problem with that,
> provided
> that it works as documented.
>
>
> Regards!
>
> --
> Josip Deanovic
Re: [Bacula-users] Strategies for backup of millions of files...
On 2024-09-26 17:01, Dragan Milivojević wrote:
>> For example, while browsing/restoring backup of the specific date, you
>> would get the state of a backed up directory as it was during the
>> execution of a selected backup job.
>> I would recommend the "accurate" option in this case.
>
> Be careful with the accurate option, if you use the recommended pino5
> option it breaks backups.

Hello Dragan,

Could you please elaborate on that?
Are you referring to some bug or to the 'o' option that appeared in Bacula
version 13 and would result in saving only the metadata in case the content
of a file hasn't been changed? I don't see the problem with that, provided
that it works as documented.

Regards!

-- 
Josip Deanovic
Re: [Bacula-users] Strategies for backup of millions of files...
> For example, while browsing/restoring backup of the specific date, you
> would
> get the state of a backed up directory as it was during the execution of
> a
> selected backup job.
> I would recommend the "accurate" option in this case.

Be careful with the accurate option, if you use the recommended pino5 option
it breaks backups.
Re: [Bacula-users] Strategies for backup of millions of files...
On 2024-09-26 13:02, Marco Gaiarin wrote:

Hello Marco!

> Or it is still the 'fileset change' the trigger, so scripts can make all the
> dumbest things, but bacula keep going on incrementals?
> Also, if my script mount on '/some/dirs-202040926' and pass this as a backup
> dir for a full, but tomorrow script mount '/some/dirs-202040927' and calls
> for an incremental on that, i think will make still a new full...

Unless the "Ignore FileSet Change" option is used, Bacula will compare the MD5
checksum of the Include/Exclude contents of the FileSet and decide whether a
Full backup is needed or not.
I don't recommend using this option and the Bacula documentation strongly
recommends against it as well because it bears a risk of having incomplete
backup.

> Seems to me there's no (easy) escape, and i need to mount snapshots on the
> same mountpoint to have working incrementals...

Consistency is always desired. :-)

In your case, you are trying to devise a versioning by creating new
directories named by the date but it might be unnecessary as Bacula already
keeps that information for you.
For example, while browsing/restoring backup of the specific date, you would
get the state of a backed up directory as it was during the execution of a
selected backup job.
I would recommend the "accurate" option in this case.

If you cannot avoid creation of the directories based on the date, you might
help yourself by using bind mount option of the mount(8).

Regards

-- 
Josip Deanovic
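The "compare a checksum of the Include/Exclude contents" check can be pictured with the toy sketch below. It only illustrates the idea and is not Bacula's code: the helper names are made up, and `cksum` (a CRC) is swapped in for MD5 purely so the sketch runs on any POSIX system.

```shell
#!/bin/sh
# Read FileSet Include/Exclude lines on stdin and print a checksum of them.
# cksum is a portable stand-in here; the message above says Bacula uses MD5.
fileset_digest() {
    cksum | awk '{ print $1 }'
}

# Succeeds (exit 0) when the stored and current digests differ, i.e. when
# the FileSet definition changed and a new Full would be called for.
needs_full() {
    [ "$1" != "$2" ]
}

monday=$(printf 'File = /some/dirs-202040926\n' | fileset_digest)
tuesday=$(printf 'File = /some/dirs-202040927\n' | fileset_digest)
if needs_full "$monday" "$tuesday"; then
    echo "FileSet changed: upgrade to Full"
fi
```

With a date embedded in the File line the digest changes every day, which is why a fixed path (or a bind mount to one) keeps the job on Incrementals.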
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Arno Lehmann
  In chel di` si favelave...

> I have not looked up the discussion leading here.

See my previous post: I'm fighting with ZFS. ;-)

> What I usually prefer is a combination of a Run Script on the client to
> create a file list to back up, and a File= entry in the include list
> that reads from file or program, i.e. "\|" or "\<" ones.

OK, effectively I just have to run a script on the client to snapshot the FS
and mount it RO, so it makes sense to do that.

A question, though: normally, if I modify a fileset, Bacula 'resets' itself
and restarts with a Full backup.
If I manage to get the files from a script or a file, what happens?
Or is the 'fileset change' still the trigger, so scripts can do all the
dumbest things, but Bacula keeps going with incrementals?

Also, if my script mounts on '/some/dirs-202040926' and passes this as the
backup dir for a full, but tomorrow the script mounts '/some/dirs-202040927'
and calls for an incremental on that, I think it will still make a new full...

Seems to me there's no (easy) escape, and I need to mount the snapshots on
the same mountpoint to have working incrementals...

Thanks.
Re: [Bacula-users] Strategies for backup of millions of files...
Hi Marco,

On 13.09.2024 at 17:42, Marco Gaiarin wrote:
> Mandi! Marco Gaiarin
>   In chel di` si favelave...
>
>> EG, i'll come back on this on mid-september...
>
> Still working on script.

I have not looked up the discussion leading here.

> For a sake of mental health i want to mount snapshot on different mount
> point... so different file path.
>
> But path are defined in filesets, and i cannot define adifferent fileset for
> every mountpoint...
>
> I know i can use StripPrefix/AddPrefix on restore, but there's some way to
> use a 'StripPrefix' like options on jobs?

The File Set Include Option "Strip Path = " may be useful.

What I usually prefer is a combination of a Run Script on the client to
create a file list to back up, and a File= entry in the include list
that reads from file or program, i.e. "\|" or "\<" ones.

However, you can do really interesting and slightly confusing things, too:

Job {
  Name = "magichomedirs:_home_b:10:27"
  Fileset = "magichomedirs"
  Level = Full
  Type = "Backup"
  Client = "homedirfileserver-fd"
  MaximumConcurrentJobs = 1
  Messages = "Standard"
  Pool = "short-lab"
  Priority = 10
}

FileSet {
  Name = magichomedirs
  Include {
    Options {
      Signature = SHA1
    }
    File = "\\|/opt/bacula/scripts/magicfileset.sh '%n'"
  }
}

# cat /opt/bacula/scripts/magicfileset.sh
#!/bin/bash

# Split the job name "<fileset>:<prefix>:<from>:<to>" into its four fields.
IFS=: read -r j p v b <<< "$1"
# Not a four-field job name: emit nothing.
if [ -z "$b" ] ; then exit 0; fi
# Turn the underscores of the prefix back into slashes (_home_b -> /home/b).
p=${p//_/\/}
# echo "Job $j prefix $p from $v to $b"
w=$(dirname "$p")
n=$(basename "$p")
# List directories named <prefix><number> with from <= number < to.
# below is only one line, wrapped by mailer!
find "$w" -maxdepth 1 -type d -iname "$n"'*' | awk "/.*[0-9]+$/ {p=match(\$0, /[0-9]+\$/); s=0+substr(\$0, p); if (p && (s>=$v) && (s<$b)) print \$0}"

Which works as intended:

# ls /home/
arno  b004  b008  b012  b016  b020  b026  b030  b034  test
b001  b005  b009  b013  b017  b021  b027  b031  b035
b002  b006  b010  b014  b018  b024  b028  b032  b036
b003  b007  b011  b015  b019  b025  b029  b033  b037
# /opt/bacula/scripts/magicfileset.sh magichomedirs:_home_b:10:27
/home/b010
/home/b011
/home/b012
/home/b013
/home/b014
/home/b015
/home/b016
/home/b017
/home/b018
/home/b019
/home/b020
/home/b021
/home/b024
/home/b025
/home/b026
#

Usually, more explicit configuration is better -- you configure once,
usually under no pressure and with time to experiment, but you need this
stuff to have predictable effect when you are under pressure.

Cheers,

Arno

> I hope i was clear...

-- 
Arno Lehmann

IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Marco Gaiarin
  In chel di` si favelave...

> EG, i'll come back on this on mid-september...

Still working on the script.

For the sake of mental health I want to mount the snapshot on a different mount point... so a different file path. But paths are defined in filesets, and I cannot define a different fileset for every mountpoint...

I know I can use StripPrefix/AddPrefix on restore, but is there some way to use a 'StripPrefix'-like option on jobs?

I hope I was clear...
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Dan Langille
  In chel di` si favelave...

>> https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/
> How is that working out?

You don't know?! ;-)

I'm currently on holiday; I'll be back next week, but I need to move and/or do some other tasks.

EG, I'll come back to this in mid-September...
Re: [Bacula-users] Strategies for backup of millions of files...
On Fri, Aug 2, 2024, at 11:03 AM, Marco Gaiarin wrote:
> Mandi! Andrea Venturoli
>   In chel di` si favelave...
>
>> Probably yes, but why bother?
>> Just take multiple snapshots: they have by default a different
>> mountpoint and don't take any space.
>
> Right... never minded about that... i've also found some example:
>
>   https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/

How is that working out?

--
Dan Langille
d...@langille.org
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Andrea Venturoli
  In chel di` si favelave...

> Probably yes, but why bother?
> Just take multiple snapshots: they have by default a different
> mountpoint and don't take any space.

Right... I never thought about that... I've also found an example:

  https://dan.langille.org/2023/12/24/backing-up-freebsd-with-bacula-via-zfs-snapshot/

It needs some tests, but... thanks!
Re: [Bacula-users] Strategies for backup of millions of files...
On 7/26/24 14:31, Marco Gaiarin wrote:

> Yes, FS is ZFS so i can snapshot them.
> But FS/mountpoint is the same for all data, and i've 5 jobs for spliting and optimizing them. There's some way to 'snapshot' FS only one time, with multiple jobs?

Probably yes, but why bother?
Just take multiple snapshots: they have by default a different mountpoint and don't take any space.
At least this is what I've always done.

bye
av.
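The "one snapshot per job" idea could be sketched roughly as follows. This is only an illustration under assumptions: the dataset name tank/home, the "bacula-<job>" naming scheme and the function names are all made up here, and the job name must be usable as a ZFS snapshot name. On ZFS a snapshot is normally reachable under the dataset's .zfs/snapshot/<name> directory, which is what gives every job its own path.

```shell
#!/bin/bash
# Sketch only: DATASET, the "bacula-<job>" naming and the function names are
# assumptions for illustration, not something taken from the thread.
DATASET="${DATASET:-tank/home}"   # hypothetical dataset name
ZFS="${ZFS:-zfs}"                 # set ZFS="echo zfs" for a dry run

# Print the snapshot name used for one Bacula job.
job_snap_name() {
    printf '%s@bacula-%s\n' "$DATASET" "$1"
}

# Create the job's private snapshot (to be called before the job starts).
job_snap_create() {
    $ZFS snapshot "$(job_snap_name "$1")"
}

# Destroy the job's private snapshot (to be called after the job ends).
job_snap_destroy() {
    $ZFS destroy "$(job_snap_name "$1")"
}

# Usage (dry run):
#   ZFS="echo zfs" job_snap_create homes-a-g
```

Because every job owns its snapshot, no job has to know whether it is the first or the last one running.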
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Josip Deanovic via Bacula-users
  In chel di` si favelave...

> Alternatively, Bacula has the noatime option which is not set by
> default.

I've given it a try. It seems to be useful, though not dramatic.

Thanks!
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Bill Arlofski via Bacula-users
  In chel di` si favelave...

> The typical way to help with this type of situation is to create several
> Fileset/Job pairs and then run them all
> concurrently. Each Job would be reading a different set of directories.

In the end we split the data on more 'logical' grounds (after analyzing the data), which led to 5 jobs of 2 to 4 TB each; that seems pretty decent to me.
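The thread does not say how the data was analyzed. One possible way to get numbers for such a split is to count files per top-level directory, as in this sketch (the function name files_per_subdir is hypothetical, and file names containing newlines would be miscounted):

```shell
#!/bin/bash
# Hypothetical helper: print "<file count><TAB><directory>" for every
# immediate subdirectory of the given base directory, largest first.
files_per_subdir() {
    local base=$1 d n
    for d in "$base"/*/; do
        [ -d "$d" ] || continue    # also skips the unexpanded glob
        d=${d%/}
        # -xdev keeps find on one filesystem; the arithmetic strips any
        # padding that some wc implementations add.
        n=$(( $(find "$d" -xdev -type f | wc -l) ))
        printf '%d\t%s\n' "$n" "$d"
    done | sort -rn
}

# Usage:
#   files_per_subdir /home
```

Directories with very different file counts can then be grouped so that each job ends up with a comparable amount of work.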
Re: [Bacula-users] Strategies for backup of millions of files...
Mandi! Dan Langille
  In chel di` si favelave...

> Do your filesystem have any snapshot capabilities? With that many files,
> backing up a snapshot would give you better results with respect to
> consistency.

Cool! I never thought about that!

Yes, the FS is ZFS so I can snapshot it.
But the FS/mountpoint is the same for all the data, and I have 5 jobs for splitting and optimizing it. Is there some way to 'snapshot' the FS only once, with multiple jobs?

If I give the snapshot a unique name (eg, 'Bacula') I can surely create a script that checks whether the snapshot has already been created and skips if so; but how can I destroy the snapshot if and only if I'm the last job?
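If a single shared snapshot is really wanted, one conceivable answer to the "last job destroys it" question is to keep one marker file per running job and destroy the snapshot when the last marker is removed. The following is a rough sketch of that idea only; STATE_DIR and the function names are hypothetical, and a job that dies without releasing leaves a stale marker behind that has to be cleaned up by hand.

```shell
#!/bin/bash
# Sketch only: STATE_DIR and the function names are hypothetical.
STATE_DIR="${STATE_DIR:-/var/run/bacula-snap}"
ZFS="${ZFS:-zfs}"                 # set ZFS="echo zfs" for a dry run

# mkdir is atomic, so it serves as a simple lock between concurrent jobs.
_snap_lock()   { until mkdir "$STATE_DIR/lock" 2>/dev/null; do sleep 1; done; }
_snap_unlock() { rmdir "$STATE_DIR/lock"; }

# shared_snap_acquire <dataset@snapshot> <jobname>
# The first job to arrive creates the snapshot; every job leaves a marker.
shared_snap_acquire() {
    local rc=0
    mkdir -p "$STATE_DIR/jobs" || return 1
    _snap_lock
    if [ -z "$(ls -A "$STATE_DIR/jobs")" ]; then
        $ZFS snapshot "$1" || rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        : > "$STATE_DIR/jobs/$2"
    fi
    _snap_unlock
    return "$rc"
}

# shared_snap_release <dataset@snapshot> <jobname>
# Removes this job's marker; the last job to leave destroys the snapshot.
shared_snap_release() {
    local rc=0
    [ -d "$STATE_DIR/jobs" ] || return 1
    _snap_lock
    rm -f "$STATE_DIR/jobs/$2"
    if [ -z "$(ls -A "$STATE_DIR/jobs")" ]; then
        $ZFS destroy "$1" || rc=1
    fi
    _snap_unlock
    return "$rc"
}
```

This is noticeably more fragile than giving each job its own snapshot, which is worth weighing before adopting it.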
Re: [Bacula-users] Strategies for backup of millions of files...
On 2024-07-15 17:26, Marco Gaiarin wrote:
> We have found that a dir (containing mostly home directories) with roughly one and a half million files, took too much time to be backed up; it is not a problem of backup media, also with spooling it took hours to prepare a spool.
>
> There's some strategy i can accomplish to reduce backup time (bacula side; clearly we have to work also filesystem side...)?
>
> For example, currently i have:
>
>   Options {
>     Signature = MD5
>     accurate = sm
>   }
>
> if i remove signature and check only size, i can gain some performance?

Hello Marco,

Most probably yes.

If your file system supports it, you could mount the file system with the noatime and nodiratime options.

Alternatively, Bacula has the noatime option, which is not set by default. That option would prevent Bacula from updating the inode atime, which would most probably result in some performance gains (although not dramatic ones).

Also, check the keepatime option, which could negatively affect performance if enabled (it is disabled by default).

A long time ago, I had a FreeBSD machine with a file system about 30GB in size, with a low usage percentage but a huge number of indexed directories. Archiving the whole directory structure with tar used to take more than 26 hours.

I solved it by using the dump tool, which does the backup at the block level and thus doesn't suffer from the large directory tree issue.

This approach bears the risk of having inconsistent data in the backup if the file system is mounted while the dump is being performed. This could be solved by utilizing snapshots or some type of file system locking/freeze (depending on the OS and the file system).

Regards

--
Josip Deanovic
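Before changing anything, it can help to check whether a file system is already mounted with noatime. A minimal sketch, assuming a Linux-style mounts table such as /proc/mounts (mount point in field 2, comma-separated options in field 4); the function name has_noatime is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given mount point is listed with the
# noatime option. Assumes a Linux-style mounts table (default /proc/mounts),
# where field 2 is the mount point and field 4 the comma-separated options.
has_noatime() {
    local mnt=$1 table=${2:-/proc/mounts}
    awk -v m="$mnt" '
        $2 == m { opts = $4 }      # remember the last matching entry
        END {
            if (("," opts ",") ~ /,noatime,/) exit 0
            exit 1
        }
    ' "$table"
}

# Usage:
#   has_noatime /home && echo "noatime is set on /home"
```

This only inspects mount options; the Bacula-side noatime FileSet option mentioned above is a separate setting in the Director configuration.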
Re: [Bacula-users] Strategies for backup of millions of files...
On Mon, Jul 15, 2024, at 11:26 AM, Marco Gaiarin wrote:
> For example, currently i have:
>
>   Options {
>     Signature = MD5
>     accurate = sm
>   }
>
> if i remove signature and check only size, i can gain some performance?

You might. You also lose integrity checks: is the file I just restored the same as the one I backed up?

Does your filesystem have any snapshot capabilities? With that many files, backing up a snapshot would give you better results with respect to consistency.

--
Dan Langille
d...@langille.org
Re: [Bacula-users] Strategies for backup of millions of files...
On 7/15/24 9:26 AM, Marco Gaiarin wrote:
> We have found that a dir (containing mostly home directories) with roughly one and a half million files, took too much time to be backed up; it is not a problem of backup media, also with spooling it took hours to prepare a spool.
>
> There's some strategy i can accomplish to reduce backup time (bacula side; clearly we have to work also filesystem side...)?
>
> For example, currently i have:
>
>   Options {
>     Signature = MD5
>     accurate = sm
>   }
>
> if i remove signature and check only size, i can gain some performance?
>
> Thanks.

Hello Marco,

The typical way to help with this type of situation is to create several Fileset/Job pairs and then run them all concurrently. Each Job would be reading a different set of directories.

For example, back up the user home directories that begin with [a-g], [h-m], [n-s], [t-z] in four or more different concurrent jobs.

A couple of FileSet examples that should work as I described:

8<
FileSet {
  Name = Homes_A-G
  Include {
    Options {
      signature = sha1
      compression = zstd
      regexdir = "/home/[a-g]"
    }
    Options {
      exclude = yes
      regexdir = "/home/.*"
    }
    File = /home
  }
}

FileSet {
  Name = Homes_H-M
  Include {
    Options {
      signature = sha1
      compression = zstd
      regexdir = "/home/[h-m]"
    }
    Options {
      exclude = yes
      regexdir = "/home/.*"
    }
    File = /home
  }
}

...and so on...
8<

Hope this helps!
Bill

--
Bill Arlofski
w...@protonmail.com
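Before committing to letter ranges like the ones above, it may be worth previewing which home directories each range would actually pick up. A small sketch (the function name dirs_in_range is hypothetical, and it only approximates the regexdir match by looking at the first character of each directory name):

```shell
#!/bin/bash
# Hypothetical helper: list the immediate subdirectories of a base directory
# whose names start with a character in the given bracket range, e.g. "a-g".
dirs_in_range() {
    local base=$1 range=$2
    # LC_ALL=C keeps [a-g] strictly lowercase ASCII regardless of locale.
    LC_ALL=C find "$base" -mindepth 1 -maxdepth 1 -type d -name "[$range]*" \
        | LC_ALL=C sort
}

# Usage:
#   dirs_in_range /home a-g
#   dirs_in_range /home h-m
```

Comparing the combined output of all ranges with a plain listing of /home shows any directory that no range covers; names starting with an uppercase letter or a digit, for instance, would presumably match none of the four lowercase ranges.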