Hi Sebastian,

Sorry for this late response...
So I guess I managed to solve my problem. It is probably obvious to any Bacula expert, but here is what I did:

- I listed the files of the last good job before the "redundant jobs" occurred, and of all the following jobs.
- I tried to identify which jobs I could delete (because they had redundant files...) and which ones to keep. I ended up keeping the last good job and the penultimate job, which was the last one with no errors and which had every file the other jobs had (well, I hope so). All other jobs starting from the last good one were to be deleted.
- I listed all the volumes used by the jobs to delete (and not used by the jobs to keep).
- I deleted all the jobs to be deleted and purged all the volumes associated with these jobs.
- I also had to use the update command to set these volumes to recycle=yes.

I finally got back 7 full LTO-8 tapes. I hope it's OK now.

Regarding my configuration, you pointed me to some improvements.

So, in the end, I just want to thank you a lot for the time and information you gave me!

Take care!

Samuel
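PS: in case it helps anyone searching the archives later, the bconsole commands I used were roughly the following (the job IDs and the volume name below are placeholders, not my real ones):

  # Compare the file lists of the last good job and the later, redundant jobs
  list files jobid=1234
  # See which volumes a given job wrote to
  list jobmedia jobid=1301
  # Remove the redundant jobs from the catalog
  delete jobid=1301,1302,1303
  # For a volume used only by the deleted jobs (and by none of the jobs to keep):
  # purge its catalog records, then allow Bacula to reuse the tape
  purge volume=Vol0042
  update volume=Vol0042 recycle=yes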
On Fri, 8 Oct 2021 at 22:32, <neumei...@mail.de> wrote:

> Hello Samuel,
> I also wrote my answers under the text blocks and summarized them at the end, to make it a little clearer, because the email is getting a bit lengthy. If you want to answer this, you can leave everything out except the summary. That should make the email much shorter again and easier for other people to read through. Thank you.
>
> > On Thu, 30 Sept 2021 at 02:23, <neumei...@mail.de> wrote:
> >
> > > > I mean that my goal is to save any new file, once and permanently.
> > > > If Monday I have file1 on my NAS, I want it to be saved on tape.
> > > > Tuesday I add file2: I want it to be saved on tape.
> > > > Wednesday, file1 is deleted from the NAS: it's a mistake, and I still want to keep file1 forever on tape (and be able to restore it).
> > > > Every file that has existed once on my NAS must be saved permanently on a tape.
> > >
> > > Okay, I understand it like this:
> > > You have done one full backup at the beginning. After that, you are doing incremental backups every night to save every new file. If a tape is full it gets packed away as an archive and never gets rewritten? Right?
> > > Your primary goal is to save your current data and archive "deleted" files forever?
> >
> > Yes! Exactly.
>
> Okay!
>
> > > I don't use tapes, but I think if you do incremental backups and you want to restore something, you need to insert a huge part of the tapes because Bacula needs to read them. (I'm not sure about that.)
> > > If Bacula has to do this, you will have a huge problem if you want to restore a file in, let's say, 10 years.
> >
> > Not really. I can do a restore job, searching by filename. If the tape is not in the library, Bacula asks me to put it in... I've tested this procedure a few times, it works.
>
> Okay, I trust you on this.
> And being honest, I really don't like the idea of doing incremental backups endlessly without differential and full backups in between (I wrote more about that later).
>
> > > > Let me show you my (simplified) configuration:
> > > >
> > > > I mounted (nfs) my first NAS on, say, /mnt/NAS1/
> > > > My file set is:
> > > > FileSet {
> > > >   Name = "NAS1"
> > > >   File = /mnt/NAS1
> > > > }
> > > >
> > > > My job is:
> > > > Job {
> > > >   Name = "BackupNAS1"
> > > >   JobDefs = "DefaultJob"
> > > >   Level = Incremental
> > > >   FileSet = "NAS1"
> > > >   #Accurate = yes # Not clear what I should do here. Setting it to yes seemed to add many unwanted files - probably moved/renamed files?
> > > >   Pool = BACKUP1
> > > >   Storage = ScalarI3-BACKUP1 # this is my tape library
> > > >   Schedule = NAS1Daily # run every day
> > > > }
> > > >
> > > > with
> > > > JobDefs {
> > > >   Name = "DefaultJob"
> > > >   Type = Backup
> > > >   Level = Incremental
> > > >   Client = lto8-fd
> > > >   FileSet = "Test File Set"
> > > >   Messages = Standard
> > > >   SpoolAttributes = yes
> > > >   Priority = 10
> > > >   Write Bootstrap = "/var/lib/bacula/%c.bsr"
> > > > }
> > > >
> > > > My pool is:
> > > > Pool {
> > > >   Name = BACKUP1
> > > >   Pool Type = Backup
> > > >   Recycle = no
> > > >   AutoPrune = no
> > > >   Volume Retention = 100 years
> > > >   Job Retention = 100 years
> > > >   Maximum Volume Bytes = 0
> > > >   Maximum Volumes = 1000
> > > >   Storage = ScalarI3-BACKUP1
> > > >   Next Pool = BACKUP1
> > > > }
> > >
> > > To your .conf:
> > > - Under JobDefs "DefaultJob" you declare FileSet = "Test File Set", and in your Job you declare FileSet = "NAS1". If that's your standard fileset, set it like this, or try to omit it in the JobDefs. It is a little bit confusing.
> >
> > OK
>
> > > - You use the "Next Pool" resource in your Pool. The documentation states it belongs under Schedule > Run > Next Pool. Either way, it describes a migration job. I think that's not what you want to do?
> >
> > I had tried a "virtual backup", so that all my incremental jobs merge into one, periodically. I thought it was only virtual, only dealing with the catalog data, but it seems I can do that only by recreating a whole bunch of volumes.
> > I have hundreds of terabytes of data and I don't want to do that! So I let the incremental jobs run. My current problem aside, it's convenient for what I need...
>
> Okay, I noted that you did "virtual backups". As far as I know, a "virtual full backup" is something where Bacula reads the incremental and differential backups and the last full backup and constructs a new full backup out of them, without sending all of the data over the network. See: https://www.baculasystems.com/incremental-backup-software/ This site states: "[...] 'Virtual Full' in Bacula terminology). With this technique Bacula's software calculates a new full backup from all differential and incremental backups that followed the initial full backup, without the requirement of another full data transfer over the network." I also took note that you have a lot of data to manage.
>
> > > If I were in your place, I would do it differently (assuming I got your primary goal right, that you want to keep your current data safe and archive "deleted" files forever):
> > > - First I would set NFS user permissions; if NFS or Samba doesn't do the trick, I would head straight to Nextcloud (also open source, with a pricing plan for companies).
> > > Why? -> You can set permissions so that your users can't delete their files and are forced to move them into an archive folder with a good naming convention when they want to get rid of them (maybe you can automate it so that files go into an archive folder when your users hit the delete button, instead of into the bin). Should they make mistakes, it's up to you to figure out the right file (might not be that clean).
> > > -> Having a good naming convention and some sort of documentation makes it a million times easier to find the right file in the future.
> >
> > We have all this settled, one way or another.
> > But I still need to give some full rights to some users, and the problem is more: what if the NAS burns, or what if 3 HDDs crash at the same time, etc.?
> > I want a robust and simple backup solution in case of a rare event...
>
> > > I think you have two major goals: 1. keeping the productive data (the data your users are currently using) safe, and 2. archiving old files your users don't need anymore.
> >
> > No. Every file could be used at any time. Any file is available.
>
> > > To achieve the first goal, I would go ahead and implement a backup strategy with three pools (one incremental, one differential, one full pool) and rotating tapes (rewriting them after a given time).
> >
> > I have hundreds of terabytes. This would mean doubling or more the space I need...
>
> Okay, I clearly wouldn't suggest that then, but that's up to you and your decision.
>
> > > Should one of your NASs fail on you, you will be able to restore the files by yourself fast, keeping offline time short and therefore blackout costs small.
> > >
> > > To achieve the second goal, I would go ahead and search for companies that are specialized in securing data for a long time (I've read about them in a book). The first idea I had was using a tape like a "special hard drive": collect the files your users don't need anymore somewhere, write them once to a tape and label it by hand. If something happens to this tape, the data will be gone. I don't like this idea; I wouldn't do that. Probably the best idea would be to call a data-securing company which does that job for you. Either way, I wouldn't keep the productive-data tapes and the archive tapes at the same spot (that would be another pro for the data-securing company), because it violates the 3-2-1 backup rule (everything will be gone when disaster strikes (flood, fire, hurricane...)). If you don't know about the 3-2-1 backup rule, please look it up on the internet (this rule discusses good backups in more detail).
> >
> > My idea was: when one volume is full, store it in another place... So it was OK, I guess.
> >
> > I'm not sure I fully understand here: you say "since the volume-use duration is set to short". But I believe it's exactly the contrary here: my volume-use duration is set to 100 years, isn't it?
>
> Yes, it is exactly the contrary. I'm not sure, but that shouldn't be a problem. If you want to write to it indefinitely you can specify it as 0 (the default), as specified in the documentation (chapter: Configuring the Director).
>
> > > > > In the bacula-dir.conf, in the Director's "Messages" resource, there is an option called "append".
> > > > > A part of my bacula-dir.conf:
> > > > >   # WARNING! the following will create a file that you must cycle from
> > > > >   # time to time as it will grow indefinitely. However, it will
> > > > >   # also keep all your messages if they scroll off the console.
> > > > >   append = "/var/log/bacula/bacula.log" = all, !skipped
> > > > >   console = all, !skipped
> > > > >
> > > > > At the end, "all, !skipped" are the types or classes of messages which go into it. They are described in more detail in the "Messages Resource" chapter:
> > > > > https://www.bacula.org/11.0.x-manuals/en/main/Messages_Resource.html
> > > > >
> > > > > If I type the "messages" command in bconsole, the output is, in my case, the same in both cases.
> > > >
> > > > This is regarding logs, right? Doesn't seem to apply to me here.
> > > > I'm dealing with big video files being unnecessarily saved 10, 15 or 20 times on tapes... Or maybe I missed something here?
>
> > > In your last email, you asked: "Specifically, how do you go about identifying exactly which volumes / jobids are to be 'deactivated', and how do you do that?"
> > > You know the day when everything came to a halt. Knowing this, you can look through your logs to see which jobs ran on that day. For every job there is a longer listing with one tag named "Volume name(s)".
> > > Under this tag, the volumes that got used in that job are listed.
> > > Sorry for not making it clearer.
> >
> > I understand very clearly. But this is going to take quite a while to check, because I also have to see which jobs have "new" files. I was hoping there would be a way to "deduplicate" files in jobs, and jobs in an incremental backup...
> > Well, it seems I have to do this by hand?
>
> I don't know a faster way, so yes, doing it by hand is probably the only way. You can try to write a script, but that also gets very tedious, and if there is a mistake in the script you probably get an even bigger problem. I wouldn't do that.
>
> > > Maybe there is someone who has experience with a similar backup job or with such data-securing companies and can help you better.
> >
> > Anyway, I think my case shows a kind of misconception (or misconfiguration?): if an incremental job is delayed for some reason, why should it back up the same file many times!? How to avoid that?
>
> Yes, I can help you with this.
> I would suggest that we first go through your bacula-dir.conf and search for mistakes, to set the system up as you intended in the beginning, even though I clearly don't recommend it. But I understand the problem you are in, and I'm the one in the position of easy talking.
>
> Summary / things I want to point out or have already mentioned:
> - "Under JobDefs 'DefaultJob' you declare FileSet = "Test File Set", and in your Job you declare FileSet = "NAS1". If that's your standard fileset, set it like this, or try to omit it in the JobDefs. It is a little bit confusing."
> - "You use the 'Next Pool' resource in your Pool. The documentation states it belongs under Schedule > Run > Next Pool. Either way, it describes a migration job. I think that's not what you want to do?"
> - "[...] volume-use duration is set to 100 years [...]" "If you want to write to it indefinitely you can specify it as 0 (the default), as specified in the documentation (chapter: Configuring the Director)."
> - I slightly changed the FileSet to make it fit how it's done in the documentation.
>
> FileSet {
>   Name = "NAS1"
>   Include {
>     Options {
>       signature = SHA1
>     }
>     File = "/mnt/NAS1"
>   }
>   # Exclude {
>   #   File =
>   # }
> }
>
> My job is:
> Job {
>   Name = "BackupNAS1"
>   JobDefs = "DefaultJob"
>   Level = Incremental
>   FileSet = "NAS1"
>   #Accurate = yes # Not clear what I should do here. Setting it to yes seemed to add many unwanted files - probably moved/renamed files?
>   Pool = BACKUP1
>   Storage = ScalarI3-BACKUP1 # "this is my tape library"
>   Schedule = NAS1Daily # "run every day"
> }
>
> JobDefs {
>   Name = "DefaultJob"
>   Type = Backup
>   Level = Incremental
>   Client = lto8-fd
>   # FileSet = "Test File Set"   # Try leaving it out
>   Messages = Standard
>   SpoolAttributes = yes
>   Priority = 10
>   Write Bootstrap = "/var/lib/bacula/%c.bsr"
> }
>
> Pool {
>   Name = BACKUP1
>   Pool Type = Backup
>   Recycle = no
>   AutoPrune = no
>   Volume Retention = 100 years # set to 0 to disable
>   Job Retention = 100 years
>   Maximum Volume Bytes = 0
>   Maximum Volumes = 1000 # might get you in trouble, set it to 0 to permit any number of volumes
>   Storage = ScalarI3-BACKUP1
>   # Next Pool = BACKUP1   # doesn't belong here
> }
>
> I would also like to have a look at your Schedule resource, if that's possible.
>
> Is it possible that you accidentally added the fileset multiple times and you are doing multiple backups of the same files?
> The documentation states: "Take special care not to include a directory twice or Bacula will backup the same files two times wasting a lot of space on your archive device. Including a directory twice is very easy to do. For example:"
>
> Include {
>   Options { compression=GZIP }
>   File = /
>   File = /usr
> }
>
> I hope that helps.
>
> Sebastian
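A follow-up note on the "Accurate = yes" question in the quoted thread, for anyone finding this in the archives: as far as I understand the FileSet documentation, the Options resource has an "accurate" directive that controls which file attributes make Bacula consider a file changed. Something along these lines (only a sketch, not tested on our setup):

  Include {
    Options {
      signature = SHA1
      # compare modification time and size only; as I read the docs, the
      # default also compares ctime, which changes on chmod/chown
      accurate = ms
    }
    File = "/mnt/NAS1"
  }

As far as I can tell, that would avoid re-backups caused by pure permission or ownership changes, but not by renames or moves: a renamed file has a new path, so accurate mode will still back it up as a new file.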
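And one last note on the "virtual full" approach discussed above: as far as I understand the documentation Sebastian linked, such a consolidation job would be set up roughly like the sketch below. The names are made up, and it needs a second pool with enough free volumes to hold the consolidated full, which is exactly the extra tape space I can't afford with hundreds of terabytes:

  Pool {
    Name = BACKUP1-Consolidated   # hypothetical second pool that receives the consolidated full
    Pool Type = Backup
    Storage = ScalarI3-BACKUP1
  }

  Schedule {
    Name = NAS1Consolidate        # hypothetical schedule name
    # Next Pool given at the Schedule > Run level, as the documentation pointer above suggests
    Run = Level=VirtualFull Pool=BACKUP1 NextPool=BACKUP1-Consolidated 1st sun at 23:05
  }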