Hi Sebastian,

Sorry for this late response...

So I guess I managed to solve my problems.
Probably obvious to any Bacula expert, but here's what I did:

I listed the files of the last good job before the "redundant" jobs
started, and of all the following jobs.
I tried to identify which jobs I could delete (because they contained only
redundant files) and which ones to keep.

I ended up keeping the last good job and the penultimate job, which was the
last one with no errors and which contained every file the other jobs had
(well, I hope so).
All the other jobs after the last good one were to be deleted.
I listed all the volumes used by the jobs to be deleted (and not used by
the jobs to keep).
I deleted all those jobs and purged all the volumes associated with them.
I also had to use the update command to set these volumes to Recycle=yes.
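
For reference, this is roughly the bconsole sequence I used (the job IDs
and volume names below are only placeholders, not my real ones):

    # compare the contents of the candidate jobs
    list files jobid=1234
    # see which volumes a given job wrote to
    list jobmedia jobid=1234
    # remove the redundant jobs from the catalog
    delete jobid=1235,1236,1237
    # purge the volumes that were used only by those jobs
    purge volume=VOL-0042
    # allow the purged volumes to be reused
    update volume=VOL-0042 recycle=yes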

I finally got back 7 full LTO-8 tapes.
I hope it's OK now.

Regarding my configuration, you pointed out some improvements I should make.

So, in the end, I just want to thank you a lot for the time and information
you gave me!

Take care!

Samuel

On Fri, 8 Oct 2021 at 22:32, <neumei...@mail.de> wrote:

> Hello Samuel,
> I also wrote my answers under the text blocks and summarized them at the
> end, to make it a little clearer, because the email is a little lengthy.
> If you want to answer this you can leave everything out except the
> summarization. That would make the email much shorter again and easier
> for other people to read through. Thank you.
>
> > On Thu, 30 Sept 2021 at 02:23, <neumei...@mail.de> wrote:
> > > >
> > > > I mean that my goal is to save any new file, once and permanently.
> > > > If on Monday I have file1 on my NAS, I want it to be saved on tape.
> > > > On Tuesday I add file2: I want it to be saved on tape.
> > > > On Wednesday, file1 is deleted from the NAS: it's a mistake, and I
> > > > still want to keep file1 forever on tape (and be able to restore it).
> > > > Every file that has ever existed on my NAS must be saved permanently
> > > > on a tape.
> > > >
> > > Okay, I understand it like this:
> > > You have done one full backup at the beginning. After that, you are
> > > doing incremental backups every night to save every new file. If a
> > > tape is full it gets packed away as an archive and never gets
> > > rewritten? Right?
> > > Your primary goal is to save your current data and archive "deleted"
> > > files forever?
> >
> > Yes! Exactly.
>
> Okay!
>
>
> > > I don't use tapes, but I think if you do incremental backups and you
> > > want to restore something, you need to insert a large part of the
> > > tapes because Bacula needs to read them (I'm not sure about that).
> > > If Bacula does this, you will have a huge problem if you want to
> > > restore a file in, let's say, 10 years.
> >
> > Not really. I can do a restore job, searching by filename. If the tape
> > is not in the library, Bacula asks me to put it in... I've tested this
> > procedure a few times; it works.
>
> Okay, I trust you in this.
>
> > > And to be honest, I really don't like the idea of doing incremental
> > > backups endlessly without differential and full backups in between
> > > (I wrote more about that below).
> > >
> > > > Let me show you my (simplified) configuration:
> > > >
> > > > I mounted ( nfs ) my first NAS on, say, /mnt/NAS1/
> > > > My file set is :  FileSet {
> > > > Name = "NAS1"
> > > > File = /mnt/NAS1
> > > > }
> > > >
> > > > My job is  Job {
> > > > Name = "BackupNAS1"
> > > > JobDefs = "DefaultJob"
> > > > Level = Incremental
> > > > FileSet="NAS1"
> > > > #Accurate = yes # Not clear what I should do here. Setting it to yes
> > > > # seemed to add many unwanted files - probably moved/renamed files?
> > > > Pool = BACKUP1
> > > > Storage = ScalarI3-BACKUP1 # this is my tape library
> > > > Schedule = NAS1Daily #run every day
> > > >
> > > > }
> > > >
> > > > with
> > > > JobDefs {
> > > > Name = "DefaultJob"
> > > > Type = Backup
> > > > Level = Incremental
> > > > Client = lto8-fd
> > > > FileSet = "Test File Set"
> > > > Messages = Standard
> > > > SpoolAttributes = yes
> > > > Priority = 10
> > > > Write Bootstrap = "/var/lib/bacula/%c.bsr"
> > > > }
> > > >
> > > > My pool is :
> > > > Pool {
> > > > Name = BACKUP1
> > > > Pool Type = Backup
> > > > Recycle = no
> > > > AutoPrune = no
> > > > Volume Retention = 100 years
> > > > Job Retention = 100 years
> > > > Maximum Volume Bytes = 0
> > > > Maximum Volumes = 1000
> > > > Storage = ScalarI3-BACKUP1
> > > > Next Pool = BACKUP1
> > > > }
> > >
> > > Regarding your .conf:
> > > - Under JobDefs "DefaultJob" you declare FileSet = "Test File Set",
> > > while the Job itself declares FileSet = "NAS1". If "NAS1" is your
> > > standard fileset, set it in the JobDefs, or try to omit the FileSet
> > > from the JobDefs. It is a little bit confusing.
> > OK
> > > -you use the "Next Pool"-Ressource in your Pool. Documentation states:
> it belongs under Schedule>Run>Next Pool. Either way it describes a
> migrating job. I think that's not what you want to do?
> > I had tried a "virtual backup", so that all my incremental jobs merge
> into one, periodically. I thought it was only virtual, only dealing with
> the catalog data, but it seems I can do that only by recreating a whole
> bunch of volumes.
> > I have hundreds of TeraOctets of datas and I don't want to do that ! So
> I let the incremental jobs running. Let aside my current problem, it's
> convenient for what I need...
>
> Okay, I noted that you did "virtual backups". As far as I know is a
> "virtual full-backup" something where bacula reads incremental- and
> differential- backups and the last full-backup and constructs a new full
> backup out of them without sending all of the data over the network. See:
> https://www.baculasystems.com/incremental-backup-software/  This Site
> states: "[...]Virtual Full” in Bacula terminology). With this technique
> Bacula's software calculates a new full backup from all differential and
> incremental backups that followed the initial full backup, without the
> requirement of another full data transfer over the network."  I also took
> note that you have a lot of data to manage.
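>
> Just to illustrate what the documentation describes (a rough sketch, not
> a recommendation for your setup; the pool and schedule names are made
> up): a periodic Virtual Full is typically triggered from a Run line in a
> Schedule, with the consolidated job written to the pool given by Next
> Pool:
>
>   Schedule {
>     Name = "NAS1Monthly-VF"
>     # consolidate the last Full plus the Incrementals into a new Full
>     Run = Level=VirtualFull Pool=BACKUP1 NextPool=Consolidated 1st sun at 03:00
>   }
>
> As you wrote, this means writing a whole new set of full volumes, so with
> hundreds of terabytes I understand why you dropped the idea.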
>
> > > If I were in your place, I would do it differently (assuming I got
> > > your primary goal right: you want to save your current data and
> > > archive "deleted" files forever):
> > > - First I would set NFS user permissions; if NFS or Samba doesn't do
> > > the trick I would head straight to Nextcloud (also open source, with a
> > > pricing plan for companies).
> > > Why? -> You can set permissions so that your users can't delete their
> > > files and are forced to move them into an archive folder with a good
> > > naming convention when they want to get rid of them (maybe you can
> > > automate it so the files go into an archive folder when your users hit
> > > the delete button instead of going to the bin). Should they make
> > > mistakes, it's up to you to figure out the right file (might not be
> > > that clean).
> > > -> Having a good naming convention and some sort of documentation
> > > makes it a million times easier to find the right file in the future.
> >
> > We have all this settled, in one way or another. But I still need to
> > give full rights to some users, and the problem is rather: what if the
> > NAS burns, or what if 3 HDDs crash at the same time, etc.?
> > I want a robust and simple backup solution in case of such rare events...
> >
> > > I think you have two major goals: 1. keeping the productive data (the
> > > data your users are currently using) safe, and 2. archiving old files
> > > your users don't need anymore.
> > No. Every file could be used at any time; every file must stay available.
> > > To achieve the first goal I would go ahead and implement a backup
> > > strategy with three pools (one incremental, one differential and one
> > > full pool) and rotating tapes (rewriting them after a given time).
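>
> > > A rough sketch of what I mean, close to the example in the Bacula
> > > documentation (the pool names and retention periods below are only
> > > placeholders):
> > >
> > >   Pool {
> > >     Name = Full-Pool
> > >     Pool Type = Backup
> > >     Recycle = yes          # reuse tapes once their retention expires
> > >     AutoPrune = yes
> > >     Volume Retention = 6 months
> > >   }
> > >   # Diff-Pool and Inc-Pool would look the same, just with shorter
> > >   # retention periods.
> > >
> > >   Schedule {
> > >     Name = "WeeklyCycle"
> > >     Run = Level=Full Pool=Full-Pool 1st sun at 23:05
> > >     Run = Level=Differential Pool=Diff-Pool 2nd-5th sun at 23:05
> > >     Run = Level=Incremental Pool=Inc-Pool mon-sat at 23:05
> > >   }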
> >
> > I have hundreds of terabytes. This would mean doubling (or more) the
> > space I need...
>
> Okay. I clearly wouldn't suggest that, but it's up to you and your
> decision.
>
>
> > > Should one of your NASs fail on you, you would be able to restore the
> > > files yourself quickly, keeping the downtime short and therefore the
> > > outage costs small.
> > >
> > > To achieve the second goal I would search for companies that are
> > > specialized in securing data for a long time (I've read about them in
> > > a book). The first idea I had was using a tape like a "special hard
> > > drive": collect the files your users don't need anymore somewhere,
> > > write them once to a tape and label it by hand. If something happens
> > > to this tape, the data will be gone. I don't like this idea; I
> > > wouldn't do that. Probably the best idea would be to call a
> > > data-securing company which does that job for you. Either way I
> > > wouldn't keep the productive-data tapes and the archive tapes at the
> > > same spot (that would be another pro for the data-securing company),
> > > because it violates the 3-2-1 backup rule (everything will be gone
> > > when disaster strikes: flood, fire, hurricane...). If you don't know
> > > about the 3-2-1 backup rule, please look it up on the internet (this
> > > rule discusses good backups in more detail).
> >
> > My idea was: when one volume is full, store it somewhere else... So
> > that part was OK, I guess.
> > > > I'm not sure I fully understand here: you say "since the volume-use
> > > > duration is set to short". But I believe it's exactly the contrary
> > > > here: my volume-use duration is set to 100 years!? Isn't it?
> > > Yes, it is exactly the contrary. I'm not sure, but that shouldn't be a
> > > problem. If you want to write to a volume indefinitely you can specify
> > > the duration as 0 (the default), as described in the documentation
> > > (chapter: Configuring the Director).
> > >
> > > > > In bacula-dir.conf, in the resource type "Messages" used by the
> > > > > Director, there is an option called "append".
> > > > > A part of my bacula-dir.conf:
> > > > > # WARNING! the following will create a file that you must cycle
> from
> > > > > # time to time as it will grow indefinitely. However, it will
> > > > > # also keep all your messages if they scroll off the console.
> > > > > append = "/var/log/bacula/bacula.log" = all, !skipped
> > > > > console = all, !skipped
> > > > >
> > > > > At the end "all, !skipped" are the types or classes of messages
> which go into it. They are described in more detail in the "Messages
> Resource"-Chapter:
> > > > >
> https://www.bacula.org/11.0.x-manuals/en/main/Messages_Resource.html
> > > > >
> > > > > If I type the "messages"-command in the bconsole the output is in
> my case in both cases the same.
> > > > >
> > > >
> > > > This is regarding logs, right? It doesn't seem to apply to me here.
> > > > I'm dealing with big video files being unnecessarily saved 10, 15 or
> > > > 20 times on tapes...
> > > > Or maybe I missed something here?
> > > In your last email, you asked: "Specifically, how do you go about
> > > identifying exactly which volumes / jobids are to be "deactivated",
> > > and how do you do that?"
> > > You know the day when everything came to a halt. Knowing this, you can
> > > look through your logs to see which jobs ran on that day. For every
> > > job there is a longer listing with one field named "Volume name(s)".
> > > Under this field the volumes that were used in that job are listed.
> > > Sorry for not making it clearer.
> >
> > I understand very clearly. But this is going to take quite a long time
> > to check, because I also have to see which jobs got "new" files. I was
> > hoping there would be a way to "deduplicate" files in jobs, and jobs in
> > an incremental backup...
> > Well, it seems I have to do this by hand?
>
> I don't know a faster way, so yes, doing it by hand is probably the only
> way. You could try to write a script, but that also gets very tedious,
> and if there is a mistake in the script you will probably end up with an
> even bigger problem. I wouldn't do that.
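>
> If it helps, the manual checking can at least be done from bconsole (the
> jobid below is only a placeholder):
>
>   # which volumes did this job write to?
>   list jobmedia jobid=1234
>   # which files did it back up?
>   list files jobid=1234
>
> That still means going through the jobs one by one, though.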
>
> > > Maybe there is someone who has experience with a similar backup job
> > > or with such data-securing companies and can help you better.
> >
> > Anyway, I think my case shows a kind of misconception (or
> > misconfiguration?): if an incremental job is delayed for some reason,
> > why should it back up the same file many times!? How can I avoid that?
>
> Yes, I can help you with this.
>
> I would suggest that we first go through your bacula-dir.conf and search
> for mistakes, to set up the system as you intended in the beginning, even
> though I clearly don't recommend it. But I understand the problem you are
> in, and I'm the one in the easy position of just talking.
> Summarization / things I want to point out or already mentioned:
> - "Under JobDefs "DefaultJob" you declare FileSet = "Test File Set", while
> the Job itself declares FileSet = "NAS1". If "NAS1" is your standard
> fileset, set it in the JobDefs, or try to omit the FileSet from the
> JobDefs. It is a little bit confusing."
> - "You use the "Next Pool" resource in your Pool. The documentation states
> that it belongs under Schedule > Run > Next Pool. Either way, it describes
> a migration job. I think that's not what you want to do?"
> - "[...]volume-use duration is set to 100 years[...]" "If you want to
> write to a volume indefinitely you can specify the duration as 0 (the
> default), as described in the documentation (chapter: Configuring the
> Director)."
> - I slightly changed the FileSet to match how it's done in the
> documentation.
>
> FileSet {
>   Name = "NAS1"
>   Include {
>     Options {
>       signature = SHA1
>     }
>     File = "/mnt/NAS1"
>   }
> #  Exclude {
> #    File =
> #  }
> }
>
> Job {
> Name = "BackupNAS1"
> JobDefs = "DefaultJob"
> Level = Incremental
> FileSet="NAS1"
> #Accurate = yes # Not clear what I should do here. Setting it to yes
> # seemed to add many unwanted files - probably moved/renamed files?
> Pool = BACKUP1
> Storage = ScalarI3-BACKUP1 # "this is my tape library"
> Schedule = NAS1Daily # "run every day"
> }
>
> JobDefs {
> Name = "DefaultJob"
> Type = Backup
> Level = Incremental
> Client = lto8-fd
> # FileSet = "Test File Set" Try leaving it out
> Messages = Standard
> SpoolAttributes = yes
> Priority = 10
> Write Bootstrap = "/var/lib/bacula/%c.bsr"
> }
>
> Pool {
> Name = BACKUP1
> Pool Type = Backup
> Recycle = no
> AutoPrune = no
> Volume Retention = 100 years # set to 0 to disable
> Job Retention = 100 years
> Maximum Volume Bytes = 0
> Maximum Volumes = 1000 # might get you in trouble; set it to 0 to permit
> # any number of volumes
> Storage = ScalarI3-BACKUP1
> # Next Pool = BACKUP1 doesn't belong here
> }
>
>
> I would also like to have a look at your Schedule resource, if that's
> possible.
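>
> For comparison, a daily incremental schedule would usually look roughly
> like this (just a guess at what your NAS1Daily might contain, since I
> haven't seen it):
>
>   Schedule {
>     Name = "NAS1Daily"
>     # run an incremental backup into BACKUP1 every night
>     Run = Level=Incremental Pool=BACKUP1 sun-sat at 23:05
>   }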
>
> Is it possible that you accidentally added the fileset multiple times and
> you are doing multiple backups of the same files?
> The documentation states: "Take special care not to include a directory
> twice or Bacula will backup the same files two times wasting a lot of
> space on your archive device. Including a directory twice is very easy to
> do. For example:"
>
>   Include {
>     Options {compression=GZIP }
>     File = /
>     File = /usr
>   }
>
>
>
> I hope that helps.
>
> Sebastian
>
>
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
