Using 'o' in the fileset accurate option has caused problems with restore for
another user on the list recently.  I don't know if it could also break
estimate.  Have you used the 'o' option in previous incremental backups?

Accurate backups assume that the full path to each file is the same as before.
Has /path/to/ARCHIVES (the root of the backup) changed since previous backups?
Have you renamed/moved any files or directories within the files being backed
up?

You could try to find more info about some file that is being included
unexpectedly by:

1. Add the listing option to the two estimate commands and record the outputs
somewhere.

2. Find a file in the accurate=yes listing that doesn't appear in the
accurate=no listing.

3. Run this SQL command:

SELECT DISTINCT Job.JobId,StartTime AS JobStartTime,VolumeName, File.lstat
 FROM Job,Path,File,Media,JobMedia,Client
 WHERE File.JobId=Job.JobId
 AND File.FileIndex > 0
 AND Path.Path='%1'
 AND File.FileName='%2'
 AND Client.Name='%3'
 AND Path.PathId=File.PathId
 AND JobMedia.JobId=Job.JobId
 AND JobMedia.MediaId=Media.MediaId
 AND Client.ClientId=Job.ClientId
 ORDER BY Job.StartTime DESC LIMIT 1;

where %1 is the full path to the directory of the file with trailing slash, %2
is the filename of the file and %3 is the Bacula client name (lto8-fd).

4. Run the shell commands:

stat /full/path/to/filename

stat -t /full/path/to/filename
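
For steps 2 and 4, something like the following might save some time. It is only a sketch, and it rests on assumptions of mine that you should check: the two estimate listings have been saved as plain text files with one entry per line, and File.lstat is a space-separated list of integers written with Bacula's own base64 digits, in the field order I remember from encode_stat() in the Bacula source (verify this against your version). The function names are made up for this example.

```sh
#!/bin/sh
# Sketch only. Function names are hypothetical; the lstat field order is an
# assumption taken from memory of encode_stat() in the Bacula source.

B64='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Print the lines that appear in listing $1 (accurate=yes) but not in
# listing $2 (accurate=no). Whole lines are compared.
only_in_accurate() {
    tmpdir=$(mktemp -d) || return 1
    LC_ALL=C sort -u "$1" > "$tmpdir/yes"
    LC_ALL=C sort -u "$2" > "$tmpdir/no"
    LC_ALL=C comm -23 "$tmpdir/yes" "$tmpdir/no"
    rm -rf "$tmpdir"
}

# Decode one base64-digit integer (most significant digit first, optional
# leading '-'), which is how the lstat fields are assumed to be written.
b64_to_int() {
    s=$1 sign=1 val=0
    case $s in -*) sign=-1; s=${s#-} ;; esac
    while [ -n "$s" ]; do
        rest=${s#?}
        c=${s%"$rest"}                     # first character of $s
        prefix=${B64%%"$c"*}               # digits that come before $c
        [ ${#prefix} -lt 64 ] || return 1  # $c is not a base64 digit
        val=$(( val * 64 + ${#prefix} ))
        s=$rest
    done
    echo $(( sign * val ))
}

# Print name=value pairs for the lstat string given as $1.
decode_lstat() {
    names='st_dev st_ino st_mode st_nlink st_uid st_gid st_rdev st_size
           st_blksize st_blocks st_atime st_mtime st_ctime LinkFI flags
           data_stream'
    set -- $1    # intentional word splitting on the spaces between fields
    for name in $names; do
        [ $# -gt 0 ] || break
        printf '%s=%s\n' "$name" "$(b64_to_int "$1")"
        shift
    done
}
```

After sourcing this, only_in_accurate takes the accurate=yes listing and then the accurate=no listing as file names, and decode_lstat takes the quoted lstat value returned by the SQL query. The decoded st_size, st_mtime and st_ctime can then be compared with what stat -t prints for the same file.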

__Martin


>>>>> On Wed, 11 Dec 2024 11:37:19 +0100, Samuel Zaslavsky said:
> 
> Hello everyone !
> 
> Any ideas for solving this problem?
> I'm stuck here...
> 
> Thanks a lot for your help,
> 
> Best,
> 
> Sam
> 
> 
> On Fri, 6 Dec 2024 at 12:38, Samuel Zaslavsky <s...@w4tch.tv> wrote:
> 
> > Hello all,
> >
> > Sorry for this late reply.  (According to Murphy's law, our Bacula server
> > failed, and it took some time to figure out that the SAS card was dead, and
> > get another one ...)
> >
> > I did some tests, and there's something new :
> > In fact, the fileset seems to be OK, and the Incremental "could" resume...
> >
> > If I do an estimate without accurate, I get around 18K files / 3 TB, which
> > looks "OK" for an incremental:
> >
> > *estimate accurate=no level=incremental job=BackupARCHIVES
> > Using Catalog "MyCatalog"
> > Connecting to Client lto8-fd at localhost:9102
> > 2000 OK estimate files=17,949 bytes=3,465,825,854,672
> >
> > But with accurate=yes, I get 530K files / 23 TB, which is (almost? see
> > below) a FULL backup:
> >
> > *estimate accurate=yes level=incremental job=BackupARCHIVES
> > Using Catalog "MyCatalog"
> > Connecting to Client lto8-fd at localhost:9102
> > 2000 OK estimate files=530,779 bytes=23,005,947,679,372
> >
> > So I have tried to estimate my job with accurate=yes and several options
> > for accurate in the fileset:
> >
> > FileSet {
> >   Name = "ARCHIVES"
> >   Ignore FileSet Changes=yes
> >   Include {
> >     Options {
> >                 signature=MD5
> >                 # I have tried many accurate options here: mcso5, mso5, so5, M, m, o5, sm, s, or nothing (line commented)
> >                 accurate=s
> >                 mtimeonly=yes
> >                 #Verify = pin5
> >     }
> >         File = /path/to/ARCHIVES
> >   }
> >
> >   Exclude {
> >         File = "/path/to/ARCHIVES/#recycle"
> >   }
> > }
> >
> > But all my attempts with Job accurate=yes give me almost the same result -
> > but slightly different sometimes ( could be 531,057 files, or 530,719
> > or 530,779 files, depending on the accurate option chosen...)
> >
> > As I understand it, accurate=s in the Fileset definition should tell
> > Bacula that a file with "same path, same size" shouldn't be backed up
> > again...
> > But this isn't the case and my understanding is clearly wrong...
> > And as I also understand it, accurate=no would mean that a restore
> > brings back even deleted and moved files, which is not what I want.
> >
> > So I am stuck: either resume incremental backups without the accurate
> > option, or restart a full backup with the accurate option. Or understand
> > things better... :)
> > Can anyone help me there ?
> >
> > Thanks a lot !
> >
> > Sam
> >
> >
> > On Tue, 12 Nov 2024 at 17:18, Radosław Korzeniewski <
> > rados...@korzeniewski.net> wrote:
> >
> >> Hello,
> >>
> >> On Tue, 12 Nov 2024 at 12:00, Samuel Zaslavsky <s...@w4tch.tv> wrote:
> >>
> >>> So, precisely, how does Bacula "decide" that a fileset is different from
> >>> another one ?
> >>>
> >>
> >> Bacula checks for fileset changes based on the fileset content.
> >>
> >>
> >>> Where is this information stored in the database ?
> >>>
> >>
> >> It is in the fileset table.
> >>
> >>
> >>> How could I trick bacula to believe that all previous jobs have been
> >>> made with the new fileset : This would be OK for me !
> >>> My idea was that it would be using the fileset.md5 field (or createtime
> >>> ?)
> >>>
> >>
> >> From what I know, Bacula checks the md5 column for this purpose. So, if
> >> your new fileset md5 is the same as the one already saved, then Bacula
> >> won't have any trace that it is a different fileset.
> >>
> >>
> >>>
> >>> So, a few related questions : When will bacula add another entry in the
> >>> fileset database table ?
> >>>
> >>
> >> If your current configuration includes a selected fileset and its md5
> >> does not match the one saved in the database then Bacula will create a
> >> new entry.
> >>
> >>
> >>> It seems it's not when doing an estimate...
> >>> So, should I start ( and abort quickly) a dummy job using the "new"
> >>> fileset, to see the "new" fileset in the database ? And then try to trick
> >>> Bacula into thinking the old one is the new one ?
> >>>
> >>
> >> Yes, it sounds good to me.
> >>
> >>
> >>> If this is a pertinent approach, how exactly do that ?
> >>> For example, where in the database is the information that a "file"
> >>> belongs to some files set ?
> >>>
> >>
> >> Database relation.
> >> bacula=# \d file
> >>                              Table "public.file"
> >>   Column   |   Type   | Collation | Nullable |               Default
> >> -----------+----------+-----------+----------+--------------------------------------
> >>  fileid    | bigint   |           | not null | nextval('file_fileid_seq'::regclass)
> >>  jobid     | integer  |           | not null |
> >> ...
> >> bacula=# \d job
> >>                                         Table "public.job"
> >>       Column       |            Type             | Collation | Nullable |              Default
> >> -------------------+-----------------------------+-----------+----------+------------------------------------
> >>  jobid             | integer                     |           | not null | nextval('job_jobid_seq'::regclass)
> >>  filesetid         | integer                     |           |          | 0
> >> ...
> >> bacula=# \d fileset
> >>                                        Table "public.fileset"
> >>    Column   |            Type             | Collation | Nullable |                  Default
> >> ------------+-----------------------------+-----------+----------+--------------------------------------------
> >>  filesetid  | integer                     |           | not null | nextval('fileset_filesetid_seq'::regclass)
> >>
> >>> Is it in the file.lstat column ? If yes, how is it "encoded" here ?
> >>>
> >>
> >> This column is a "direct" serialisation of the struct stat {} buffer
> >> returned by lstat(2). It is (to some extent) OS dependent.
> >>
> >>
> >>>
> >>> Basically, I need to understand how the relationship between fileset
> >>> conf <--> fileset in database <--> files in fileset is handled...
> >>>
> >>>
> >> When Bacula detects a fileset change, because the md5 is different from
> >> the one saved in the database, it creates a new entry. The job then gets
> >> the new filesetid as a reference in its job entry in the database, and
> >> all files saved get a jobid reference.
> >>
> >>
> >>> Or maybe I am completely wrong here ?
> >>>
> >>
> >> You are just asking questions. You can't be wrong asking questions.
> >>
> >>
> >>> The filesystem to be backed up is mounted as NFS share on the bacula
> >>> host.
> >>>
> >>
> >> Any chance you can access this filesystem directly without mounting it on
> >> the remote host? It would be better.
> >>
> >>
> >>> Is it possible that the reboot/remount changed a key parameter for
> >>> Bacula ?
> >>>
> >>
> >> Well. Let's assume your first mount point is: /mnt/data. Then you reboot
> >> and the new mount point is /mnt/data1. Then it makes a huge difference for
> >> Bacula, for sure.
> >>
> >>
> >>> So that Bacula recognizes well the filesystem as the old one,
> >>>
> >>
> >> It doesn't matter. What matters is some of the file's metadata, e.g.
> >> the modification time. You can configure which metadata should be
> >> checked in this case. But you can't force Bacula to skip a file which
> >> did not exist in the previous backup jobs.
> >>
> >>
> >>> but doesn't recognize any file anymore ? ( Remember that fileset name,
> >>> and all paths in the fileset configuration are the same as before ! )
> >>>
> >>
> >> I'm pretty sure your fileset content is different from the one used
> >> before, so Bacula wants to create a full backup.
> >>
> >> best regards
> >> --
> >> Radosław Korzeniewski
> >> rados...@korzeniewski.net
> >>
> >
> 


