Hello,

Mon, 7 Mar 2022 at 12:44 egoitz--- via Bacula-devel <
bacula-devel@lists.sourceforge.net> wrote:

> - I'm trying to create an open source version of the delta encoding
> plugin using the bacula-fd plugin API. While working on it I have seen
> that Bacula's source is aware of deltas and delta file sequencing. I have
> seen, for instance, that there is even a .bvfs command for showing the
> deltas of a file id. What I have not found is Bacula code that does the
> delta file generation itself (patch generation, signatures, etc.).
>
"Delta file generation", whatever form it takes, is solely the
responsibility of the plugin.
Bacula provides only the delta sequencing mechanism.

> I assume that the non-fd part of Bacula acts purely as a delta file
> holder: it keeps the files and stores the patch sequence, nothing more.
>
The delta sequence is stored in the catalog, so during a restore the
Director knows that it has to include all incremental files in sequence
instead of only the latest one.
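
To illustrate the idea of restoring "all incremental files in sequence", here
is a minimal sketch in C. It is not Bacula's code: struct delta_rec, its
fields and build_restore_chain() are hypothetical names invented for this
example, and it assumes the base copy carries sequence number 0 and each
later increment the next integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one catalog row of a delta-backed file.
 * This is not Bacula's schema; the field names are illustrative only. */
struct delta_rec {
    unsigned job_id;     /* job that stored this part */
    unsigned delta_seq;  /* assumed: 0 = base copy, 1..n = increments */
};

/* qsort comparator: ascending delta sequence number. */
static int by_delta_seq(const void *a, const void *b)
{
    const struct delta_rec *x = a;
    const struct delta_rec *y = b;
    return (x->delta_seq > y->delta_seq) - (x->delta_seq < y->delta_seq);
}

/* Sort the records into restore order and return how many leading records
 * form an unbroken chain 0, 1, 2, ... A result smaller than n means a part
 * is missing and the file cannot be rebuilt past that point. */
size_t build_restore_chain(struct delta_rec *recs, size_t n)
{
    size_t i;

    assert(recs != NULL || n == 0);
    if (n == 0) {
        return 0;
    }
    qsort(recs, n, sizeof *recs, by_delta_seq);
    for (i = 0; i < n; i++) {
        if (recs[i].delta_seq != i) {
            break;  /* gap in the sequence */
        }
    }
    return i;
}
```

Given records with sequence numbers 2, 0 and 1, the function reorders them
to 0, 1, 2 and returns 3. If the part with sequence number 1 is missing, it
returns 1, because only the base copy can be restored safely.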

> Bacula keeps records of deltas in database (and file storages) but only fd
> works with them (with probably a library like librsync in the delta plugin)
>
No, delta sequencing (a.k.a. block-level incrementals) is handled by both
the fd and the dir, while the fd/plugin is responsible for all diffing and
patching.

> in the sense of applying patches over an original file or even generating
> deltas when backup. Am I wrong?. Was just for understanding the nice work
> done and what's already written and free in Bacula's source for this
> purpose.
>
Yes, the fd/plugin is responsible for all the "dirty work" and the Bacula
"framework" helps organize it.

> - By the way, I have a question about virtual files. It is not very
> clear to me (perhaps my problem, as I may not understand it) how to work
> with them. I understand the concept, but have not seen a clear example of
> how, for instance, you create a virtual file during backup, how you see it
> in bvfs and, finally, what you get after restoring. On page 36/146 of the
> Bacula 11 for developers PDF, you say "This will create a virtual file."
> but what you actually put in the structure is:
>
> sp->type = FT_REG;
> sp->statp.st_mode = 0700 | S_IFREG;
>
> FT_REG and S_IFREG are both for regular files... what exactly causes a
> virtual file to be created? Perhaps st_size -1?
>
In this sense a "virtual" file is a file which does not exist on a backup
server and is created programmatically through the plugin API. Bacula sees
it as an ordinary file with a slightly different backup stream id, so the
fd knows which tool to use to restore it.
You can fill "struct stat" with whatever acceptable values you want, but
st_size should match the real size of the saved data as closely as possible.
A size of -1 will confuse users.
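
As a rough illustration of filling "struct stat" for such a file, here is a
minimal, self-contained sketch in C. The struct save_pkt_sketch type, the
describe_virtual_file() helper and the FT_REG value defined here are
stand-ins invented so the example compiles on its own. A real plugin would
take FT_REG and the save packet structure from Bacula's headers.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* ask for S_IFREG and friends in strict modes */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Stand-in so the sketch compiles alone; the value is an assumption. */
#define FT_REG 3

/* Hypothetical, cut-down imitation of the packet a plugin fills in. */
struct save_pkt_sketch {
    const char *fname;   /* name under which the file appears in the backup */
    struct stat statp;   /* attributes shown to the user */
    int type;            /* file type, FT_REG for a regular file */
};

/* Describe a file that exists only as data produced by the plugin.
 * data_size should be the real (or best estimated) size of that data. */
void describe_virtual_file(struct save_pkt_sketch *sp, const char *name,
                           off_t data_size, time_t now)
{
    assert(sp != NULL && name != NULL);
    assert(data_size >= 0);              /* never report -1 to users */

    memset(&sp->statp, 0, sizeof sp->statp);
    sp->fname = name;
    sp->type = FT_REG;
    sp->statp.st_mode = 0700 | S_IFREG;  /* same values as in the question */
    sp->statp.st_size = data_size;
    sp->statp.st_nlink = 1;
    sp->statp.st_mtime = now;
    sp->statp.st_atime = now;
    sp->statp.st_ctime = now;
}
```

Nothing in the stat data marks the file as virtual. It is virtual only
because the plugin, not the filesystem, supplies its name and content.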

> Are they relevant for what I'm trying to do?. It seems Bacula handles
> delta sequentiation so... perhaps for this purpose I shouldn't need
> "virtual files"?.
>
No, virtual files are available only in the command plugin API and are
used mainly for backing up applications, e.g. running databases, where a
standard file backup is useless, suboptimal or simply unavailable.

> - I'm planning to implement delta encoding by checking the previous day
> file signature done by librsync. Instead of looking at the filesystem it
> would be nice if I could take a look at that signature in the last backup
> done (yesterday backup).
>
What "signature" are you thinking of?
There is an "accurate catalog query API" in Bacula, but as far as I know it
does not handle checksums (md5, sha, etc.).
You can extend this code if you wish.

> Could it be possible in some manner, that if I see a file passed in
> EventHandleBackupFile() to check if yesterdays signature exists in the
> backup of yesterday, and then read the yesterday signature from the own
> backup?. I mean, instead of having to leave the signature in the being
> backed server's filesystem.
>
Do you want to store your librsync signatures in a catalog database? Did
you count the required size?

> - The last one :) . For restoring, and for the code seen (for instance in
> insert_missing_delta()) I assume Bacula detects we are restoring a delta
> compressed file.
>
Block incremental is far from being compression, IMVHO. BEE Plugins (e.g.
vSphere, Proxmox, XENServer, among others) heavily use delta sequencing for
incremental backups of VM images. This functionality is not compression,
which you can apply separately to the generated backup stream, e.g. LZO,
GZIP or GED.

> Then I assume Bacula restores apart from the own initial file, patches to
> arrive to the day we want to restore to. Am I wrong?.
>
I do not understand your question, so I will describe how it works for an
fd command plugin:
- During backup, the fd command plugin generates a delta sequence for a
file and saves the incremental information. In most cases this is the
offset of a changed block and its new content, but it can be anything else,
e.g. the QEMU Incremental Plugin generates an incremental qcow2 image in
this step.
- During restore, Bacula sends the selected fd command plugin the same
information and data that it generated during backup. The XENServer Plugin
simply gets an offset and raw data to patch the destination image, while
the QEMU Plugin receives an incremental qcow2 image, which is then used to
patch the qcow2 image.
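
The "offset plus new block content" case can be sketched in C as follows.
This is only an illustration of the technique under simple assumptions
(fixed block size, images of equal length). It does not use the Bacula
plugin API, and BLOCK_SIZE, struct delta_block, make_delta() and
apply_delta() are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4   /* tiny on purpose; real blocks are far larger */

/* One changed block: where it lives in the image and its new content. */
struct delta_block {
    size_t offset;
    unsigned char data[BLOCK_SIZE];
};

/* Backup side: compare two equally sized images block by block and record
 * every block that differs. Returns the number of records written to out,
 * which must have room for len / BLOCK_SIZE entries. */
size_t make_delta(const unsigned char *old_img, const unsigned char *new_img,
                  size_t len, struct delta_block *out)
{
    size_t off;
    size_t n = 0;

    assert(len % BLOCK_SIZE == 0);
    for (off = 0; off < len; off += BLOCK_SIZE) {
        if (memcmp(old_img + off, new_img + off, BLOCK_SIZE) != 0) {
            out[n].offset = off;
            memcpy(out[n].data, new_img + off, BLOCK_SIZE);
            n++;
        }
    }
    return n;
}

/* Restore side: write each recorded block back at its offset. Deltas must
 * be applied in the order in which they were generated. */
void apply_delta(unsigned char *img, size_t len,
                 const struct delta_block *delta, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        assert(delta[i].offset + BLOCK_SIZE <= len);
        memcpy(img + delta[i].offset, delta[i].data, BLOCK_SIZE);
    }
}
```

To restore day 2, start from the base image and call apply_delta() with the
day 1 delta and then the day 2 delta. Skipping the day 1 delta would leave
stale blocks in the result, which is why the whole sequence has to be
restored.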

> Perhaps later, in a post-restore job, I could run a shell script that
> tries to find patches still pending for a parent file. I suppose I could
> then apply them and the backup would finally be restored. Is there some
> more elegant way you could advise?
>
Do not run external scripts for that. It will hurt you more than you think.
:)

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net
_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
