Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-18 Thread Bruno Rogério Fernandes



On 18/02/2022 07:33, Adam Goryachev via BackupPC-users wrote:


On 17/2/2022 23:43, Bruno Rogério Fernandes wrote:

Maybe I've got a solution

Instead of modifying backuppc behavior, I'm planning to disable the 
compression setting on the server and create a FUSE filesystem that 
transparently compresses all the images using the jpeg-xl format, 
putting the backuppc pool on top of that.


The only problem I can think of is that every time backuppc has to do 
some reading, the FUSE layer will also need to decompress images on the 
fly. I have to do some testing because my server is not very powerful, 
just a dual-core system.


I was thinking of something sort of similar...

Why not use a FUSE filesystem on the client, which acts as a kind of 
overlay? All directory operations are transparently passed through to 
the native storage location. Reads/writes, however, are filtered by the 
"compression" before being transferred to the server. The bytes saved at 
the end are converted to nulls, which keeps the file length the same as 
the server expects but compresses well with pretty much any 
compression algorithm.


By not modifying the directory information, all the rsync comparisons 
will work unchanged. There is no added load for backuppc, and nothing 
changes for the client when accessing images, since it accesses the real 
location, not the FUSE-mounted version.


Just my thoughts...


This is a good alternative. The downside is that the conversion will be 
done every time backuppc reads the files through rsync, e.g. on full 
backups. But that's a minor problem, as libjxl can convert files 
losslessly almost instantly.


Thank you all









Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-18 Thread Adam Goryachev via BackupPC-users


On 17/2/2022 23:43, Bruno Rogério Fernandes wrote:

Maybe I've got a solution

Instead of modifying backuppc behavior, I'm planning to disable the 
compression setting on the server and create a FUSE filesystem that 
transparently compresses all the images using the jpeg-xl format, 
putting the backuppc pool on top of that.


The only problem I can think of is that every time backuppc has to do 
some reading, the FUSE layer will also need to decompress images on the 
fly. I have to do some testing because my server is not very powerful, 
just a dual-core system.


I was thinking of something sort of similar...

Why not use a FUSE filesystem on the client, which acts as a kind of 
overlay? All directory operations are transparently passed through to 
the native storage location. Reads/writes, however, are filtered by the 
"compression" before being transferred to the server. The bytes saved at 
the end are converted to nulls, which keeps the file length the same as 
the server expects but compresses well with pretty much any 
compression algorithm.


By not modifying the directory information, all the rsync comparisons 
will work unchanged. There is no added load for backuppc, and nothing 
changes for the client when accessing images, since it accesses the real 
location, not the FUSE-mounted version.


Just my thoughts...
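
A minimal sketch of that pad-to-length filter in Python, assuming the
cjxl CLI from libjxl is on the PATH; the function name and paths are
illustrative only, not part of any existing tool:

    import os
    import subprocess

    def compress_and_pad(src, dst):
        """Losslessly recompress a JPEG to JPEG XL, then pad the result
        with NUL bytes so its length matches what the server expects.
        The trailing NUL run compresses to almost nothing in transit."""
        original_size = os.path.getsize(src)
        # cjxl transcodes JPEG input losslessly by default
        subprocess.run(['cjxl', src, dst], check=True)
        saved = original_size - os.path.getsize(dst)
        if saved < 0:
            raise ValueError('JPEG XL output grew; pass the original through')
        with open(dst, 'ab') as f:
            f.write(b'\x00' * saved)

A FUSE read handler would apply this per file; the restore path would
have to strip the padding and decode back to JPEG.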






Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread Robert Trevellyan
I tend to agree. It sounds like an image library archive project, not a
backup project.

Robert Trevellyan


On Thu, Feb 17, 2022 at 10:36 AM Greg Harris wrote:

> Yet another sideline sitter here.  However, here goes with a very
> questionable thought.  Maybe BackupPC is the wrong tool for this particular
> directory/instance?  Perhaps something like Amazon Glacier with Cryptomator
> is a wiser choice in this one scenario?
>
> Thanks,
>
> Greg Harris
>
> On Feb 17, 2022, at 10:27 AM, backu...@kosowsky.org wrote:
>
> G.W. Haywood via BackupPC-users wrote at about 13:24:26 +0000 on Thursday,
> February 17, 2022:
>
> Hi there,
>
> On Thu, 17 Feb 2022, brogeriofernandes wrote:
>
> I'm wondering if it would be possible to run a command just after the
> client transfers file data but before it's stored in the backuppc
> pool. My idea is to do an image compression, like jpeg-xl lossless,
> instead of the standard zlib one.
>
>
> Have you considered using a compressing filesystem on the server?
>
>
> I think that is the best idea as:
> 1. It is transparent to BackupPC
> 2. You benefit from all the optimizations of the underlying file
>   system
> 3. No new coding is needed
> 4. No need to create special compression cases for different file
>   types
> 5. Compression is automagically multi-threaded and cached/backgrounded
>   so as to minimally slow down the program (I never "feel" the
>   overhead of compression on my btrfs/zstd file system).
> 6. It's essentially totally reliable
>
> It's particularly easy for a filesystem like btrfs... where you can
> use 'zstd', which is both fast and compresses well.
>
> I would compare the speed and compression ratio of btrfs with 'zstd'
> against the speed and compression ratio of your raw lossless JPEG XL
> compression.
>
>
> ... more bandwidth-friendly ... compression before transferring to
> server ...
>
>
> The data can be compressed on the client by the transfer tools during
> the transfer.  This can be purely to reduce network load and it can be
> independent of any compression (perhaps by a different method) of the
> data when it is stored by the server.  The compression algorithms for
> transfer and storage can be chosen for different reasons.  Of course
> if it is required to perform multiple compression and/or decompression
> steps for each file, the server will have to handle an increased load.
>
> This can all be more or less transparent to BackupPC.
>
> --
>
> 73,
> Ged.
>
>


Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread Greg Harris
Yet another sideline sitter here.  However, here goes with a very questionable 
thought.  Maybe BackupPC is the wrong tool for this particular 
directory/instance?  Perhaps something like Amazon Glacier with Cryptomator is 
a wiser choice in this one scenario?

Thanks,

Greg Harris

On Feb 17, 2022, at 10:27 AM, backu...@kosowsky.org wrote:

G.W. Haywood via BackupPC-users wrote at about 13:24:26 +0000 on Thursday, 
February 17, 2022:
Hi there,

On Thu, 17 Feb 2022, brogeriofernandes wrote:

I'm wondering if it would be possible to run a command just after the
client transfers file data but before it's stored in the backuppc
pool. My idea is to do an image compression, like jpeg-xl lossless,
instead of the standard zlib one.

Have you considered using a compressing filesystem on the server?

I think that is the best idea as:
1. It is transparent to BackupPC
2. You benefit from all the optimizations of the underlying file
  system
3. No new coding is needed
4. No need to create special compression cases for different file
  types
5. Compression is automagically multi-threaded and cached/backgrounded
  so as to minimally slow down the program (I never "feel" the
  overhead of compression on my btrfs/zstd file system).
6. It's essentially totally reliable

It's particularly easy for a filesystem like btrfs... where you can
use 'zstd', which is both fast and compresses well.

I would compare the speed and compression ratio of btrfs with 'zstd'
against the speed and compression ratio of your raw lossless JPEG XL
compression.

... more bandwidth-friendly ... compression before transferring to
server ...

The data can be compressed on the client by the transfer tools during
the transfer.  This can be purely to reduce network load and it can be
independent of any compression (perhaps by a different method) of the
data when it is stored by the server.  The compression algorithms for
transfer and storage can be chosen for different reasons.  Of course
if it is required to perform multiple compression and/or decompression
steps for each file, the server will have to handle an increased load.

This can all be more or less transparent to BackupPC.

--

73,
Ged.




Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread backuppc
G.W. Haywood via BackupPC-users wrote at about 13:24:26 +0000 on Thursday, 
February 17, 2022:
 > Hi there,
 > 
 > On Thu, 17 Feb 2022, brogeriofernandes wrote:
 > 
 > > I'm wondering if it would be possible to run a command just after
 > > the client transfers file data but before it's stored in the
 > > backuppc pool. My idea is to do an image compression, like jpeg-xl
 > > lossless, instead of the standard zlib one.
 > 
 > Have you considered using a compressing filesystem on the server?

I think that is the best idea as:
1. It is transparent to BackupPC
2. You benefit from all the optimizations of the underlying file
   system
3. No new coding is needed
4. No need to create special compression cases for different file
   types
5. Compression is automagically multi-threaded and cached/backgrounded
   so as to minimally slow down the program (I never "feel" the
   overhead of compression on my btrfs/zstd file system).
6. It's essentially totally reliable

It's particularly easy for a filesystem like btrfs... where you can
use 'zstd', which is both fast and compresses well.

I would compare the speed and compression ratio of btrfs with 'zstd'
against the speed and compression ratio of your raw lossless JPEG XL
compression.
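
One quick way to run that comparison is a small script like this
sketch, assuming the 'zstandard' Python package and the cjxl CLI from
libjxl are installed (neither is required by BackupPC itself):

    import os
    import subprocess
    import sys
    import tempfile
    import time

    import zstandard

    def zstd_ratio(path, level=3):
        # in-memory zstd compression, roughly what btrfs would do
        data = open(path, 'rb').read()
        t0 = time.monotonic()
        out = zstandard.ZstdCompressor(level=level).compress(data)
        return len(out) / len(data), time.monotonic() - t0

    def jxl_ratio(path):
        # lossless JPEG -> JPEG XL transcode via the libjxl CLI
        with tempfile.TemporaryDirectory() as d:
            out = os.path.join(d, 'out.jxl')
            t0 = time.monotonic()
            subprocess.run(['cjxl', path, out], check=True,
                           capture_output=True)
            return (os.path.getsize(out) / os.path.getsize(path),
                    time.monotonic() - t0)

    for path in sys.argv[1:]:
        zr, zt = zstd_ratio(path)
        jr, jt = jxl_ratio(path)
        print(f'{path}: zstd {zr:.0%} in {zt:.2f}s, jxl {jr:.0%} in {jt:.2f}s')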
> 
 > > ... more bandwidth-friendly ... compression before transferring to
 > > server ...
 > 
 > The data can be compressed on the client by the transfer tools during
 > the transfer.  This can be purely to reduce network load and it can be
 > independent of any compression (perhaps by a different method) of the
 > data when it is stored by the server.  The compression algorithms for
 > transfer and storage can be chosen for different reasons.  Of course
 > if it is required to perform multiple compression and/or decompression
 > steps for each file, the server will have to handle an increased load.
 > 
 > This can all be more or less transparent to BackupPC.
 > 
 > -- 
 > 
 > 73,
 > Ged.
 > 
 > 


Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread backuppc
Bruno Rogério Fernandes wrote at about 09:43:13 -0300 on Thursday, February 17, 
2022:
 > Maybe I've got a solution
 > 
 > Instead of modifying backuppc behavior, I'm planning to disable the
 > compression setting on the server and create a FUSE filesystem that
 > transparently compresses all the images using the jpeg-xl format,
 > putting the backuppc pool on top of that.

I would check the speed, as FUSE file systems tend to be
slow... especially since you said your machine is slow.

 > 
 > The only problem I can think of is that every time backuppc has to do
 > some reading, the FUSE layer will also need to decompress images on
 > the fly. I have to do some testing because my server is not very
 > powerful, just a dual-core system.

Decompression only occurs if you need to read the underlying file;
otherwise the md5sum (and/or other file stats like timestamp, size,
owner, perms) is used to check for changes.

Plus, the same decompression occurs if you use the built-in zlib
compression.
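
For intuition, the quick check that spares the read path looks roughly
like this toy sketch (not BackupPC's or rsync's actual code):

    import os

    def needs_transfer(src_path, dst_path):
        # rsync's default quick check: same size and mtime means the file
        # is assumed unchanged, so its data is never read (or decompressed)
        src, dst = os.stat(src_path), os.stat(dst_path)
        return (src.st_size != dst.st_size or
                int(src.st_mtime) != int(dst.st_mtime))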



Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread Paul Fox
"G.W. Haywood via BackupPC-users" wrote:
 > 
 > On Thu, 17 Feb 2022, brogeriofernandes wrote:
 > 
 > > I'm wondering if would be possible to run a command just after
 > > client transfers file data but before it's stored in backuppc
 > > pool. My idea is to do an image compression, like jpeg-xl lossless,
 > > instead of the standard zlib one.
 > 
 > Have you considered using a compressing filesystem on the server?

Just kibitzing from the sidelines:  It seems like image manipulation
tools should be learning how to deal with the compressed jpgs
directly.  Then there would be no reason not to simply compress them
all and leave them that way.

paul
=--
paul fox, p...@foxharp.boston.ma.us (arlington, ma)





Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread G.W. Haywood via BackupPC-users

Hi there,

On Thu, 17 Feb 2022, brogeriofernandes wrote:


I'm wondering if it would be possible to run a command just after the
client transfers file data but before it's stored in the backuppc
pool. My idea is to do an image compression, like jpeg-xl lossless,
instead of the standard zlib one.


Have you considered using a compressing filesystem on the server?


... more bandwidth-friendly ... compression before transferring to
server ...


The data can be compressed on the client by the transfer tools during
the transfer.  This can be purely to reduce network load and it can be
independent of any compression (perhaps by a different method) of the
data when it is stored by the server.  The compression algorithms for
transfer and storage can be chosen for different reasons.  Of course
if it is required to perform multiple compression and/or decompression
steps for each file, the server will have to handle an increased load.

This can all be more or less transparent to BackupPC.

--

73,
Ged.




Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-17 Thread Bruno Rogério Fernandes

Maybe I've got a solution

Instead of modifying backuppc behavior, I'm planning to disable the 
compression setting on the server and create a FUSE filesystem that 
transparently compresses all the images using the jpeg-xl format, 
putting the backuppc pool on top of that.


The only problem I can think of is that every time backuppc has to do 
some reading, the FUSE layer will also need to decompress images on the 
fly. I have to do some testing because my server is not very powerful, 
just a dual-core system.
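
For anyone curious what the read side of such a layer might look like,
here is a minimal read-only sketch using the fusepy package and the
djxl CLI from libjxl; both dependencies and the whole on-disk layout
are assumptions for illustration, and the write/compress path and real
caching are omitted:

    import os
    import subprocess
    import sys
    import tempfile

    from fuse import FUSE, Operations  # pip install fusepy

    class JxlView(Operations):
        """Read-only view of ROOT in which NAME.jxl is presented as NAME,
        decoded back to the original JPEG on first access."""

        def __init__(self, root):
            self.root = root
            self.decoded = {}  # virtual path -> temp file with JPEG bytes

        def _backing(self, path):
            return os.path.join(self.root, path.lstrip('/'))

        def _jpeg(self, path):
            # djxl reconstructs the original JPEG from a lossless .jxl
            if path not in self.decoded:
                tmp = tempfile.NamedTemporaryFile(suffix='.jpg', delete=False)
                tmp.close()
                subprocess.run(['djxl', self._backing(path) + '.jxl',
                                tmp.name], check=True, capture_output=True)
                self.decoded[path] = tmp.name
            return self.decoded[path]

        def readdir(self, path, fh):
            names = ['.', '..']
            for name in os.listdir(self._backing(path)):
                # present stored NAME.jxl under its original name
                names.append(name[:-4] if name.endswith('.jxl') else name)
            return names

        def getattr(self, path, fh=None):
            real = self._backing(path)
            if not os.path.exists(real) and os.path.exists(real + '.jxl'):
                real = self._jpeg(path)  # st_size must match decoded bytes
            st = os.lstat(real)
            return {key: getattr(st, key) for key in
                    ('st_mode', 'st_size', 'st_uid', 'st_gid', 'st_nlink',
                     'st_atime', 'st_mtime', 'st_ctime')}

        def read(self, path, size, offset, fh):
            real = self._backing(path)
            if not os.path.exists(real):
                real = self._jpeg(path)
            with open(real, 'rb') as f:
                f.seek(offset)
                return f.read(size)

    if __name__ == '__main__':
        root, mountpoint = sys.argv[1], sys.argv[2]
        FUSE(JxlView(root), mountpoint, foreground=True, ro=True)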





Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-16 Thread Guillermo Rozas
>
> Unless I am missing something...
>

No, you're not missing anything. Effectively THIS is the hard part:


> Even if you did somehow pre-compress files and pipe them onto the
> client side rsync, you would probably break the ability for rsync to
> tell changed files based on stat'ing the file size...
>

One can probably feed rsync with a pre-processed version of the filesystem,
and that's what will be backed up. But this will inevitably end up
re-compressing everything on every rsync session. There is probably no way
around it if one wants to do it client side, as the OP wanted.

Regards,
Guillermo


Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-16 Thread backuppc
Guillermo Rozas wrote at about 22:37:09 -0300 on Wednesday, February 16, 2022:
 > >
 > > > Certainly, this would be more bandwidth-friendly if it was possible to
 > >  > do this compression before transferring to the server, but I can't figure
 > >  > out how I could accomplish this.
 > > Presumably harder as it would require host-side code to do things
 > > such as running a patched host version of rsync (or other transfer
 > > method executable)
 > >
 > 
 > I think you can achieve this by running a wrapper script on the client and
 > tying its execution to the BackupPC key. The steps would be:
 > 
 > - generate a key to use by the rsync/ssh connection from BackupPC and
 > nothing else (you should do this for security reasons anyway)
 > - on the client's authorized_keys file, use the "command" option to execute
 > a script every time a connection from this key is made. This command will
 > run instead of the rsync command BackupPC wants to run, and the latter will
 > be saved in the SSH_ORIGINAL_COMMAND variable
 > - in this script, pipe in some way your compression process into the rsync
 > command sent originally by BackupPC
 > 
 > This is the method the rrsync [1] script uses to restrict rsync to a
 > certain folder and options (I'm actually using a modified version of it
 > with BackupPC). The hard part is to figure out the third point above...
 >


To do what you are suggesting, you don't need to play with the
authorized_keys file... you can just set $Conf{RsyncClientPath} to
whatever new version of rsync you have written.

But I really don't see how your suggestion solves anything...

Unless you want to re-write the guts of rsync on the client side
(similar to what Craig has done with rsync_bpc on the server side),
I don't see how you can pre-process files on the client side.

Even if you did somehow pre-compress files and pipe them into the
client-side rsync, you would probably break rsync's ability to
detect changed files by stat'ing the file size...

Unless I am missing something...




Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-16 Thread Guillermo Rozas
>
> > Certainly, this would be more bandwidth-friendly if it was possible to
>  > do this compression before transferring to the server, but I can't figure
>  > out how I could accomplish this.
> Presumably harder as it would require host-side code to do things
> such as running a patched host version of rsync (or other transfer
> method executable)
>

I think you can achieve this by running a wrapper script on the client and
tying its execution to the BackupPC key. The steps would be:

- generate a key used by the rsync/ssh connection from BackupPC and
nothing else (you should do this for security reasons anyway)
- in the client's authorized_keys file, use the "command" option to execute
a script every time a connection from this key is made. This command will
run instead of the rsync command BackupPC wants to run, and the latter will
be saved in the SSH_ORIGINAL_COMMAND variable
- in this script, pipe your compression process in some way into the rsync
command sent originally by BackupPC

This is the method the rrsync [1] script uses to restrict rsync to a
certain folder and options (I'm actually using a modified version of it
with BackupPC). The hard part is to figure out the third point above...
(a rough sketch of the first two points follows after the link below)

Regards,
Guillermo

[1] http://www.guyrutenberg.com/2014/01/14/restricting-ssh-access-to-rsync/
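
A minimal sketch of the first two points; the wrapper path and key are
hypothetical, and nothing here is an existing BackupPC or rrsync file:

    #!/usr/bin/env python3
    # Forced-command wrapper in the spirit of rrsync. Tie it to the
    # BackupPC key in the client's ~/.ssh/authorized_keys:
    #   command="/usr/local/bin/bpc-wrapper" ssh-ed25519 AAAA... backuppc@server
    import os
    import shlex
    import sys

    # sshd puts the command BackupPC asked for in SSH_ORIGINAL_COMMAND
    original = os.environ.get('SSH_ORIGINAL_COMMAND', '')
    argv = shlex.split(original)

    # allow only the rsync server invocation BackupPC sends
    if not argv or os.path.basename(argv[0]) != 'rsync' or '--server' not in argv:
        sys.exit('wrapper: only rsync --server is permitted')

    # point three -- hooking the compression step in -- would go here,
    # e.g. by exec'ing a patched rsync instead of the system one
    os.execvp('rsync', argv)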



Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-16 Thread brogeriofernandes
Thanks for answering

On Wed, 2022-02-16 at 13:39 -0500, backu...@kosowsky.org wrote:
> brogeriofernan...@gmail.com wrote at about 15:21:10 -0300 on
> Wednesday, February 16, 2022:
>  > Hi everyone
>  > 
>  > Currently, I'm using urbackup to backup a lot of image files and
>  > thinking of migrating to backuppc.
>  > 
 >  > I'm wondering if it would be possible to run a command just after
 >  > the client transfers file data but before it's stored in the
 >  > backuppc pool. My idea is to do an image compression, like jpeg-xl
 >  > lossless, instead of the standard zlib one.
> 
> That capability does not (yet) exist.
> Currently, BackupPC only supports zlib compression (or no compression),
> and even the compression level is not variable between files on
> a given backup (and frankly, compression is typically set uniformly
> for all hosts and backups for pooling consistency).
> 
> The ability to have a per-file option to specify compression type is
> an interesting feature request -- but would require some significant
> rearchitecting to:
> 
> 1. Create logic after the file transfer to decide what compression to
>    use, potentially based on a number of different possible criteria.
>    e.g., file extension, file size, regexp match on file name, file
>    type...
>    To do it right would add a fair bit of complexity and depending on
>    the test could slow down backups
> 
> 2. Decide on how to associate the compression type with the file.
> 
>    If this is done at the pool file level (as part of a header or
>    magic number), then you may run into problems if, for the same file
>    content, there are conflicting choices of compression (this could
>    happen between hosts or even for the same host if the files have
>    the same content but different names, for example)
> 
>    If this is done at the backup level by adding an entry to the
>    attrib file, then you have a similar problem of conflicting
>    compression directions for the same pooled file content.
> 
> 
> 3. Have a way to pass code (perl interpreted code? perl code stub?
>    command line function call) to do the compression.

In my use-case I wouldn't need to check the file type, as I'm backing up
just JPEG files, but you made a good point and it would be a more
complete solution that way.


> 
 >  > Reading the docs, I couldn't find any mention of specifying a
 >  > command to run before storing data, just commands to run before or
 >  > after a backup is done as a whole (e.g. $Conf{DumpPreUserCmd},
 >  > $Conf{DumpPostUserCmd})
> 
> True
> 
> 
 >  > Maybe one option would be to convert and keep both versions of the
 >  > files, JPEG and JPEG-XL, on the client, but that's not viable for me
 >  > because currently I have about 15TB of images and don't have enough
 >  > room to keep both of them.
> 
> That is the best way.
> 
>  > 
 >  > So, the steps I'm thinking of would be to transfer the files from
 >  > client to server as usual but, just before storing them in the
 >  > pool, run a command (in this case, cjxl) to do an image compression.
> Doing this right is not so simple, as outlined above.
> Though the code is open source and you are encouraged to submit
> patches to do what you want here.

I would love to implement this feature but, honestly, I don't have any
clue where to start. So, as this feature doesn't exist, I'll probably
dig into the sources to figure out how I can do that.

 >  > Certainly, this would be more bandwidth-friendly if it was possible
 >  > to do this compression before transferring to the server, but I
 >  > can't figure out how I could accomplish this.
> Presumably harder, as it would require host-side code to do things
> such as running a patched host version of rsync (or other transfer
> method executable)
>  > 
>  > Any thoughts on this?
>  > 
>  > Many thanks
>  > 
>  > 


Re: [BackupPC-users] Run command per file before storing in the pool

2022-02-16 Thread backuppc
brogeriofernan...@gmail.com wrote at about 15:21:10 -0300 on Wednesday, 
February 16, 2022:
 > Hi everyone
 > 
 > Currently, I'm using urbackup to back up a lot of image files and
 > thinking of migrating to backuppc.
 > 
 > I'm wondering if it would be possible to run a command just after the
 > client transfers file data but before it's stored in the backuppc pool.
 > My idea is to do an image compression, like jpeg-xl lossless, instead
 > of the standard zlib one.

That capability does not (yet) exist.
Currently, BackupPC only supports zlib compression (or no compression),
and even the compression level is not variable between files on
a given backup (and frankly, compression is typically set uniformly
for all hosts and backups for pooling consistency).

The ability to have a per-file option to specify compression type is
an interesting feature request -- but would require some significant
rearchitecting to:

1. Create logic after the file transfer to decide what compression to
   use, potentially based on a number of different possible criteria,
   e.g., file extension, file size, regexp match on file name, file
   type... (see the sketch after this list).
   To do it right would add a fair bit of complexity and, depending on
   the test, could slow down backups

2. Decide on how to associate the compression type with the file.

   If this is done at the pool file level (as part of a header or
   magic number), then you may run into problems if, for the same file
   content, there are conflicting choices of compression (this could
   happen between hosts or even for the same host if the files have
   the same content but different names, for example)

   If this is done at the backup level by adding an entry to the
   attrib file, then you have a similar problem of conflicting
   compression directions for the same pooled file content.

3. Have a way to pass code (perl interpreted code? perl code stub?
   command line function call) to do the compression.
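
As a sketch of the kind of dispatch table point 1 implies (every name
here is hypothetical; nothing like it exists in BackupPC today):

    import os
    import re

    # ordered rules: the first matching predicate wins
    RULES = [
        # JPEGs get the lossless JPEG XL treatment
        (lambda path, size: path.lower().endswith(('.jpg', '.jpeg')),
         'jpeg-xl'),
        # tiny files are not worth compressing at all
        (lambda path, size: size < 4096, 'none'),
        # already-compressed formats are left alone
        (lambda path, size: bool(re.search(r'\.(zip|gz|zst|xz|png)$',
                                           path, re.I)), 'none'),
    ]

    def choose_compression(path, default='zlib'):
        size = os.path.getsize(path)
        for predicate, compressor in RULES:
            if predicate(path, size):
                return compressor
        return default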


 > Reading the docs, I couldn't find any mention of specifying a command
 > to run before storing data, just commands to run before or after a
 > backup is done as a whole (e.g. $Conf{DumpPreUserCmd},
 > $Conf{DumpPostUserCmd})

True


 > Maybe one option would be to convert and keep both versions of the files,
 > JPEG and JPEG-XL, on the client, but that's not viable for me because
 > currently I have about 15TB of images and don't have enough room to keep
 > both of them.

That is the best way.

 > 
 > So, the steps I'm thinking of would be to transfer the files from
 > client to server as usual but, just before storing them in the pool,
 > run a command (in this case, cjxl) to do an image compression.
Doing this right is not so simple, as outlined above.
Though the code is open source and you are encouraged to submit
patches to do what you want here.

 > Certainly, this would be more bandwidth-friendly if it was possible to
 > do this compression before transferring to the server, but I can't figure
 > out how I could accomplish this.
Presumably harder, as it would require host-side code to do things
such as running a patched host version of rsync (or other transfer
method executable)
 > 
 > Any thoughts on this?
 > 
 > Many thanks


___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/