Re: [Bacula-users] Fix documentation on deduplication

2024-04-24 Thread Roberto Greiner


On 24/04/2024 04:30, Radosław Korzeniewski wrote:

Hello,

On Tue, 23 Apr 2024 at 13:33, Roberto Greiner wrote:


On 23/04/2024 04:34, Radosław Korzeniewski wrote:

Hello,

On Wed, 17 Apr 2024 at 14:01, Roberto Greiner wrote:


The error is at the end of the page, where it says that you can see how
much space is being used with 'df -h'. The problem is that df can't
actually see the space gained from dedup; it shows how much would be
used without dedup.


This command (df -h) shows how much space is allocated and how much is
free on the filesystem. So when you have a 20:1 dedup ratio and you
wrote 20TB, your df command shows 1TB allocated.


But that is exactly the problem I had: df did NOT show 1TB allocated.
It indicated 20TB allocated (yes, on ZFS).

I have not used ZFS dedup for a long time (I've been a ZFS user since the
first beta in Solaris), so I'm curious: if your zpool is 2TB in size
and you have a 20:1 dedup ratio with 20TB written and 1TB allocated, then
what does df show for you?

Something like this?
Size: 2TB
Used: 20TB
Avail: 1TB
Use%: 2000%

No, the values are quite different. I wrote 20TB just to stay with the
example previously given. My actual numbers are:


df: 2.9TB used
zpool list: 862GB used, 3.4x dedup ratio
Actual partition size: 7.2TB

(Note that 862GB x 3.4 is roughly 2.9TB: df is reporting the logical,
pre-dedup usage, while zpool list reports the physical allocation.)

Roberto




Re: [Bacula-users] Fix documentation on deduplication

2024-04-23 Thread Roberto Greiner


On 23/04/2024 04:34, Radosław Korzeniewski wrote:

Hello,

On Wed, 17 Apr 2024 at 14:01, Roberto Greiner wrote:


The error is at the end of the page, where it says that you can see how
much space is being used with 'df -h'. The problem is that df can't
actually see the space gained from dedup; it shows how much would be
used without dedup.


This command (df -h) shows how much space is allocated and how much is
free on the filesystem. So when you have a 20:1 dedup ratio and you
wrote 20TB, your df command shows 1TB allocated.


But that is exactly the problem I had: df did NOT show 1TB allocated. It
indicated 20TB allocated (yes, on ZFS).



Yes, zpool list shows you the exact dedup ratio achieved without any
additional checking or counting. But this command (as mentioned by
Heitor) works with ZFS only.
Aligned volumes can also be used with external deduplication appliances,
where the zpool command is unavailable. In that case you can quickly
check with the df -h command.


Yes, zpool list showed all the information properly, both the actually
allocated space and the dedup ratio. And as I said, on ZFS, df is not
showing the correct information (in an Ubuntu 22.04 and ZFS environment).


Thank you,

Roberto




[Bacula-users] Fix documentation on deduplication

2024-04-17 Thread Roberto Greiner

Hi,

I've installed a Bacula system using ZFS deduplication on an Ubuntu 22.04
server, and one thing that made me lose a lot of time is an error in the
documentation, more specifically on this page:


https://www.bacula.lat/community/block-level-file-system-deduplication-with-aligned-volumes-tutorial-bacula-9-0-8-and-above/?lang=en

The same page is available in Portuguese, with the same problem, at the
following address:


https://www.bacula.lat/community/dedup-alinhado/

The error is at the end of the page, where it says that you can see how
much space is being used with 'df -h'. The problem is that df can't
actually see the space gained from dedup; it shows how much would be
used without dedup.


After some searching, I found in chapter 1.7 of
'https://bacula.org/whitepapers/DedupVolumes.pdf' that the proper
command for checking dedup usage in ZFS is 'zpool list', and that
command did show that dedup was working properly.


These are my outputs from the two commands:

user@bacula2:~$ df -h
Filesystem Size  Used Avail Use% Mounted on
tmpfs  788M  2,8M  786M   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv  910G   52G  812G   6% /
tmpfs  3,9G 0  3,9G   0% /dev/shm
tmpfs  5,0M 0  5,0M   0% /run/lock
/dev/sda2  2,0G  252M  1,6G  14% /boot
zfs    6,4T  128K  6,4T   1% /zfs
zfs/mnt    9,2T  2,9T  6,4T  31% /zfs/mnt
tmpfs  788M  4,0K  788M   1% /run/user/0
tmpfs  788M  4,0K  788M   1% /run/user/1000
user@bacula2:~$ zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zfs   7.27T   850G  6.44T        -         -     3%    11%  3.41x    ONLINE  -


So, could someone please correct the two pages mentioned above? It would
keep others from running into the same problem.


Thank you,

Roberto





Re: [Bacula-users] Backup in disk AND tape

2024-04-03 Thread Roberto Greiner

On 02/04/2024 19:58, Bill Arlofski via Bacula-users wrote:

On 4/2/24 12:01 PM, Roberto Greiner wrote:

Hi,

I've recently installed Bacula on a server with 7TB of RAID5 storage and
an LTO-6 tape unit.

I have configured 9 remote servers (mostly Linux, one Windows) to be
backed up to this server's disk storage, and I'm now working out how to
do the tape backup. Now I have a question about sending the backup to
both destinations.

I have the following JobDefs setup:

JobDefs {
    Name = "DefaultJob"
    Type = Backup
    Level = Incremental
    Client = bacula2-fd
    FileSet = "Full Set"
    Schedule = "WeeklyCycle"
    Storage = FileAligned
    Messages = Standard
    Pool = File
    SpoolAttributes = yes
    Priority = 10
    Write Bootstrap = "/opt/bacula/working/%c.bsr"
}

Then I added a server to be backed up, let's say (it's a Linux machine,
despite the name):

Job {
    Name = "AD"
    JobDefs = "DefaultJob"
    Client = ad-fd
    FileSet = "etc"
}

This will, obviously, go to the dedup-disk storage. The question is, how
should I add the tape setup? Is there a way to add a couple of lines to
the job definition above so that the backup goes to both systems? Should
I create a separate job definition for the tape backup? Or is there some
other way I didn't consider?

Thanks,

Roberto


PS: The storage definitions for the disk and tape destinations:

Storage {
    Name = FileAligned
    Address = bacula2
    SDPort = 9103
    Password = ""
    Device = Aligned-Disk
    Media Type = File1
}

Storage {
    Name = Fita
    Address = bacula2
    SDPort = 9103
    Password = ""
    Device = Ultrium
    Media Type = LTO
}


Hello Marcos,

With Bacula, there are almost always 10+ different ways to accomplish 
things, and/or to even think about them.


For example, you can override the Pool, Level, and Storage in a 
Schedule...


So, with this in mind, you might set your job to run Incs each weekday 
to disk, and then set the Fulls to run to tape on the weekend. (just 
one idea)
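
For illustration only, here is a minimal sketch of such a Schedule, reusing the
FileAligned and Fita storages from your configuration; the schedule name and the
tape pool "TapePool" are placeholders, not resources from this thread:

Schedule {
  Name = "WeeklyCycleDiskAndTape"   # placeholder name
  # Weekday Incrementals go to the aligned-disk storage and the existing File pool
  Run = Level=Incremental Pool=File Storage=FileAligned mon-fri at 23:05
  # Weekend Full goes to the tape storage and a (hypothetical) tape pool
  Run = Level=Full Pool=TapePool Storage=Fita sat at 23:05
}

The Job would then simply point its Schedule directive at this resource instead
of "WeeklyCycle".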


Another option is to use Copy jobs. With Copy jobs, you can run your
Incs and Fulls to disk, and then run a Copy job to copy your Incs,
Fulls, or both to tape during normal working hours, because Copy jobs
do not make use of any Clients, so business productivity on your
server(s) will not be affected.


In your case, I would probably go with a Copy job. This way, you have
your backups on disk for fast restores when needed, and you have the
same data copied, under new jobids, onto tape - maybe with longer
retention periods, for example.


Also have a look at the `SelectionType = PoolUncopiedJobs` feature for
Copy jobs. This is a nice, handy "shortcut" to make sure that each of
your jobs in some Pool is copied once, and only once, to tape.


In this case, you can have two Copy jobs configured, one looking at
your Full disk pool and one looking at your Inc disk pool, each copying
the jobs that have not been copied yet.
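
As a starting point only, one of those Copy jobs might be sketched like this.
It assumes the disk pool being scanned has a "Next Pool" directive pointing at
a (hypothetical) tape pool whose Storage is Fita; any name not already in your
configuration is a placeholder:

Job {
  Name = "CopyFullsToTape"            # placeholder name
  Type = Copy
  Selection Type = PoolUncopiedJobs   # copy each job in the source pool exactly once
  Pool = File                         # source (disk) pool to scan; its "Next Pool" must point at the tape pool
  Client = bacula2-fd                 # ignored by Copy jobs, kept to satisfy the Job resource
  FileSet = "Full Set"                # ignored by Copy jobs, kept to satisfy the Job resource
  Messages = Standard
  Priority = 15
}

If you split the disk volumes into separate Full and Inc pools as suggested,
the second Copy job would be identical apart from its Pool line.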


OR, you can have one copy job running on a schedule where the Pool is 
overridden at two different times of the day to copy from the Full 
disk pool, and then also from the Inc disk pool.
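
That variant could look roughly like the following, where "File-Full" and
"File-Inc" are placeholder names for the two disk pools:

Schedule {
  Name = "CopyCycle"                    # placeholder name
  Run = Pool=File-Full daily at 10:05   # scan the Full disk pool for uncopied jobs
  Run = Pool=File-Inc  daily at 12:05   # scan the Incremental disk pool a bit later
}

A single Copy job (like the sketch above) would then reference
Schedule = "CopyCycle", with its own Pool line acting only as the default.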


OR... (lol I said 10, so I am working towards that number, and I am 
getting close :) ... You can have your normal backup jobs include a 
`RunScript {RunsWhen = after}` section which triggers an immediate 
copy of the job to tape as soon as it is completed.
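
A rough sketch of that last idea, bolted onto the "AD" job from earlier in the
thread: it assumes a Copy job named "CopyFullsToTape" (as in the sketch above)
already exists, and uses the RunScript Console option to queue it from the
Director once the backup ends:

Job {
  Name = "AD"
  JobDefs = "DefaultJob"
  Client = ad-fd
  FileSet = "etc"
  RunScript {
    RunsWhen = after                         # fire only after this backup has finished
    RunsOnClient = no                        # the command runs on the Director, not on the client
    Console = "run job=CopyFullsToTape yes"  # queue the copy job immediately, auto-confirmed
  }
}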


So, I would start with a look at Copy jobs and see where that goes. :)

Feel free to ask more questions once you have taken a look at Copy jobs.


Hope this helps,
Bill


Yes, this helps A LOT. I will study the copy job option. It really seems 
perfect for my scenario.


Tks!!!!

Roberto




[Bacula-users] Backup in disk AND tape

2024-04-02 Thread Roberto Greiner

Hi,

I've recently installed Bacula on a server with 7TB of RAID5 storage and
an LTO-6 tape unit.


I have configured 9 remote servers (mostly Linux, one Windows) to be
backed up to this server's disk storage, and I'm now working out how to
do the tape backup. Now I have a question about sending the backup to
both destinations.


I have the following JobDefs setup:

JobDefs {
  Name = "DefaultJob"
  Type = Backup
  Level = Incremental
  Client = bacula2-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = FileAligned
  Messages = Standard
  Pool = File
  SpoolAttributes = yes
  Priority = 10
  Write Bootstrap = "/opt/bacula/working/%c.bsr"
}

Then I added a server to be backed up, let's say (it's a Linux machine,
despite the name):


Job {
  Name = "AD"
  JobDefs = "DefaultJob"
  Client = ad-fd
  FileSet = "etc"
}

This will, obviously, go to the dedup-disk storage. The question is, how
should I add the tape setup? Is there a way to add a couple of lines to
the job definition above so that the backup goes to both systems? Should
I create a separate job definition for the tape backup? Or is there some
other way I didn't consider?


Thanks,

Roberto


PS: The storage definitions for the disk and tape destinations:

Storage {
  Name = FileAligned
  Address = bacula2
  SDPort = 9103
  Password = ""
  Device = Aligned-Disk
  Media Type = File1
}

Storage {
  Name = Fita
  Address = bacula2
  SDPort = 9103
  Password = ""
  Device = Ultrium
  Media Type = LTO
}



