Re: [CentOS] C8 and backup solution

2020-04-03 Thread John Pierce
On Fri, Apr 3, 2020 at 6:58 AM Chris Adams  wrote:

> It isn't just databases - there are other things for which backing up
> individual files one at a time is not so good.  The best way to handle
> that is to freeze/snapshot the whole filesystem, and then back up the
> snapshot.  This can be scripted pretty easily if the filesystem is on
> LVM.
>

I tried this with a fairly busy and rather large database server (OK, it
was a database server running a simulation of a production workload) on
LVM (with either ext4 or xfs), and found LVM snapshots to be completely
unworkable under busy 8K-block random-write workloads.

Now, ZFS does snapshots very nicely.



-- 
-john r pierce
  recycling used bits in santa cruz
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] C8 and backup solution

2020-04-03 Thread David G. Miller

On 4/3/20 6:13 AM, miguel medalha wrote:

I have been using rsnapshot for years, with great success.

https://rsnapshot.org/


Since no one else has mentioned it as a solution, I've been using amanda 
for years.  About the only change has been replacing my physical tape 
drive with mhVTL (Anybody need some unused DDS-3 tapes? Free to good home.).


Cheers,
Dave

--
"They that can give up essential liberty to obtain a little temporary safety deserve 
neither safety nor liberty."

-- Benjamin Franklin



Re: [CentOS] C8 and backup solution

2020-04-03 Thread Simon Matter via CentOS
> Once upon a time, Valeri Galtsev  said:
>> On 4/3/20 8:34 AM, John Pierce wrote:
>> >Do note, backup systems that use rsync or similar file-by-file copies of a
>> >running system do not make coherent atomic snapshots, so things like
>> >relational databases should be excluded from those, and backed up with
>> >database tools.
>>
>> Long ago I learned to back up databases by dumping them (with a flag
>> "lock" or similar to make sure no changes are made during the dump), and
>> backing up the dump file.
>
> It isn't just databases - there are other things for which backing up
> individual files one at a time is not so good.  The best way to handle
> that is to freeze/snapshot the whole filesystem, and then back up the
> snapshot.  This can be scripted pretty easily if the filesystem is on
> LVM.
>
> Even better is to freeze _all_ filesystems simultaneously - this is
> usually easiest if the system is a virtual machine and/or the storage is
> on a SAN with snapshot capabilities.

Then again, to get a _really_ consistent backup, you have to either terminate
all applications that read or write files, or instruct the applications to go
into a state that allows consistent file backups. Of course, after the
backup you have to instruct the applications to resume their work. You
cannot achieve this consistency even by freezing _all_ filesystems
simultaneously. That's why RDBMSs and other more complex applications
usually provide their own built-in backup mechanism.

That's why rsnapshot backups are not so much worse than filesystem snapshots.

Regards,
Simon



Re: [CentOS] C8 and backup solution

2020-04-03 Thread Chris Adams
Once upon a time, Valeri Galtsev  said:
> On 4/3/20 8:34 AM, John Pierce wrote:
> >Do note, backup systems that use rsync or similar file-by-file copies of a
> >running system do not make coherent atomic snapshots, so things like
> >relational databases should be excluded from those, and backed up with
> >database tools.
>
> Long ago I learned to back up databases by dumping them (with a flag
> "lock" or similar to make sure no changes are made during the dump), and
> backing up the dump file.

It isn't just databases - there are other things for which backing up
individual files one at a time is not so good.  The best way to handle
that is to freeze/snapshot the whole filesystem, and then back up the
snapshot.  This can be scripted pretty easily if the filesystem is on
LVM.
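
A minimal sketch of how the LVM approach might be scripted, written in Python purely for illustration. The volume group, logical volume, snapshot size, mount point and destination are hypothetical placeholders, not values from this thread. The function only builds the command sequence (create snapshot, mount it read-only, copy it, unmount, remove the snapshot); running the commands and cleaning up after a failed step are left out.

```python
import shlex


def lvm_snapshot_backup_commands(vg, lv, dest, snap_size="5G",
                                 mountpoint="/mnt/backup-snap", fstype="ext4"):
    """Return, in order, the commands for one snapshot-based backup run."""
    snap = f"{lv}-backup-snap"  # hypothetical snapshot name
    # XFS will not mount a snapshot carrying the same UUID as the
    # still-mounted origin filesystem unless "nouuid" is given.
    mount_opts = "ro,nouuid" if fstype == "xfs" else "ro"
    return [
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{lv}"],
        ["mkdir", "-p", mountpoint],
        ["mount", "-o", mount_opts, f"/dev/{vg}/{snap}", mountpoint],
        ["rsync", "-a", "--delete", f"{mountpoint}/", dest],
        ["umount", mountpoint],
        ["lvremove", "--yes", f"/dev/{vg}/{snap}"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them (they need root).
    for cmd in lvm_snapshot_backup_commands("vg0", "data", "bkuphost:/backup/data/"):
        print(shlex.join(cmd))
```

The snapshot size only has to hold the blocks that change while the backup runs; on a write-heavy volume it can fill up quickly, and snapshot overhead under heavy random writes has been reported as a problem elsewhere in this thread.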

Even better is to freeze _all_ filesystems simultaneously - this is
usually easiest if the system is a virtual machine and/or the storage is
on a SAN with snapshot capabilities.
-- 
Chris Adams 


Re: [CentOS] C8 and backup solution

2020-04-03 Thread Valeri Galtsev




On 4/3/20 8:34 AM, John Pierce wrote:

Do note, backup systems that use rsync or similar file-by-file copies of a
running system do not make coherent atomic snapshots, so things like
relational databases should be excluded from those, and backed up with
database tools.


Long ago I learned to back up databases by dumping them (with a flag 
"lock" or similar to make sure no changes are made during the dump), and 
backing up the dump file.
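
As an illustration of the dump-then-back-up idea, here is a small Python sketch. It assumes MySQL/MariaDB and its mysqldump tool, which the message above does not actually name; the output path is up to the caller. The two flags shown are the usual ways to get a consistent dump: --single-transaction (consistent view for transactional tables without blocking writers) or --lock-all-tables (global read lock for the duration of the dump).

```python
import subprocess


def mysqldump_command(use_transaction=True):
    """Build a mysqldump command that produces a consistent dump."""
    cmd = ["mysqldump", "--all-databases"]
    if use_transaction:
        # Consistent snapshot for transactional (e.g. InnoDB) tables.
        cmd.append("--single-transaction")
    else:
        # Global read lock: nothing can change while the dump runs.
        cmd.append("--lock-all-tables")
    return cmd


def dump_to_file(path, cmd):
    """Run the dump and write its output to path; raise if the dump fails."""
    with open(path, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

The resulting dump file is an ordinary file, so it can then be picked up by rsync, rsnapshot or any other file-level backup tool, while the live database directory is excluded.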


Just my 2 cents.

Valeri





--

Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247



Re: [CentOS] C8 and backup solution

2020-04-03 Thread John Pierce
Do note, backup systems that use rsync or similar file-by-file copies of a
running system do not make coherent atomic snapshots, so things like
relational databases should be excluded from those, and backed up with
database tools.


Re: [CentOS] C8 and backup solution

2020-04-03 Thread miguel medalha
I have been using rsnapshot for years, with great success.

https://rsnapshot.org/




Re: [CentOS] C8 and backup solution

2020-04-03 Thread Alessandro Baggi

On 02/04/20 21:14, Karl Vogel wrote:

[Replying privately because my messages aren't making it to the list]


In a previous message, Alessandro Baggi said:

A> Bacula works without any problem, well tested, solid but complex to
A> configure. Tested on a single server (with volumes on disk) and a
A> full backup of 810gb (~15 files) took 6,30 hours (too much).

For a full backup, I'd use something like "scp -rp". Anything else
has overhead you don't need for the first copy.

Also, pick a good cipher (-c) for the ssh/scp commands -- it can improve
your speed by an order of magnitude. Here's an example where I copy
my current directory to /tmp/bkup on my backup server:

Running on: Linux x86_64
Thu Apr 2 14:48:45 2020

me% scp -rp -c aes128-...@openssh.com -i $HOME/.ssh/bkuphost_ecdsa \
. bkuphost:/tmp/bkup

Authenticated to remote-host ([remote-ip]:22).
ansible-intro 100% 16KB 11.3MB/s 00:00 ETA
nextgov.xml 100% 27KB 21.9MB/s 00:00 ETA
building-VM-images 100% 1087 1.6MB/s 00:00 ETA
sort-array-of-hashes 100% 1660 2.5MB/s 00:00 ETA
...
ex1 100% 910 1.9MB/s 00:00 ETA
sitemap.m4 100% 1241 2.3MB/s 00:00 ETA
contents 100% 3585 5.5MB/s 00:00 ETA
ini2site 100% 489 926.1KB/s 00:00 ETA
mkcontents 100% 1485 2.2MB/s 00:00 ETA

Transferred: sent 6465548, received 11724 bytes, in 0.4 seconds
Bytes per second: sent 18002613.2, received 32644.2

Thu Apr 02 14:48:54 2020

A> scripted rsync. Simple, through ssh protocol and private key. No agent
A> required on target. I use file level deduplication using hardlinks.

I avoid block-level deduplication as a general rule -- ZFS memory
use goes through the roof if you turn that on.

rsync can do the hardlinks, but for me it's been much faster to create
a list of SHA1 hashes and use a perl script to link the duplicates.
I can send you the script if you're interested.

This way, you're not relying on the network for anything other than the
copies; everything else takes place on the local or backup system.
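
The perl script mentioned above is not included in the thread. As a rough idea of what hash-based linking can look like, here is a Python sketch (not the author's script): it hashes every regular file under a directory with SHA1 and replaces later duplicates with hardlinks to the first copy seen. It assumes everything lives on one filesystem, and that it is acceptable for linked files to share owner, mode and timestamps.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_duplicates(root):
    """Hardlink duplicate regular files under root; return how many were linked."""
    first_seen = {}  # digest -> path of the first file with that content
    linked = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            keeper = first_seen.setdefault(sha1_of(path), path)
            if keeper == path or os.path.samefile(keeper, path):
                continue  # first copy, or already linked
            tmp = path + ".dedup-tmp"  # hypothetical temporary name
            os.link(keeper, tmp)       # create the new link first...
            os.replace(tmp, path)      # ...then swap it in atomically
            linked += 1
    return linked
```

Calling link_duplicates("/backup/host1") on a hypothetical backup directory would return the number of files replaced by links. A more careful version would compare file contents byte for byte before linking rather than trusting the hash alone.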

A> Using a scripted rsync is the simplest way, but there may be something I
A> have left out (or an undiscovered error). Simple to restore.

I've never had a problem with rsync, and I've used it to back up Linux
workstations with ~600Gb or so. One caveat -- if you give it a really
big directory tree, it can get lost in the weeds. You might want to do
something like this:

1. Make your original backup using scp.

2. Get a complete list of file hashes on your production systems
using SHA1 or whatever you like.

3. Whenever you do a backup, get a (smaller) list of modified files
using something like "find ./something -newer /some/timestamp/file"
or just making a new list of file hashes and comparing that to the
original list.

4. Pass the list of modified files to rsync using the "--files-from"
option so it doesn't have to walk the entire tree again.
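
Steps 3 and 4 could be sketched as follows, in Python for illustration. The function mimics "find -newer" by comparing each file's modification time with that of a timestamp file, and the second function builds an rsync command that uses the real --files-from option. The directory names, list file and destination are hypothetical.

```python
import os


def modified_since(root, stamp_file):
    """Relative paths of regular files under root newer than stamp_file."""
    cutoff = os.stat(stamp_file).st_mtime
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.lstat(path).st_mtime > cutoff:
                changed.append(os.path.relpath(path, root))
    return sorted(changed)


def rsync_command(root, list_file, dest):
    """Build an rsync call that copies only the files named in list_file."""
    # Paths in the --files-from list are taken relative to the source dir.
    return ["rsync", "-a", f"--files-from={list_file}", root, dest]
```

One would write the result of modified_since() to the list file, one path per line, run the rsync command, and then touch the timestamp file for the next run. Note that this approach only sees new and changed files; files deleted on the source are not removed from the backup.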

Good luck!

--
Karl Vogel / voge...@pobox.com / I don't speak for the USAF or my company

The best setup is having a wife and a mistress. Each of them will assume
you're with the other, leaving you free to get some work done.
--programmer with serious work-life balance issues


Hi Karl,

thank you for your answer. I'm trying ssh-scripted rsync with a faster 
cipher, as you suggested, and the transfer of 10 GB does seem faster than 
with the default cipher (129 sec with the default vs. 116 sec using 
aes128-gcm; I tested this multiple times). Now I will test on the entire 
dataset and see how much benefit I gain.


In the meantime, what do you think about bacula as a backup solution?

Thank you in advance.




Re: [CentOS] C8 and backup solution

2020-04-02 Thread Alessandro Baggi
Yes, you are right. I meant that I don't need a real agent that has to be
fully configured, as with bacula.

On Thu, 2 Apr 2020, 19:52 Jonathan Billings wrote:

> On Thu, Apr 02, 2020 at 05:32:35PM +0200, Alessandro Baggi wrote:
> > scripted rsync. Simple, through ssh protocol and private key. No agent
> > required on target.
>
> Just a point of clarification -- you need an rsync binary on both
> sides of the ssh session, so 'rsync' would be the agent needed on the
> target.
>
> --
> Jonathan Billings 


Re: [CentOS] C8 and backup solution

2020-04-02 Thread Jonathan Billings
On Thu, Apr 02, 2020 at 05:32:35PM +0200, Alessandro Baggi wrote:
> scripted rsync. Simple, through ssh protocol and private key. No agent
> required on target.

Just a point of clarification -- you need an rsync binary on both
sides of the ssh session, so 'rsync' would be the agent needed on the
target. 

-- 
Jonathan Billings 


Re: [CentOS] C8 and backup solution

2020-04-02 Thread Alessandro Baggi

On 02/04/20 17:49, Nicolas Kovacs wrote:

On 02/04/2020 at 17:32, Alessandro Baggi wrote:

I don't have much experience with backups, and choosing the wrong tool could be
dangerous, so I need some suggestions.

What backup solution do you suggest for my scenario?

I'm using Rsnapshot on all my CentOS 7 servers. It's a very elegant solution
that follows the KISS principle.

I've written a little blog article about it. It's in French, but the Unix bits
are universal. :o)

https://blog.microlinux.fr/rsnapshot-centos-7/

It's basically scripted rsync with ssh on steroids. Been using it for the last
five years or so. It just works.

Cheers,

Niki


Hi Niki,
thank you for your answer.

I remember you from when I used Slackware, and I think your KISS approach 
comes from there, but sometimes more complex things are needed. Rsnapshot 
is a great tool, but I need a catalog, job info, pre/post job scripts on 
the remote target, mail notifications, compression, and possibly off-site 
storage (not on a public cloud).





Re: [CentOS] C8 and backup solution

2020-04-02 Thread Nicolas Kovacs
On 02/04/2020 at 17:32, Alessandro Baggi wrote:
> I don't have much experience with backups, and choosing the wrong tool could be
> dangerous, so I need some suggestions.
> 
> What backup solution do you suggest for my scenario?

I'm using Rsnapshot on all my CentOS 7 servers. It's a very elegant solution
that follows the KISS principle.

I've written a little blog article about it. It's in French, but the Unix bits
are universal. :o)

https://blog.microlinux.fr/rsnapshot-centos-7/

It's basically scripted rsync with ssh on steroids. Been using it for the last
five years or so. It just works.

Cheers,

Niki

-- 
Microlinux - Solutions informatiques durables
7, place de l'église - 30730 Montpezat
Site : https://www.microlinux.fr
Mail : i...@microlinux.fr
Tél. : 04 66 63 10 32
Mob. : 06 51 80 12 12


[CentOS] C8 and backup solution

2020-04-02 Thread Alessandro Baggi

Hi list,

I'm looking for a solid backup system to back up 3 servers (one local 
and 2 remote) and 2 Linux workstations (this number could grow in the 
future). Currently I'm testing bacula, scripted rsync with hardlinks, 
and borgbackup on C8.1.


Bacula works without any problem, well tested, solid but complex to 
configure. Tested on a single server (with volumes on disk) and a full 
backup of 810gb (~15 files) took 6,30 hours (too much). I would like to 
use deduplication, but 1. bacula on C8 is not compiled with "aligned" 
support, and 2. bacula warns about a possible bad scenario with 
deduplication, where losing one block could break many files that share 
that same block. So is deduplication safe, or should I stay away from it?


scripted rsync. Simple, through ssh protocol and private key. No agent 
required on target. I use file level deduplication using hardlinks. For 
compression and block deduplication I could use a filesystem like zfs 
(not available from epel) or something like stratis (I haven't checked 
whether it offers deduplication at the moment). Encryption could be 
performed at the filesystem level. Using a scripted rsync is the simplest 
way, but there may be something I have left out (or an undiscovered 
error). Simple to restore.
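
For readers unfamiliar with hardlink-based file-level deduplication in rsync, a common way to get it is the --link-dest option: files unchanged since the previous snapshot become hardlinks into it instead of new copies. The Python sketch below only builds such a command for a pull from the backup server; the host name, paths and date-based directory naming are hypothetical, and this is not necessarily the script used by the poster.

```python
import datetime
import os


def snapshot_rsync_command(source, backup_root, previous=None, today=None):
    """Build an rsync command that creates a dated, hardlinked snapshot."""
    today = today or datetime.date.today()
    target = os.path.join(backup_root, today.isoformat())
    cmd = ["rsync", "-a", "--delete", "-e", "ssh"]
    if previous:
        # Unchanged files are hardlinked to the previous snapshot.
        # backup_root should be absolute: a relative --link-dest is
        # interpreted relative to the destination directory.
        cmd.append(f"--link-dest={os.path.join(backup_root, previous)}")
    cmd += [source, target + "/"]
    return cmd
```

Each dated directory then looks like a full backup and can be restored with a plain copy, while only changed files consume new space.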


BorgBackup is another solution, similar to rsync in some ways. It works 
over ssh like rsync, but while rsync simply sends data, borg performs 
deduplication, compression and encryption in place. Why is it similar to 
rsync? Because I need another script to run borg. It is not a complete 
solution, and it uses a push model rather than a pull model as with 
bacula or rsync. That is a problem if I want to manage all my servers 
from a central backup server: to accomplish this, I would have to run the 
borg command from the central server on the target server, pointing back 
to the remote repository (located on the central server). There is 
another option with borg, using sshfs, but it is 4 times slower (far too 
slow).


I don't have much experience with backups, and choosing the wrong tool 
could be dangerous, so I need some suggestions.


What backup solution do you suggest for my scenario?

Thank you in advance.

Alessandro.
