Aravinda,
I was testing glusterfind and wondering if you could provide some
feedback.
My system is RHEL 7.1 and I am using Gluster 3.7.5. My setup for testing
is a single brick with the parameters shown below...
I was testing glusterfind by copying over my source code (~140,000 files)
and then running 'glusterfind pre'. The result of the test is that
"glusterfind pre" took over an hour to process these 140,000 files and
sat at 100% CPU utilization for the duration of the run.
Is this expected, and is this the rate at which "glusterfind pre" is
expected to process files?
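For reference, the pre run was of roughly this form (the session name
"backup" is the one in the path further below; the output file name is
just illustrative):

  glusterfind pre backup gfs /tmp/gfs-changes.txt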
The reason I am asking is that my production Gluster system sees
approximately 2 million file changes per day. At this pace,
glusterfind cannot process the changes fast enough to keep up.
I also went back and tested file deletion by removing this directory.
Looking at the
/usr/var/lib/misc/glusterfsd/glusterfind/backup/gfs/tmp_output_0 file,
it appears to be processing only about 1,000 files per hour for file
deletions.
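For context, the final pre output file itself is, as I understand it, just
a flat list of one change per line, along these lines (paths are
illustrative):

  NEW src/dir1/file1.c
  MODIFY src/dir1/file2.c
  DELETE src/dir2/file3.c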
[root@ff01bkp gfs]# gluster volume info
Volume Name: gfs
Type: Distribute
Volume ID: 7bbdfcf8-1801-4a2a-9233-0a3261cbcba7
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ffib01bkp:/data/brick01/gfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
server.allow-insecure: on
performance.readdir-ahead: on
storage.build-pgfid: on
changelog.changelog: on
changelog.capture-del-path: on
changelog.rollover-time: 90
changelog.fsync-interval: 30
client.event-threads: 8
server.event-threads: 8
------ Original Message ------
From: "Aravinda" <[email protected]>
To: "Mathieu Chateau" <[email protected]>; "M S Vishwanath Bhat"
<[email protected]>
Cc: "gluster-users" <[email protected]>
Sent: 9/7/2015 2:02:09 AM
Subject: Re: [Gluster-users] What is the recommended backup strategy for
GlusterFS?
We have one more tool: glusterfind!
This tool comes with the Gluster installation if you are using Gluster 3.7.
glusterfind enables changelogging (journaling) on the Gluster volume and
uses that information to detect the changes that happened in the volume.
1. Create a glusterfind session using: glusterfind create
<SESSION_NAME> <VOLUME_NAME>
2. Do a full backup.
3. Run the glusterfind pre command to generate an output file listing the
changes that happened in the Gluster volume after glusterfind create.
For usage information, see glusterfind pre --help
4. Consume that output file and back up only the files listed in it.
5. After consuming the output file, run the glusterfind post command
(glusterfind post --help).
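A minimal end-to-end sketch of the above (the session/volume names and
the output file path are placeholders):

  glusterfind create mysession myvol
  ... take the initial full backup ...
  glusterfind pre mysession myvol /tmp/myvol-changes.txt
  ... back up only the files listed in /tmp/myvol-changes.txt ...
  glusterfind post mysession myvol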
Doc link:
http://gluster.readthedocs.org/en/latest/GlusterFS%20Tools/glusterfind/index.html
This tool was newly released with Gluster 3.7; please report
issues or request features here:
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
regards Aravinda
On 09/06/2015 12:37 AM, Mathieu Chateau wrote:
Hello,
For my needs, it's about having a simple "photo" of the files present 5
days ago, for example.
But I do not want to store file data twice, as most files didn't
change.
Using snapshots is convenient of course, but it's risky as you lose
both data and snapshots in case of failure (snapshots only contain
delta blocks).
Rsync with hard links is more resistant (the inode stays until the last
reference is removed).
But I'm interested to hear about production setups relying on it.
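For illustration, the hard-link rotation that rsnapshot does boils down to
something like this (directory names are placeholders):

  rsync -a --delete --link-dest=/backup/day.1/ /mnt/glustervol/ /backup/day.0/

Unchanged files in day.0 become hard links to the copies in day.1, so only
changed files take new space.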
Cordialement,
Mathieu CHATEAU
http://www.lotp.fr
2015-09-05 21:03 GMT+02:00 M S Vishwanath Bhat <[email protected]>:
MS
On 5 Sep 2015 12:57 am, "Mathieu Chateau" <[email protected]>
wrote:
>
> Hello,
>
> so far I use rsnapshot. This script does rsync with rotation, and
most importantly the same files are stored only once through hard links
(inodes). I save space, but rsync still needs to scan all folders to
find new files.
>
> I am also interested in solution 1), but it needs to be stored on
distinct drives/servers. We can't afford to lose data and snapshots
in case of human error or disaster.
>
>
>
> Cordialement,
> Mathieu CHATEAU
> http://www.lotp.fr
>
> 2015-09-03 13:05 GMT+02:00 Merlin Morgenstern
<[email protected]>:
>>
>> I have about 1M files in a GlusterFS volume with rep 2 on 3 nodes running
gluster 3.7.3.
>>
>> What would be a recommended automated backup strategy for this
setup?
>>
>> I already considered the following:
Have you considered GlusterFS geo-replication (geo-rep)? It's actually for
disaster recovery, but it might suit your backup use case as well.
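A rough sketch of the setup, with host and volume names as placeholders
(see the geo-rep docs for the exact steps):

  gluster system:: execute gsec_create
  gluster volume geo-replication myvol slavehost::slavevol create push-pem
  gluster volume geo-replication myvol slavehost::slavevol start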
My two cents
//MS
>>
>> 1) glusterfs snapshots in combination with dd. This unfortunately
was not possible so far, as I could not find any info on how to make an
image file out of the snapshots and how to automate the snapshot
procedure.
>>
>> 2) rsync the mounted file share to a second directory and do a tar
on the entire directory after rsync completed
>>
>> 3) combination of 1 and 2. Doing a snapshot that gets mounted
automatically and then rsyncing from there. Problem: how to automate
snapshots and how to know the mount path.
>>
>> Currently I am only able to do the second option, but the first
option seems to be the most attractive.
>>
>> Thank you for any help on this.
>>
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users