Aravinda,
I was testing glusterfind and wondering if you could provide some
feedback.
My system is RHEL 7.1 and I am using Gluster 3.7.5. My setup for testing
is a single brick with the parameters shown below...
I was testing glusterfind by copying over my source code (~140,000 files)
and then running 'glusterfind pre'. The result of the test is that
"glusterfind pre" took over an hour to process these 140,000 files and
sat at 100% CPU utilization for the duration of the run.
Is this expected, and is this the rate at which "glusterfind pre" is
expected to process files?
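For reference, the pre run was of roughly this form (the session name
"backup" is the one in the path further below; the output file name is
just illustrative):

  glusterfind pre backup gfs /tmp/gfs-changes.txt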
The reason I am asking is that my production Gluster system sees
approximately 2 million file changes per day. At this pace,
glusterfind cannot process the changes fast enough to keep up.
I also went back and tested file deletion by removing this directory.
Looking at the
/usr/var/lib/misc/glusterfsd/glusterfind/backup/gfs/tmp_output_0 file,
it appears to be processing only about 1,000 files per hour for file
deletions.
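For context, the final pre output file itself is, as I understand it, just
a flat list of one change per line, along these lines (paths are
illustrative):

  NEW src/dir1/file1.c
  MODIFY src/dir1/file2.c
  DELETE src/dir2/file3.c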
[root@ff01bkp gfs]# gluster volume info
Volume Name: gfs
Type: Distribute
Volume ID: 7bbdfcf8-1801-4a2a-9233-0a3261cbcba7
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: ffib01bkp:/data/brick01/gfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
server.allow-insecure: on
performance.readdir-ahead: on
storage.build-pgfid: on
changelog.changelog: on
changelog.capture-del-path: on
changelog.rollover-time: 90
changelog.fsync-interval: 30
client.event-threads: 8
server.event-threads: 8
------ Original Message ------
From: "Aravinda" <[email protected]>
To: "Mathieu Chateau" <[email protected]>; "M S Vishwanath Bhat"
<[email protected]>
Cc: "gluster-users" <[email protected]>
Sent: 9/7/2015 2:02:09 AM
Subject: Re: [Gluster-users] What is the recommended backup strategy for
GlusterFS?
We have one more tool: glusterfind!
This tool comes with the Gluster installation if you are using Gluster 3.7.
glusterfind enables changelogging (journaling) on the Gluster volume and
uses that information to detect the changes that happened in the volume.
1. Create a glusterfind session using: glusterfind create
<SESSION_NAME> <VOLUME_NAME>
2. Do a full backup.
3. Run the glusterfind pre command to generate an output file listing the
changes that happened in the Gluster volume after glusterfind create.
For usage information, see glusterfind pre --help
4. Consume that output file and back up only the files listed in it.
5. After consuming the output file, run the glusterfind post command
(glusterfind post --help).
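A minimal end-to-end sketch of the above (the session/volume names and
the output file path are placeholders):

  glusterfind create mysession myvol
  ... take the initial full backup ...
  glusterfind pre mysession myvol /tmp/myvol-changes.txt
  ... back up only the files listed in /tmp/myvol-changes.txt ...
  glusterfind post mysession myvol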
Doc link:
http://gluster.readthedocs.org/en/latest/GlusterFS%20Tools/glusterfind/index.html
This tool was newly released with Gluster 3.7; please report
issues or request features here:
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
regards Aravinda
On 09/06/2015 12:37 AM, Mathieu Chateau wrote:
Hello,
For my needs, it's about having a simple "photo" of the files present 5
days ago, for example.
But I do not want to store file data twice, as most files didn't
change.
Using snapshots is convenient of course, but it's risky as you lose
both data and snapshots in case of failure (snapshots only contain
delta blocks).
Rsync with hard links is more resistant (the inode stays until the last
reference is removed).
But I'm interested to hear about production setups relying on it.
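For illustration, the hard-link rotation that rsnapshot does boils down to
something like this (directory names are placeholders):

  rsync -a --delete --link-dest=/backup/day.1/ /mnt/glustervol/ /backup/day.0/

Unchanged files in day.0 become hard links to the copies in day.1, so only
changed files take new space.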
Cordialement,
Mathieu CHATEAU
http://www.lotp.fr
2015-09-05 21:03 GMT+02:00 M S Vishwanath Bhat <[email protected]>:
MS
On 5 Sep 2015 12:57 am, "Mathieu Chateau" <[email protected]>
wrote:
>
> Hello,
>
> so far I use rsnapshot. This script does rsync with rotation, and
most importantly the same files are stored only once through hard links
(inodes). I save space, but rsync still needs to scan all folders to
find new files.
>
> I am also interested in solution 1), but it needs to be stored on
distinct drives/servers. We can't afford to lose data and snapshots
in case of human error or disaster.
>
>
>
> Cordialement,
> Mathieu CHATEAU
> http://www.lotp.fr
>
> 2015-09-03 13:05 GMT+02:00 Merlin Morgenstern
<[email protected]>:
>>
>> I have about 1M files in a GlusterFS volume with rep 2 on 3 nodes running
gluster 3.7.3.
>>
>> What would be a recommended automated backup strategy for this
setup?
>>
>> I already considered the following:
Have you considered GlusterFS geo-replication (geo-rep)? It's actually for
disaster recovery, but it might suit your backup use case as well.
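A rough sketch of the setup, with host and volume names as placeholders
(see the geo-rep docs for the exact steps):

  gluster system:: execute gsec_create
  gluster volume geo-replication myvol slavehost::slavevol create push-pem
  gluster volume geo-replication myvol slavehost::slavevol start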
My two cents
//MS
>>
>> 1) glusterfs snapshots in combination with dd. This unfortunately
was not possible so far, as I could not find any info on how to make an
image file out of the snapshots and how to automate the snapshot
procedure.
>>
>> 2) rsync the mounted file share to a second directory and do a tar
on the entire directory after rsync completed
>>
>> 3) combination of 1 and 2. Doing a snapshot that gets mounted
automatically and then rsyncing from there. Problem: how to automate
snapshots and how to know the mount path.
>>
>> Currently I am only able to do the second option, but the first
option seems to be the most attractive.
>>
>> Thank you for any help on this.
>>
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users